Create Your Artistic Portrait with Multimodal Generative AI and NVIDIA Technologies
Senior Solutions Architect, NVIDIA
Senior Data Scientist, NVIDIA
Multimodal generative AI has advanced significantly in recent years, enabling the creation of realistic images from text and other inputs. Because these models are complex, understanding how they work and how to apply them in practice can be challenging. We'll walk you through the concepts behind generative AI for visual content creation, then review how these techniques have evolved and how they are applied across industries. We'll focus on diffusion models and explain how they are trained and fine-tuned. In the hands-on portion, we'll use a multimodal framework to fine-tune a pre-trained text-to-image model so that it generates a subject of your choice from a handful of images. We'll guide you through every step: preparing your dataset, training, running inference, and visualizing the results. The workshop is intended for anyone interested in the current state of AI and its potential to produce realistic and immersive multimedia experiences.
Prerequisite(s):
General data science knowledge
Familiarity with neural networks and experience training them
Proficiency in Python