Simulation can unlock novel use cases by bootstrapping the training of foundation models, or it can speed up the fine-tuning of pretrained AI models through synthetic data generation (SDG). Synthetic data can consist of text, 2D or 3D images in the visual and non-visual spectrum, and even motion data, all of which can be used in conjunction with real-world data to train multimodal physical AI models.
Domain randomization is a key step in the SDG workflow, in which many parameters of a scene are varied to generate a diverse dataset: the location, color, and textures of objects, as well as the lighting. Augmentation in the post-processing phase further diversifies the generated data by adding defects such as localized blurring, pixelation, randomized cropping, skewing, and blending.
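To make these two steps concrete, here is a minimal, library-agnostic Python sketch. It is not tied to any particular SDG tool; the `SceneParams` dataclass, the texture names, and the `pixelate_patch` augmentation are all illustrative assumptions standing in for whatever parameters and post-processing your pipeline exposes.

```python
import random
from dataclasses import dataclass

import numpy as np


@dataclass
class SceneParams:
    """One randomized scene configuration for a synthetic frame."""
    position: tuple[float, float, float]  # object location (x, y, z), meters
    color: tuple[float, float, float]     # RGB albedo, each channel in [0, 1]
    texture: str                          # texture asset to assign (placeholder names)
    light_intensity: float                # scene light intensity (arbitrary units)


TEXTURES = ["wood", "brushed_metal", "plastic", "fabric"]  # hypothetical assets


def randomize_scene(rng: random.Random) -> SceneParams:
    """Sample one scene configuration (the domain-randomization step)."""
    return SceneParams(
        position=(rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(0, 2)),
        color=(rng.random(), rng.random(), rng.random()),
        texture=rng.choice(TEXTURES),
        light_intensity=rng.uniform(100.0, 2000.0),
    )


def pixelate_patch(image: np.ndarray, rng: np.random.Generator,
                   factor: int = 8) -> np.ndarray:
    """Post-processing augmentation: pixelate a random patch (a localized defect)."""
    h, w, _ = image.shape
    ph, pw = h // 4, w // 4                            # patch is a quarter of the frame
    y = int(rng.integers(0, h - ph))
    x = int(rng.integers(0, w - pw))
    small = image[y:y + ph:factor, x:x + pw:factor]    # downsample the patch...
    big = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    image[y:y + ph, x:x + pw] = big[:ph, :pw]          # ...and write it back enlarged
    return image


seed_rng = random.Random(42)
configs = [randomize_scene(seed_rng) for _ in range(1000)]  # 1,000 varied scenes
frame = np.zeros((480, 640, 3), dtype=np.uint8)             # stand-in for a render
frame = pixelate_patch(frame, np.random.default_rng(0))
```

In a real pipeline, each `SceneParams` sample would drive the renderer for one frame before the augmentation pass runs on the rendered image.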
Additionally, the generated images are automatically annotated; the output can include RGB, bounding boxes, instance and semantic segmentation, depth, depth point clouds, lidar point clouds, and more.
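Automatic annotation is possible because the renderer already knows every object's identity and position, so labels can be derived programmatically instead of drawn by hand. The sketch below derives a 2D bounding box from a synthetic instance mask and packages it as a simple label record; the mask, class name, and record layout are illustrative assumptions, not the output schema of any specific tool.

```python
import json

import numpy as np


def bbox_from_mask(mask: np.ndarray) -> list[int]:
    """Derive a tight 2D bounding box [x, y, width, height] from a binary mask."""
    ys, xs = np.nonzero(mask)
    x0, y0 = int(xs.min()), int(ys.min())
    return [x0, y0, int(xs.max()) - x0 + 1, int(ys.max()) - y0 + 1]


# Synthetic example: the renderer "knows" this object occupies these pixels.
mask = np.zeros((480, 640), dtype=bool)
mask[100:220, 150:300] = True

annotation = {
    "image_id": 0,
    "category": "widget",          # hypothetical class name
    "bbox": bbox_from_mask(mask),  # [x, y, width, height] in pixels
    "area": int(mask.sum()),       # pixel count of the instance mask
}
print(json.dumps(annotation))
# {"image_id": 0, "category": "widget", "bbox": [150, 100, 150, 120], "area": 18000}
```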