Physical AI
Develop world foundation models to advance physical AI.
Overview
NVIDIA Cosmos™ is a platform of state-of-the-art generative world foundation models (WFMs), advanced tokenizers, guardrails, and an accelerated data processing and curation pipeline. It is built to power world model training and accelerate physical AI development for autonomous vehicles (AVs) and robots.
Cosmos gives developers easy access to high-performance world foundation models, data pipelines, and tools to post-train these models for robotics and autonomous driving tasks.
World foundation models are pre-trained on 20 million hours of robotics and driving data to generate world states grounded in physics.
Cosmos WFMs, guardrails, and tokenizers are licensed under the NVIDIA Open Model License, making them accessible to all physical AI developers.
Models
A family of pretrained multimodal models that developers can use out-of-the-box for world generation and reasoning, or post-train to develop specialized physical AI models.
Generalist model for world generation and motion prediction from multimodal input. Trained on 9,000 trillion tokens of robotics and driving data and purpose-built for post-training.
Available as Cosmos NIM for accelerated inference anywhere.
Physics-aware world generation conditioned on ground-truth and 3D inputs. Input includes segmentation maps, depth signals, LiDAR scans, key points, trajectories, HD maps, and ground-truth simulation from NVIDIA Omniverse™ for controllable synthetic data generation.
Fully customizable, multimodal reasoning model for planning responses based on spatial and temporal understanding.
Trained using visual-language model fine-tuning and reinforcement learning for chain-of-thought reasoning.
Develop responsible models using Cosmos WFMs with a pre-guard that filters unsafe inputs and a post-guard that keeps outputs consistent and safe.
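The pre-guard/post-guard pattern can be sketched as a thin wrapper around generation. This is a hypothetical illustration, not the actual Cosmos guardrail API; the function names, guard predicates, and fallback string are all assumptions.

```python
from typing import Callable

# Hypothetical sketch: the real Cosmos guardrail interfaces may differ.
def guarded_generate(
    prompt: str,
    generate: Callable[[str], str],
    pre_guard: Callable[[str], bool],
    post_guard: Callable[[str], bool],
    fallback: str = "[blocked]",
) -> str:
    """Generate only if the prompt passes the pre-guard, and return the
    result only if it passes the post-guard; otherwise return a fallback."""
    if not pre_guard(prompt):
        return fallback
    output = generate(prompt)
    return output if post_guard(output) else fallback

# Toy guards for illustration only.
banned = {"unsafe"}
pre = lambda p: not any(word in p for word in banned)
post = lambda o: len(o) > 0

print(guarded_generate("a robot stacking boxes", lambda p: f"video({p})", pre, post))
```

The point of the pattern is that unsafe prompts never reach the model, and model outputs never reach the user unchecked.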
Cosmos provides developers with open, highly performant data curation pipelines, tokenizers, a training framework, and post-training scripts to quickly and easily build specialized world models, such as policy models and vision-language-action (VLA) models, for embodied AI.
Developers post-train Cosmos WFMs or couple them with NVIDIA Omniverse to drive downstream physical AI use cases.
Cosmos accelerates synthetic data generation to train perception AI models.
Omniverse provides generative APIs, tools, and NVIDIA RTX™ rendering to create physically accurate ground-truth 3D scenes for Cosmos WFM. Using these visuals as inputs, Cosmos Transfer WFM generates photorealistic outputs—simulating diverse weather, environments, and lighting—while predicting world states with physical accuracy, based on text prompts.
Developers can use generalist Cosmos WFMs out of the box or customize them with their own data for greater precision in downstream SDG.
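A Cosmos Transfer request for synthetic data generation pairs a text prompt with ground-truth control signals from Omniverse, as described above. The structure below is purely illustrative; field names, file names, and the variation sweep are assumptions, not the documented API.

```python
# Hypothetical request structure for controllable SDG with Cosmos Transfer.
# Keys and file names are illustrative assumptions, not the real API.
transfer_request = {
    "prompt": "rainy night, wet asphalt, oncoming headlights",
    "controls": {
        # Ground-truth conditioning signals rendered in NVIDIA Omniverse.
        "segmentation_map": "scene_001_seg.png",
        "depth": "scene_001_depth.exr",
        "hd_map": "scene_001_map.json",
    },
    # Weather/lighting sweeps to multiply one scene into diverse samples.
    "variations": [
        {"weather": "rain"},
        {"weather": "fog"},
        {"lighting": "dusk"},
    ],
}

print(len(transfer_request["variations"]), "variations of one ground-truth scene")
```

One ground-truth scene fanned out across weather and lighting variations is what makes SDG cheaper than recapturing real data for each condition.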
A policy model guides a physical AI system’s behavior, ensuring that the system operates with safety and in accordance with its goals. Cosmos Predict or Cosmos Reason can be post-trained into policy models to generate actions, saving the cost, time, and data needs of manual policy training.
Cosmos WFMs accelerate policy evaluation by simulating real-world actions through video outputs, using Omniverse ground-truth physics for accuracy. Developers can build a vision-language-action (VLA) model using Cosmos Reason and add it to critique and drive actions. This simulation loop reduces the cost, time, and risk of real-world testing while improving policy precision.
Cosmos WFMs can be post-trained to act as a multiverse engine or system—exploring multiple task strategies, rewarding the most effective outcomes, and enhancing decision-making for predictive control and reinforcement learning. Developers can add a reward module to Cosmos WFMs and simulate outcomes in Omniverse.
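The multiverse idea above, simulating several candidate strategies and rewarding the best outcome, can be sketched in a few lines. Here `simulate` and `reward` are stand-ins for a post-trained Cosmos WFM rollout and a developer-supplied reward module; both are toy assumptions.

```python
# Hypothetical sketch of a rollout-and-reward ("multiverse") loop.
# simulate() and reward() stand in for a post-trained Cosmos WFM and a
# developer-supplied reward module; both are toy functions here.
def best_strategy(strategies, simulate, reward):
    """Simulate each candidate strategy and return the highest-reward one."""
    scored = [(reward(simulate(s)), s) for s in strategies]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[0][1]

# Toy example: rollouts are position traces; reward prefers ending near x=10.
simulate = lambda step: [i * step for i in range(5)]  # fake 5-step rollout
reward = lambda traj: -abs(traj[-1] - 10)             # negative goal distance

print(best_strategy([1.0, 2.5, 4.0], simulate, reward))  # → 2.5
```

In a real system the rollout would be a generated video of world states and the reward module would score task success, but the explore-score-select loop is the same.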
Coming Soon
Cosmos models, guardrails, and tokenizers are available on Hugging Face and GitHub, with resources to tackle data scarcity in training physical AI models. We are committed to driving Cosmos forward: transparent, open, and built for all.
Model developers from robotics, autonomous vehicles, and vision AI industries are using Cosmos to accelerate physical AI development.
Physical AI developers can start now with Cosmos world foundation models, available on Hugging Face and GitHub. Cosmos also provides an end-to-end pipeline to fine-tune the foundation models with NVIDIA NeMo. Developers can use Cosmos tokenizer from /NVIDIA/cosmos-tokenizer on GitHub and Hugging Face.
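A video tokenizer's value comes from its compression factors: a clip is reduced to a much smaller latent grid before the world model ever sees it. The sketch below shows the grid arithmetic for a causal tokenizer; the 8x8x8 factors echo the naming of Cosmos tokenizer variants (e.g. "CV8x8x8"), but the exact factors and the first-frame handling here are assumptions, not the repository's documented behavior.

```python
# Sketch: latent grid size implied by a causal video tokenizer's
# temporal (ct) and spatial (ch, cw) compression factors.
# Factors and causal first-frame handling are assumptions for illustration.
def token_grid(frames, height, width, ct=8, ch=8, cw=8):
    """Return the (T, H, W) latent grid for a causal tokenizer that keeps
    the first frame and compresses the remaining frames by ct."""
    t = 1 + (frames - 1) // ct  # causal: first frame alone, rest in groups
    return (t, height // ch, width // cw)

print(token_grid(121, 704, 1280))  # → (16, 88, 160)
```

So a 121-frame 704x1280 clip would collapse from ~109M pixels per channel to a 16x88x160 latent grid, which is what makes training world models on video tractable.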
Cosmos world foundation models are available under an NVIDIA Open Model License for all.
Yes, there are two approaches to post-train Cosmos models:
1) Using NeMo, you can efficiently train and fine-tune models with popular techniques like Low-Rank Adaptation (LoRA) and Reinforcement Learning from Human Feedback (RLHF). You can also choose PyTorch to continue training the WFMs using your own datasets.
2) You can use open PyTorch scripts from GitHub to post-train Cosmos WFM.
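To make the LoRA option above concrete, here is the core arithmetic of Low-Rank Adaptation, framework-free: a frozen weight matrix W is adapted by adding a low-rank product B @ A scaled by alpha / r, so only the small A and B matrices are trained. The shapes and values are illustrative; this is not the NeMo or Cosmos training code.

```python
# Minimal sketch of the LoRA update itself: W' = W + (alpha / r) * (B @ A).
# Pure-Python matrices for illustration; real training uses tensors.
def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def lora_merge(W, A, B, alpha, r):
    """Merge a rank-r LoRA adapter (A, B) into the frozen base weight W."""
    BA = matmul(B, A)          # (d_out x r) @ (r x d_in) -> full-size delta
    scale = alpha / r
    return [[w + scale * d for w, d in zip(wr, dr)] for wr, dr in zip(W, BA)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen 2x2 base weight
A = [[1.0, 2.0]]               # down-projection, rank r=1 (r x d_in)
B = [[0.5], [0.25]]            # up-projection (d_out x r)

print(lora_merge(W, A, B, alpha=2, r=1))  # → [[2.0, 2.0], [0.5, 2.0]]
```

Because only A and B are updated, post-training touches a tiny fraction of the WFM's parameters, which is why LoRA makes fine-tuning large world models affordable.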
Yes, you can leverage Cosmos to build from scratch with your preferred foundation model or model architecture. You can start by using NeMo Curator for video data pre-processing. Then compress and decode your data with Cosmos tokenizer. Once you have processed the data, you can train or fine-tune your model using NVIDIA NeMo.
Using NVIDIA NIM™ microservices, you can easily integrate your physical AI models in your applications across cloud, data centers, and workstations.
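NIM microservices typically expose an HTTP inference endpoint, so integration reduces to assembling a JSON request. The sketch below only builds the body and does not send it; the model name, message schema, and parameters are assumptions for illustration, not the documented Cosmos NIM API.

```python
import json

# Hypothetical NIM-style request body; model name and fields are
# illustrative assumptions, not the documented Cosmos NIM API.
def build_nim_request(prompt, model="nvidia/cosmos-predict", max_tokens=512):
    """Assemble a JSON inference request body (not sent anywhere here)."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    })

body = build_nim_request("Generate a 5-second clip of a forklift turning left.")
print(body)
```

The same body could then be POSTed to a deployed NIM endpoint from any environment (cloud, data center, or workstation) with the HTTP client of your choice.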
You can also use NVIDIA DGX Cloud to train AI models and deploy them anywhere at scale.
Omniverse creates realistic 3D simulations of real-world tasks by using different generative APIs, SDKs, and NVIDIA RTX rendering technology.
Developers can feed Omniverse simulations as instruction videos into the Cosmos Transfer model to generate controllable, photoreal synthetic data.
Together, Omniverse provides the simulation environment before and after training, while Cosmos provides the foundation models to generate video data and train physical AI models.
Learn more about NVIDIA Omniverse.