NVIDIA Nemotron

High-efficiency, multimodal, open models for long-running AI agents.

Overview
Video
Benefits
Models
Technology
Adopters
Resources
FAQs
Next Steps

Overview
Video
Benefits
Models
Technology
Adopters
Resources
FAQs
Next Steps

Overview

What Is NVIDIA Nemotron?

NVIDIA Nemotron™ is a family of highly efficient, multimodal, open AI models built for long-running, self-evolving agents. Designed for fast task completion, Nemotron models deliver high reasoning throughput and leading accuracy for complex agent workflows.

With transparent training data and broad platform support, including NVIDIA RTX PRO™ and NVIDIA DGX Spark™, Nemotron models are openly available and integrated across the AI ecosystem, enabling trusted, high‑performance AI agents to be deployed anywhere from edge to cloud.

Build Benchmark-Leading Agents With LangChain and NVIDIA

LangChain Deep Agents tuned for NVIDIA Nemotron™ 3 Ultra give enterprises open, customizable, frontier-level agents at a fraction of the cost.

Read the Blog

Video

Why NVIDIA Built Nemotron

Why NVIDIA Made Nemotron, featuring Bryan Catanzaro, VP of Applied Deep Learning Research at NVIDIA

Hear from Bryan Catanzaro, VP of applied deep learning research at NVIDIA, as he shares the vision behind Nemotron and why open technologies are essential for building trusted, enterprise-ready AI.

Watch Video (03:41)

Benefits

What Does Nemotron Bring to Agentic AI?

Open Models

NVIDIA’s open data and optimization techniques ensure powerful, transparent, and adaptable models for developers and enterprises. Models and training data are published openly on Hugging Face.

High Compute Efficiency

The Nemotron family is optimized to complete agentic tasks faster with the highest throughput and hybrid MoE architecture.

High Accuracy

Built from the ground up with exceptional knowledge, post-trained with high-quality training data, and aligned with reinforcement learning, Nemotron models achieve leading accuracy for long-running agentic systems.

Secure and Simple Deployment

The Nemotron model family, available as optimized NVIDIA NIM™ microservices, offers peak inference performance and flexible deployment options, ensuring superior security, privacy, and portability.

Models

Models for Diverse Workloads

Nemotron models excel in a range of agentic AI tasks, including reasoning, multimodal vision, retrieval-augmented generation (RAG), speech, and safety. Research models are also available for experimentation.

Reasoning

Nemotron models support a range of reasoning workloads. Nano provides superior accuracy and efficiency for specialized sub-agents, Super offers the highest accuracy, throughput reasoning, and tool calling to run complex tasks on multi-agent systems, and Ultra delivers the best reasoning for mission-critical applications that demand maximum capability over multi-step workflows.

Visual Understanding

Multimodal Nemotron models deliver the highest efficiency and leading accuracy across video, audio, image, and text for enterprise agentic use cases. Optimized for specialized sub‑agents, they power capabilities such as computer‑use agents, document intelligence, and video and audio understanding.

Speech

NVIDIA Nemotron Speech models provide high-throughput, ultra-low latency automatic speech recognition (ASR), text-to-speech (TTS), and neural machine translation (NMT) for agentic AI applications.

Retrieval-Augmented Generation

Nemotron Retriever models deliver fast, accurate document understanding by extracting multimodal structured information, generating high-quality embeddings, and rank-ordering the most relevant documents. They provide scalable, high-speed retrieval that enhances data quality for LLM training, boosts agent and retriever performance, and streamlines document workflows.

Safety

NVIDIA Nemotron Safety models provide real-time protection against harmful content, off-topic drift, and jailbreak attempts. They add a multilingual, multimodal, content safety layer with reasoning capabilities, enhancing moderation and ensuring cultural alignment.

View All Nemotron Models

Technology

Building Blocks for Agentic AI

Start building and optimizing AI agents with NVIDIA NeMo™ for custom agentic AI, NVIDIA NIM for fast, enterprise-ready deployment, and NVIDIA Blueprints for accelerating development with customizable reference workflows.

NVIDIA NeMo

Build, customize, and deploy generative AI and agentic AI.
Deliver enterprise-ready large language models (LLMs) with precise data curation, cutting-edge customization, scalable data ingestion, RAG, and accelerated performance.
Easily build data flywheels and continuously optimize AI agents with the latest information.

Get Started With NeMo

NVIDIA NIM

Speed up deployment of performance-optimized generative AI models.
Run your business applications with stable and secure APIs, backed by enterprise-grade support.

Get Started With NIM

NVIDIA Blueprints

Quickly get started with reference applications for generative AI use cases, such as enterprise deep research and multimodal RAG.
Accelerate development with blueprints, which include partner microservices, one or more AI agents, reference code, customization documentation, and a Helm chart for deployment.

Get Started With Blueprints

Starting Options

Ways to Get Started With Nemotron

Start Prototyping for Free

Get started with easy-to-use API endpoints.

Access fully accelerated AI infrastructure.
Ensure your data isn't used for model training.
No credits, just a simple path to build, test, and deploy.

Build Now

Run Nemotron on Inference Service Providers

Deploy Nemotron models instantly on trusted third-party inference platforms—no infrastructure setup required.

Deploy without managing infrastructure.
Scale seamlessly from prototype to production.
Optimize costs with usage-based pricing.

Explore Inference Providers

Get in Touch

Talk to an NVIDIA AI specialist about moving generative AI pilots to production with the security, API stability, and support that comes with NVIDIA AI Enterprise.

Explore your generative AI use cases.
Discuss your technical requirements.
Align NVIDIA AI solutions to your goals and requirements.

Contact Sales

Adopters

Enterprises Using Nemotron

Resources

Explore the Latest in Nemotron

Blogs
Sessions
Videos

See All Tech Blogs See All Topic News

View All Sessions

Why NVIDIA Built Nemotron

Learn how Nemotron accelerates innovation, empowers developers, and shapes the future of AI.

Watch Video

How ServiceNow Is Pushing Document Intelligence Forward

Learn how access to Nemotron’s model weights, datasets, and training recipes enabled deeper evaluation, what ServiceNow discovered about visual Q&A accuracy, and why openness matters for continuous improvement in multimodal AI.

Watch Video

Reasoning On/Off: Navigating a Wedding Seating Chart With AI Reasoning

See how an LLM with AI reasoning capabilities thinks outside the box to come up with a solution to a wedding seating chart while navigating family dynamics and guest preferences.

Watch Video

View All Videos

FAQs

NVIDIA Nemotron models aren't just open, but truly open source. NVIDIA publishes the training datasets, techniques, and model weights so the open-source community can benefit from our learnings and use these resources to create their own models.

The NVIDIA Open Model License is a permissive license that allows users to use, modify, distribute, and commercially deploy the models and derivatives without crediting NVIDIA, to encourage innovation and further development of generative AI.

Yes, you can download and run NVIDIA Nemotron models from Hugging Face for free in production.

NVIDIA also offers Nemotron models as NVIDIA NIM microservices for secure, scalable deployment, which requires an NVIDIA AI Enterprise license. You can try the Nemotron models and download the NIM microservices from build.nvidia.com.

Yes, NVIDIA is committed to publishing more Nemotron models, datasets, and techniques to enable open-source ecosystems.

NVIDIA Nemotron models are built on top of frontier open models, making it possible to build better models faster. Additionally, NVIDIA publishes the model weights, training datasets, and training techniques so the developer community can use these different parts of Nemotron to train their own models.

NVIDIA provides a variety of tools, such as NVIDIA Dynamo, TensorRT-LLM, and NIM, to run Nemotron models at scale in production. You can also use popular open-source libraries, such as SGLang and vLLM.

Next Steps

Ready to Get Started?

Use the right tools and technologies to take NVIDIA Nemotron models from development to production.

Get Started

Get in Touch

Talk to an NVIDIA product specialist about moving from pilot to production with the security, API stability, and support that comes with NVIDIA AI Enterprise.

Stay Up to Date on NVIDIA Agentic AI News

Get the latest agentic AI news, technologies, breakthroughs, and more sent straight to your inbox.

Stay Informed

NVIDIA Nemotron

Overview

What Is NVIDIA Nemotron?

Build Benchmark-Leading Agents With LangChain and NVIDIA

Why NVIDIA Built Nemotron

Benefits

What Does Nemotron Bring to Agentic AI?

Open Models

High Compute Efficiency

High Accuracy

Secure and Simple Deployment

Models

Models for Diverse Workloads

Reasoning

Visual Understanding

Speech

Retrieval-Augmented Generation

Safety

Technology

Building Blocks for Agentic AI

NVIDIA NeMo

NVIDIA NIM

NVIDIA Blueprints

Starting Options

Ways to Get Started With Nemotron

Start Prototyping for Free

Run Nemotron on Inference Service Providers

Get in Touch

Adopters

Enterprises Using Nemotron

Resources

Explore the Latest in Nemotron

Why NVIDIA Built Nemotron

How ServiceNow Is Pushing Document Intelligence Forward

Reasoning On/Off: Navigating a Wedding Seating Chart With AI Reasoning

FAQs

Are NVIDIA Nemotron models open or open source?

What does the NVIDIA Open Models License provide to developers?

Can I run NVIDIA Nemotron models for free in production?

Can I run NVIDIA Nemotron models as NVIDIA NIM?

Is NVIDIA committed to publishing more Nemotron models in the future?

How do the NVIDIA Nemotron models compare to other open models?

How do I optimize the models to run at scale in production?

Next Steps

Ready to Get Started?

Get in Touch

Stay Up to Date on NVIDIA Agentic AI News