NVIDIA NeMo Service

Cloud service for enterprise hyper-personalization and at-scale deployment of intelligent large language models.

Overview
Product Features
Benefits
Additional Resources
Get Early Access

Overview

NVIDIA NeMo™ service, part of NVIDIA AI Foundations, is a cloud service that kick-starts the journey to hyper-personalized enterprise AI, offering state-of-the-art foundation models, customization tools, and deployment at scale. Define your operating domain, encode the latest proprietary knowledge, add specialized skills, and continuously make applications smarter.

Use cloud APIs to quickly and easily integrate generative AI capabilities into your enterprise applications.

Picasso
BioNeMo
NeMo Framework

Generative AI Language Use Cases

Build your own language models and deploy them in intelligent enterprise generative AI applications.

Content Generation

Marketing content
Product description generation

Summarization

Legal paraphrasing
Meeting notes summarization

Chatbot

Question answering
Customer service agent

Information Retrieval

Passage retrieval and ranking
Document similarity

Classification

Toxicity classifier
Customer segmentation

Translation

Language-to-code
Language-to-language

State-of-the-Art AI Foundation Models

Large language models (LLMs) are hard to develop and maintain, requiring mountains of data, significant capital investment, technical expertise, and massive-scale compute infrastructure.

Enterprises can kick-start their journey to adopting LLMs by starting with a pre-trained foundation model.

The 5 NeMo Generative AI Foundation Models

GPT-8:

8B parameters, with supervised fine-tuning. Trained on 1.1T tokens, with a sequence length of 4K tokens.
Provides fast responses, meeting application service-level agreements for simple tasks.
Use cases: Text classification, spelling correction

GPT-43:

43B parameters, with supervised fine-tuning. Supports over 50 languages. Trained on 1.1T tokens, with a sequence length of 4K tokens.
Provides an optimal balance of high accuracy and low latency.
Use cases: Email composition, factual Q&A

GPT-530:

530B parameters, with supervised fine-tuning. Trained on 340B tokens, with a sequence length of 2K tokens.
Excellent for complex tasks that require deep understanding of human languages and all their nuances.
Use cases: Text summarization, creative writing, chatbots

Inform:

Excellent for tasks that require the latest proprietary knowledge.
Use cases: Enterprise intelligence, information retrieval, Q&A

mT0-xxl:

Community-built model with 13B parameters and supervised fine-tuning, supporting more than 100 languages. Trained with a sequence length of 2K tokens.
Use cases: Language translation, language understanding, Q&A
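
To compare the five models side by side, the figures listed above can be captured as plain data. The following is a minimal sketch, not part of the NeMo service: the class name, field names, and helper functions are hypothetical, and fields are left as None wherever this page gives no figure.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class FoundationModel:
    """One row of the model list above (hypothetical structure)."""
    name: str
    parameters_b: Optional[float]    # parameters, in billions
    train_tokens_b: Optional[float]  # training tokens, in billions
    seq_len_k: Optional[int]         # sequence length, in thousands of tokens
    use_cases: Tuple[str, ...]


# Values copied from the list above; None means the page states no figure.
MODELS = [
    FoundationModel("GPT-8", 8, 1100, 4,
                    ("text classification", "spelling correction")),
    FoundationModel("GPT-43", 43, 1100, 4,
                    ("email composition", "factual Q&A")),
    FoundationModel("GPT-530", 530, 340, 2,
                    ("text summarization", "creative writing", "chatbots")),
    FoundationModel("Inform", None, None, None,
                    ("enterprise intelligence", "information retrieval", "Q&A")),
    FoundationModel("mT0-xxl", 13, None, 2,
                    ("language translation", "language understanding", "Q&A")),
]


def models_for(use_case, min_seq_len_k=0):
    """Names of models that list use_case exactly (case-insensitive) and
    state a sequence length of at least min_seq_len_k thousand tokens."""
    wanted = use_case.lower()
    return [
        m.name
        for m in MODELS
        if any(wanted == u.lower() for u in m.use_cases)
        and (m.seq_len_k or 0) >= min_seq_len_k
    ]


def tokens_per_parameter(model):
    """Training tokens divided by parameter count, or None if either is unstated."""
    if model.parameters_b is None or model.train_tokens_b is None:
        return None
    return model.train_tokens_b / model.parameters_b
```

For example, models_for("Q&A") returns the two models that list Q&A as a use case, and tokens_per_parameter shows that GPT-8 was trained on far more tokens per parameter than GPT-530.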

Curated Techniques for Enterprise Customization

Foundation models perform well out of the box, but they are not tailored to any specific enterprise task. They are trained on publicly available information, are frozen in time, can hallucinate, and may contain biased or toxic content.

Enterprises need to customize foundation models for their specific generative AI use cases.

1 Define Focus

Add guardrails and define the operating domain for your enterprise model through fine-tuning or prompt learning techniques to prevent LLMs from veering off into unwanted domains or saying inappropriate things.

2 Add Knowledge

Encode and embed your enterprise’s real-time information into your AI using Inform, so that responses stay up to date.

3 Add Skills

Add specialized skills to solve customer and business problems. Get better responses by providing context for specific use cases using prompt learning techniques.

4 Continuously Improve

Reinforcement learning from human feedback (RLHF) techniques allow your enterprise model to get smarter over time, aligned with human intentions.
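
The first two steps above, keeping the model inside its operating domain and grounding it in enterprise knowledge, can be illustrated with a small retrieval sketch. This is an assumption-laden toy, not how Inform or the NeMo service works internally: it scores passages with bag-of-words cosine similarity, and the function names and the min_score threshold are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, passages):
    """Return (score, passage) for the passage most similar to the question.
    Assumes passages is a non-empty list of strings."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    return max(scored, key=lambda pair: pair[0])


def build_prompt(question, passages, min_score=0.1):
    """Build a grounded prompt, or return None when nothing in the knowledge
    base is similar enough (a crude stand-in for an operating-domain check)."""
    score, passage = retrieve(question, passages)
    if score < min_score:
        return None  # treat as out of domain; the caller should decline
    return f"Context: {passage}\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    knowledge = [
        "The return window for online orders is 30 days.",
        "Support is open Monday to Friday.",
    ]
    print(build_prompt("How long is the return window?", knowledge))
    print(build_prompt("What wine pairs with salmon?", knowledge))
```

The first question is answered with the returns passage as context, while the second shares no words with the knowledge base and is rejected. A production system would typically use learned embeddings and a tuned threshold rather than word counts.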

Build Intelligent Language Applications Faster

Customize Easily

Curated training techniques for enterprise hyper-personalization

Achieve Higher Accuracy

Best-in-class suite of foundation models designed for customization, trained with up to 1T tokens

Run Anywhere

Run inference of large-scale custom models in the service or deploy across clouds or private data centers with NVIDIA AI Enterprise software.

Fastest Performance at Scale

State-of-the-art training techniques, tools, and inference, powered by NVIDIA DGX™ Cloud.

Ease of Use

Easily access the capabilities of your custom enterprise LLM through just a few lines of code or an intuitive GUI-based playground.

Enterprise Support

Fully supported by NVIDIA AI experts every step of the way.

Adopted Across Industries

Take a deeper dive into product features.

Choose preferred foundation models.

Customize your choice of various NVIDIA or community-developed models that work best for your AI applications.

Accelerate customization.

Within minutes to hours, get better responses by providing context for specific use cases using prompt learning techniques. See NeMo prompt learning documentation.

Experience Megatron 530B.

Leverage the power of NVIDIA Megatron 530B, one of the largest language models, through the NeMo LLM Service.

Develop seamlessly across use cases.

Take advantage of models for drug discovery, included in the cloud API and NVIDIA BioNeMo framework.

Find more resources.

See How the NeMo Service Works

Enterprises can customize large language models, or use pre-trained foundation models, to fast-track their generative AI adoption across various use cases such as summarizing financial documents and creating brand-specific content.

GTC 2023 Keynote

Check out the GTC keynote to learn more about NVIDIA AI Foundations, NeMo framework, and much more.