Get Started With NVIDIA Triton

Find the right license to deploy, run, and scale AI for any application on any platform.

NVIDIA Triton Licensing Options

GitHub: For individuals looking to access Triton open-source code for development. (Access Code)
NVIDIA NGC™: For individuals looking to access free Triton containers for development. (Get Container)
NVIDIA AI Enterprise: For enterprises looking to purchase Triton for production. (Contact Sales)

Features
NVIDIA Triton™ Inference Server
Custom builds (Windows, NVIDIA® Jetson™), PyTriton
Prebuilt Docker container (version dependencies: CUDA®, framework backends)
Triton Management Service - Model orchestration for large-scale deployment
Enterprise support – 24x7 case filing, 8-5 NVIDIA live agent
Long-term support branch for up to three years
CVE scans, security notifications, timely patches, and maintenance releases
API stability with production releases
NVIDIA customer portal and knowledge base
Access to NVIDIA AI workflows and reference architectures
Management and orchestration of workloads and infrastructure
Hands-on NVIDIA LaunchPad labs

FAQs

What is NVIDIA Triton Inference Server?

NVIDIA Triton Inference Server, or Triton for short, is open-source inference serving software. It lets teams deploy, run, and scale AI models from any framework (TensorFlow, NVIDIA TensorRT™, PyTorch, ONNX, XGBoost, Python, custom, and more) on any GPU- or CPU-based infrastructure (cloud, data center, or edge). For more information, please visit the Triton webpage.

What is NVIDIA Triton Management Service?

Triton Management Service provides automated, resource-efficient orchestration of models for inference at scale. It automates the deployment of Triton instances, loads models on demand, unloads them when they are not in use, and more. Triton Management Service is available exclusively with NVIDIA AI Enterprise, an enterprise-ready AI software platform.

When should I use the Triton Model Analyzer?

Triton Model Analyzer is an offline tool for optimizing inference deployment configurations (batch size, number of model instances, etc.) for throughput, latency, and/or memory constraints on the target GPU or CPU. It supports analysis of a single model, model ensembles, and multiple concurrent models.

How can customers get enterprise support for Triton?

Triton is included with NVIDIA AI Enterprise, an end-to-end AI software platform that offers enterprise-grade support, security, stability, and manageability for the entire software stack across data center and cloud.

What enterprise support and services are available for Triton?

NVIDIA AI Enterprise includes business-standard support. Additional support and services are available, including business-critical support, access to a technical account manager, training, and professional services. For more information, please visit the Enterprise Support and Services User Guide.

Is there a Triton lab in NVIDIA LaunchPad?

Yes, there are several labs that use Triton in NVIDIA LaunchPad.

Is Triton available from cloud service providers?

Yes, Triton is the top ecosystem choice for AI inference and model deployment. Triton is available in AWS, Microsoft Azure, and Google Cloud marketplaces with NVIDIA AI Enterprise. It’s also available in Alibaba Cloud, Amazon Elastic Kubernetes Service (EKS), Amazon Elastic Container Service (ECS), Amazon SageMaker, Google Kubernetes Engine (GKE), Google Vertex AI, HPE Ezmeral, Microsoft Azure Kubernetes Service (AKS), Azure Machine Learning, and Oracle Cloud Infrastructure Data Science Platform.
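
To illustrate the kind of trade-off the Triton Model Analyzer FAQ describes (choosing a batch size and number of model instances under latency and memory constraints), here is a minimal Python sketch. It does not use or reproduce Model Analyzer itself. The Measurement class, the best_config function, and every number in SAMPLE are hypothetical and exist only for this example. The sketch assumes that each candidate configuration has already been profiled offline, and it simply picks the highest-throughput candidate that fits both budgets.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Measurement:
    """One hypothetical offline profiling result for a deployment configuration."""
    max_batch_size: int
    instance_count: int
    throughput_infer_per_sec: float
    p99_latency_ms: float
    gpu_memory_mb: float


def best_config(
    measurements: Iterable[Measurement],
    latency_budget_ms: float,
    memory_budget_mb: float,
) -> Optional[Measurement]:
    """Return the highest-throughput measurement that meets both budgets.

    Returns None when no candidate satisfies the constraints.
    """
    feasible = [
        m for m in measurements
        if m.p99_latency_ms <= latency_budget_ms
        and m.gpu_memory_mb <= memory_budget_mb
    ]
    return max(feasible, key=lambda m: m.throughput_infer_per_sec, default=None)


# Made-up numbers for illustration only; these are not real benchmark results.
SAMPLE = [
    Measurement(1, 1, 120.0, 9.0, 900.0),
    Measurement(4, 1, 310.0, 14.0, 1100.0),
    Measurement(8, 2, 540.0, 31.0, 2100.0),
    Measurement(16, 2, 600.0, 55.0, 2600.0),
]

if __name__ == "__main__":
    choice = best_config(SAMPLE, latency_budget_ms=40.0, memory_budget_mb=2500.0)
    print(choice)
```

With the sample data, the largest configuration has the highest raw throughput but exceeds both budgets, so the sketch selects the batch-size-8, two-instance candidate instead. A real analysis would obtain these measurements by profiling the model on the target GPU or CPU rather than from a hard-coded list.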