Find the right license to deploy, run, and scale AI for any application on any platform.
For individuals looking to access the open-source Triton Inference Server code for development.
For individuals looking to access free Triton Inference Server containers for development.
For enterprises looking to purchase Triton for production.
NVIDIA Triton Inference Server, or Triton for short, is an open-source inference serving software. It lets teams deploy, run, and scale AI models from any framework (TensorFlow, NVIDIA TensorRT™, PyTorch, ONNX, XGBoost, Python, custom, and more) on any GPU- or CPU-based infrastructure (cloud, data center, or edge). For more information, please visit the Triton webpage.
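Clients send inference requests to Triton over HTTP/REST or gRPC using the KServe v2 predict protocol. As a minimal sketch, the snippet below assembles the JSON body for the HTTP endpoint; the model name, input name, shape, and data are placeholders for illustration.

```python
import json

def build_infer_request(input_name, shape, datatype, data):
    """Assemble the JSON body for POST /v2/models/<model>/infer
    (KServe v2 predict protocol, as served by Triton's HTTP endpoint)."""
    return json.dumps({
        "inputs": [
            {
                "name": input_name,       # must match the model config
                "shape": shape,           # e.g., [batch, features]
                "datatype": datatype,     # e.g., "FP32", "INT64"
                "data": data,             # flattened tensor values
            }
        ]
    })

# Hypothetical input tensor for a model with a [1, 4] FP32 input.
body = build_infer_request("INPUT0", [1, 4], "FP32", [0.1, 0.2, 0.3, 0.4])
```

In practice, the official `tritonclient` Python package wraps this protocol (and the gRPC equivalent) so applications rarely build the body by hand.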
Triton Model Analyzer is an offline tool for optimizing inference deployment configurations (batch size, number of model instances, etc.) for throughput, latency, and/or memory constraints on the target GPU or CPU. It supports analysis of a single model, model ensembles, and multiple concurrent models.
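Model Analyzer is typically driven from the command line or a YAML configuration file that names the models to profile and bounds the search space. The fragment below is a hedged sketch of such a config: the repository path and model name are placeholders, and exact option names can vary between Model Analyzer versions.

```yaml
# Hypothetical Model Analyzer config sketch (option names may
# differ by version; paths and model names are placeholders).
model_repository: /models          # Triton model repository to analyze
profile_models:
  - my_model                       # model(s) to sweep configurations for
run_config_search_max_concurrency: 16     # cap on request concurrency sweep
run_config_search_max_instance_count: 4   # cap on model instances per GPU
```

Given a config like this, Model Analyzer sweeps batch sizes, instance counts, and concurrency levels, then reports the configurations that best satisfy the stated throughput, latency, or memory constraints.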
Triton is included with NVIDIA AI Enterprise, an end-to-end AI software platform with enterprise-grade support, security, stability, and manageability. NVIDIA AI Enterprise includes Business Standard Support, which provides access to NVIDIA AI experts, customer training, knowledge base resources, and more. Additional enterprise support and services are also available, including business-critical support, a dedicated technical account manager, training, and professional services. For more information, please visit the Enterprise Support and Services User Guide.
Yes, several labs in NVIDIA LaunchPad use Triton.
NVIDIA LaunchPad is a program that provides short-term access to enterprise NVIDIA hardware and software via a web browser. Select from a large catalog of hands-on labs covering use cases from AI and data science to 3D design and infrastructure optimization. Enterprises can immediately tap into the necessary hardware and software stacks on privately hosted infrastructure.
Yes, Triton is broadly integrated across the AI inference ecosystem. Triton is available in the AWS, Microsoft Azure, and Google Cloud marketplaces with NVIDIA AI Enterprise. It’s also available in Alibaba Cloud, Amazon Elastic Kubernetes Service (EKS), Amazon Elastic Container Service (ECS), Amazon SageMaker, Google Kubernetes Engine (GKE), Google Vertex AI, HPE Ezmeral, Microsoft Azure Kubernetes Service (AKS), Azure Machine Learning, and Oracle Cloud Infrastructure Data Science Platform.