Accelerating and Scaling Inference with NVIDIA GPUs

, Senior Deep Learning Solution Architect, NVIDIA
Highly Rated

Learn how to use GPUs to deploy machine learning models at production scale with the Triton Inference Server. At scale, machine learning models can interact with millions of users a day. As usage grows, the cost in both money and engineering time can prevent models from reaching their full potential. These are the kinds of challenges that inspired the creation of Machine Learning Operations (MLOps). Practice MLOps by:

- Deploying neural networks from a variety of frameworks onto a live Triton server
- Measuring GPU usage and other metrics with Prometheus
- Sending asynchronous requests to maximize throughput

Illustrative sketches of each of these steps follow below. Upon completion, learners will be able to deploy their own machine learning models on a GPU server.

Prerequisite(s): Familiarity with at least one machine learning framework, such as PyTorch, TensorFlow, ONNX, or TensorRT. Familiarity with Docker is recommended but not required.
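For the deployment step, here is a minimal sketch of the on-disk model repository that a Triton server loads at startup. The model name my_model, the ONNX backend, and the tensor names, shapes, and data types are illustrative assumptions; they must match the model you actually export.

```python
# Hypothetical sketch: lay out a minimal Triton model repository for an ONNX model.
# Triton's real directory convention: <repo>/<model_name>/<version>/model.onnx
# alongside a config.pbtxt describing the model's inputs and outputs.
from pathlib import Path

repo = Path("model_repository")        # served via: tritonserver --model-repository=...
model_dir = repo / "my_model" / "1"    # "my_model" and version "1" are placeholders
model_dir.mkdir(parents=True, exist_ok=True)

# The fields below follow Triton's model-configuration schema; the tensor names,
# dims, and dtypes are assumptions for illustration and must match your model.
config = """\
name: "my_model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  { name: "input__0", data_type: TYPE_FP32, dims: [ 3, 224, 224 ] }
]
output [
  { name: "output__0", data_type: TYPE_FP32, dims: [ 1000 ] }
]
"""
(repo / "my_model" / "config.pbtxt").write_text(config)
# Finally, copy your exported model file into place as model_dir / "model.onnx".
```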
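For the monitoring step: Triton exposes Prometheus-format metrics on port 8002 by default, so a Prometheus server can scrape that endpoint directly. As a quick sanity check without a full Prometheus deployment, a sketch like this reads the same endpoint by hand (localhost and a default server configuration are assumed):

```python
# Hypothetical sketch: read the Prometheus-format metrics Triton serves on
# port 8002 by default -- the same endpoint a Prometheus scraper would poll.
import requests

text = requests.get("http://localhost:8002/metrics").text
for line in text.splitlines():
    # Triton metric families include nv_gpu_utilization, nv_gpu_memory_used_bytes,
    # and nv_inference_count, among others.
    if line.startswith(("nv_gpu_utilization", "nv_inference_count")):
        print(line)
```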
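For the throughput step, a sketch of an asynchronous client using tritonclient's HTTP API: async_infer returns immediately with a handle, so many requests can be in flight at once instead of each one blocking on the previous response. The server URL, model name, and tensor names carry over from the assumptions above.

```python
# Hypothetical sketch: issue concurrent inference requests with tritonclient's
# async HTTP API. Assumes a server on localhost:8000 serving "my_model" with
# the tensor names used in the config sketch above.
import numpy as np
import tritonclient.http as httpclient

# concurrency sets the connection pool size used for async requests
client = httpclient.InferenceServerClient(url="localhost:8000", concurrency=8)

batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
inputs = [httpclient.InferInput("input__0", list(batch.shape), "FP32")]
inputs[0].set_data_from_numpy(batch)
outputs = [httpclient.InferRequestedOutput("output__0")]

# Fire off many requests without waiting for each response...
handles = [
    client.async_infer("my_model", inputs=inputs, outputs=outputs)
    for _ in range(32)
]
# ...then collect the results as they complete.
for handle in handles:
    result = handle.get_result()
    print(result.as_numpy("output__0").shape)
```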

Explore more training options offered by the NVIDIA Deep Learning Institute (DLI). Choose from an extensive catalog of self-paced online courses or instructor-led virtual workshops to help you develop key skills in AI, HPC, graphics & simulation, and more.

Event: GTC Digital September
Date: September 2022
Industry: All Industries
Topic: Deep Learning - Inference
Level: Intermediate Technical
Language: English
Location: