Accelerated LLM Model Alignment and Deployment in NeMo, TensorRT-LLM, and Triton Inference Server
, Deep Learning Solutions Architect, NVIDIA
, Deep Learning Solutions Architect, NVIDIA
The demand for accelerated large language models (LLMs) has surged with the growing popularity of generative models. These models, often comprising billions of parameters, hold immense potential but also pose challenges for large-scale deployment. Join us as we delve into accelerated LLM alignment with the NeMo Framework, and inference optimization and deployment with NVIDIA TensorRT-LLM and Triton Inference Server. We'll spotlight supervised fine-tuning (SFT) and parameter-efficient fine-tuning (PEFT) with LoRA, key techniques for LLM alignment. We'll also unpack inference optimization with TensorRT-LLM, highlighting KV caching, paged attention, and in-flight batching, and the pivotal role these play in making LLMs faster and more cost-effective. Finally, we'll walk through the crucial steps of fine-tuning, optimizing, and deploying a LLaMA model in a production environment using Triton Inference Server.
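To make the LoRA idea concrete, here is a minimal, framework-agnostic sketch in plain PyTorch. This is not NeMo's actual implementation; the class name `LoRALinear` and the hyperparameters `r` and `lora_alpha` are illustrative. It shows the core mechanism: the pretrained weight is frozen, and only a low-rank update is trained.

```python
# Minimal LoRA sketch in plain PyTorch -- illustrative only, not NeMo's
# actual implementation. Names (LoRALinear, r, lora_alpha) are hypothetical.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen linear layer with a trainable low-rank update:
    h = W x + (alpha / r) * B A x, where A is (r x d_in) and B is (d_out x r)."""
    def __init__(self, base: nn.Linear, r: int = 8, lora_alpha: int = 16):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # freeze pretrained weights
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init => no change at step 0
        self.scaling = lora_alpha / r

    def forward(self, x):
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

# Usage: only the low-rank A/B matrices receive gradients.
layer = LoRALinear(nn.Linear(4096, 4096), r=8, lora_alpha=16)
y = layer(torch.randn(2, 4096))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(y.shape, trainable)  # torch.Size([2, 4096]) 65536
```

With rank 8, the trainable parameters shrink from 4096 × 4096 (~16.8M) to 2 × 8 × 4096 (~65K) per layer, which is what makes PEFT fine-tuning so cheap relative to full SFT.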
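KV caching is the other idea worth seeing in miniature. The toy single-head decode loop below is a conceptual sketch, not TensorRT-LLM's implementation (which layers paged attention and in-flight batching on top of this idea): during autoregressive generation, keys and values for past tokens are cached, so each step computes K/V only for the new token instead of re-encoding the whole prefix.

```python
# Toy single-head decoder step illustrating KV caching -- a conceptual
# sketch, not TensorRT-LLM's implementation.
import math
import torch

d = 64
Wq, Wk, Wv = torch.randn(d, d), torch.randn(d, d), torch.randn(d, d)
k_cache, v_cache = [], []  # grows by one entry per generated token

def decode_step(x_t):
    """Attend the new token against all cached keys/values; per-step cost
    scales with sequence length instead of recomputing K/V for the prefix."""
    q = x_t @ Wq
    k_cache.append(x_t @ Wk)  # compute K/V for the new token only
    v_cache.append(x_t @ Wv)
    K = torch.stack(k_cache)  # (seq_len, d)
    V = torch.stack(v_cache)
    attn = torch.softmax(q @ K.T / math.sqrt(d), dim=-1)
    return attn @ V

for _ in range(5):                  # generate 5 tokens
    out = decode_step(torch.randn(d))
print(out.shape, len(k_cache))      # torch.Size([64]) 5
```

Paged attention extends this by storing the cache in fixed-size blocks (analogous to virtual-memory pages) so memory can be allocated on demand, and in-flight batching lets new requests join a running batch as earlier ones finish.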
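Once a model is deployed behind Triton, clients query it over HTTP or gRPC. Below is a hedged sketch using Triton's Python HTTP client; the tensor names (`text_input`, `max_tokens`, `text_output`) and the model name `ensemble` follow the tensorrtllm_backend ensemble examples and may differ in your model repository.

```python
# Hedged sketch of querying a Triton-deployed LLM via the HTTP client.
# Tensor and model names are assumptions based on the tensorrtllm_backend
# ensemble examples; adjust them to match your model configuration.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

prompt = np.array([["Explain KV caching in one sentence."]], dtype=object)
max_tokens = np.array([[64]], dtype=np.int32)

inputs = [
    httpclient.InferInput("text_input", prompt.shape, "BYTES"),
    httpclient.InferInput("max_tokens", max_tokens.shape, "INT32"),
]
inputs[0].set_data_from_numpy(prompt)
inputs[1].set_data_from_numpy(max_tokens)

# "ensemble" is the conventional model name in the tensorrtllm_backend
# examples; replace it with the name in your model repository.
result = client.infer(model_name="ensemble", inputs=inputs)
print(result.as_numpy("text_output"))
```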
Prerequisite(s):
Familiarity with Python, large language models, and deep learning frameworks