Name: From Zero to Millions: Scaling Large Language Model Inference With TensorRT-LLM | GTC 24 2024 | NVIDIA On-Demand
Uploaded: 2024-03-20T16:00:00Z
Duration: 1465 s
Description: We'll give an overview of how we successfully utilized TensorRT-LLM to deploy large language models at scale, thereby supporting millions of users at Perplexity.

We'll give an overview of how we successfully utilized TensorRT-LLM to deploy large language models at scale, thereby supporting millions of users at Perplexity.

NVIDIA technology: CUDA, cuDNN, Hopper, NVLink / NVSwitch

Language: English
