Scaling and Optimizing Your LLM Pipeline for End-to-End Efficiency
, Product Manager, AI on Google Kubernetes Engine, Google
, Cloud Architect, Google
Are you having trouble getting large language models (LLMs) to work in your organization? You're not alone. In this session, we'll look at how to deploy an open-source LLM on Google Kubernetes Engine (GKE). We'll show data scientists and machine learning engineers how to use NVIDIA NeMo and TensorRT-LLM (TRT-LLM) with notebooks running on GKE, and how GKE's orchestration capabilities make running AI workloads efficient and convenient. We'll also demonstrate how to train and tune an LLM using NeMo, and give a live technical demo of how data science teams can serve these models for inference on GPUs with TRT-LLM and GKE.