PyTorch Model Optimization and Deployment for High-Performance Inference
Product Manager, NVIDIA
Software Engineer, Meta
Software Engineer, Meta
Learn about optimizing and deploying dynamic PyTorch models in Python for production. We’ll cover how Meta and NVIDIA are working together to provide improved flexibility, usability, and performance on NVIDIA GPUs by applying Torch FX inside Torch-TensorRT. We’ll dive into how FX is used under the hood of Torch-TensorRT and show participants how to accelerate their PyTorch models purely in Python, creating a flexible deployment package that runs anywhere from the cloud to the edge. Participants will then see real-world use cases where this approach is used in production today. Finally, we’ll highlight future plans for PyTorch inference on NVIDIA GPUs to continue improving performance, operator coverage, and ease of use for the entire PyTorch community.
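To make the FX-based workflow mentioned above concrete, here is a minimal sketch of graph capture with `torch.fx`, the PyTorch tracing machinery that the Torch-TensorRT FX path builds on. The `TinyModel` module is a made-up example; the commented-out `torch_tensorrt.compile` call at the end is illustrative and requires an NVIDIA GPU plus the `torch_tensorrt` package, so it is not executed here.

```python
import torch
import torch.fx

# A small example model to trace (hypothetical, for illustration).
class TinyModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(4, 2)

    def forward(self, x):
        return torch.relu(self.linear(x))

model = TinyModel().eval()

# torch.fx captures the model as a graph of operations purely in Python;
# Torch-TensorRT's FX path lowers such graphs to TensorRT engines.
traced = torch.fx.symbolic_trace(model)
print(traced.graph)  # human-readable op-by-op representation

# The traced module is numerically equivalent to the original.
x = torch.randn(1, 4)
assert torch.allclose(model(x), traced(x))

# Sketch of the Torch-TensorRT compile step (not run here; needs a GPU
# and the torch_tensorrt package installed):
# import torch_tensorrt
# trt_model = torch_tensorrt.compile(
#     model,
#     ir="fx",
#     inputs=[torch_tensorrt.Input((1, 4))],
#     enabled_precisions={torch.float},
# )
```

Because the capture and compile steps are plain Python calls, the same script can target a datacenter GPU or an edge device without leaving the PyTorch ecosystem.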