Accelerate Deep Learning Inference in Production with TensorRT
TensorRT is an SDK for high-performance deep learning inference, used in production to minimize latency and maximize throughput. The latest generation of TensorRT provides a new compiler that accelerates specific workloads on NVIDIA GPUs. A deep learning compiler needs a robust method to import, optimize, and deploy models. We'll show a workflow for accelerating models from frameworks including PyTorch and TensorFlow, as well as models in the ONNX format. New users can learn the standard workflow, while experienced users can pick up tips and tricks for optimizing specific use cases.