Accelerate Deep Learning Inference with TensorRT 8.0
NVIDIA
TensorRT is an SDK for high-performance deep learning inference, used in production to minimize latency and maximize throughput. The upcoming TensorRT 8.0 release adds features such as sparsity optimizations for NVIDIA Ampere GPUs, quantization-aware training, and an enhanced compiler that accelerates transformer-based networks. A deep learning compiler needs a robust method to import, optimize, and deploy models. New users can learn the common workflow, while experienced users can dive deeper into the new TensorRT 8.0 features.
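As a rough illustration of that import/optimize/deploy workflow, the sketch below builds a serialized engine from an ONNX model with TensorRT's Python API. It is a minimal sketch, not the session's material: the ONNX file path, the FP16 toggle, and the 1 GiB workspace size are assumptions to adapt to your model, and the import is guarded because TensorRT requires an NVIDIA GPU environment.

```python
# Minimal sketch of the import -> optimize -> deploy workflow with the
# TensorRT Python API (TensorRT 8.x style). Assumptions: an ONNX model on
# disk, and that FP16 is acceptable for your accuracy target.
try:
    import tensorrt as trt
    HAVE_TENSORRT = True
except ImportError:  # TensorRT (and a GPU) may not be installed here
    HAVE_TENSORRT = False

def build_engine(onnx_path: str, fp16: bool = True) -> bytes:
    """Import an ONNX model, optimize it, and return a serialized engine."""
    logger = trt.Logger(trt.Logger.WARNING)
    builder = trt.Builder(logger)
    # ONNX models require an explicit-batch network definition.
    flags = 1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
    network = builder.create_network(flags)
    parser = trt.OnnxParser(network, logger)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            raise RuntimeError(parser.get_error(0))
    config = builder.create_builder_config()
    config.max_workspace_size = 1 << 30  # 1 GiB of builder scratch space
    if fp16 and builder.platform_has_fast_fp16:
        config.set_flag(trt.BuilderFlag.FP16)
    # build_serialized_network returns the optimized engine as a plan blob,
    # the entry point introduced in TensorRT 8.0.
    return builder.build_serialized_network(network, config)
```

At deployment time, the returned plan would be deserialized with `trt.Runtime(logger).deserialize_cuda_engine(plan)` and executed through an execution context.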