Prototyping and Debugging Deep Learning Inference Models Using TensorRT's ONNX-Graphsurgeon and Polygraphy Tools
, NVIDIA
Deep learning researchers and engineers usually have to spend a significant amount of time debugging accuracy and performance of their deep learning inference models before deploying them. TensorRT recently open-sourced some more tools to assist with the development and debugging of deep neural networks for inference. ONNX GraphSurgeon is a tool that allows you to easily generate new ONNX graphs, or modify existing ones. This can be useful in scenarios like using custom implementations for parts of the ONNX graph, in place of those provided by TensorRT. Polygraphy is a toolkit designed to assist in running and debugging deep learning models in various frameworks. It includes a Python API and several command-line tools built using this API. These tools allow displaying information about models, such as network structure; determining which layers of a TensorRT network need to be run in a higher precision for accuracy; and comparing inference results across frameworks, among other features.