PyTorch-Direct: Introducing Deep Learning Framework with GPU-Centric Data Access for Faster Large GNN Training
, University of Illinois at Urbana-Champaign
, University of Illinois at Urbana-Champaign
We'll introduce PyTorch-Direct, an extension to the PyTorch framework that enables efficient host-memory access with complicated data-access patterns. PyTorch-Direct mainly targets training very large graph neural networks (GNNs), where the input features cannot fit into GPU memory. The existing PyTorch implementation of out-of-GPU-memory GNN training relies on CPUs to gather features into a single contiguous buffer and then launch DMA API calls. Instead, PyTorch-Direct uses the NVIDIA GPU's zero-copy memory access feature to directly access the features scattered across host memory. Learn how several interesting optimizations in PyTorch-Direct can significantly reduce overall GNN training time. Existing PyTorch applications, including those based on DGL, can benefit from PyTorch-Direct by changing two lines of code per object that needs to be placed in host memory.
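The CPU-centric baseline that the abstract describes can be sketched in plain PyTorch: the feature table stays in host memory, the CPU gathers the scattered rows of a mini-batch into one contiguous buffer, and only that buffer is sent to the GPU via a DMA transfer. The tensor sizes and variable names below are illustrative assumptions, not part of PyTorch-Direct's API.

```python
import torch

num_nodes, feat_dim = 10_000, 128

# Full feature table resides in host (CPU) memory.
features = torch.randn(num_nodes, feat_dim)
if torch.cuda.is_available():
    # Pinning host memory enables asynchronous DMA transfers.
    features = features.pin_memory()

# Node indices sampled for one mini-batch, scattered across the table.
batch_idx = torch.randint(0, num_nodes, (512,))

# CPU-side gather into a single contiguous buffer...
gathered = features.index_select(0, batch_idx)

# ...followed by one bulk DMA copy (falls back to CPU without a GPU).
device = "cuda" if torch.cuda.is_available() else "cpu"
batch_features = gathered.to(device, non_blocking=True)
```

PyTorch-Direct removes the CPU gather step: with zero-copy access, GPU threads read the scattered feature rows from host memory directly, so no contiguous staging buffer is needed.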