Learn how leading open models and software are continuously optimized and accelerated for peak performance by NVIDIA’s full-stack inference solutions.
Do you need to compute at a larger scale, or faster, than a single GPU allows, but no multi-GPU library provides the functionality you need? Learn how to scale your application to multiple GPUs and multiple nodes with the available multi-GPU communication libraries. We'll introduce CUDA-aware MPI, NVSHMEM, and NCCL using a real-world application example.
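To give a flavor of the first of these approaches, here is a minimal sketch of the CUDA-aware MPI pattern, in which device pointers are passed directly to MPI calls. It assumes an MPI build with CUDA support (e.g., Open MPI configured with --with-cuda) and one process per GPU; the buffer contents and sizes are illustrative, not taken from the session.

```c
// Minimal CUDA-aware MPI sketch: pass a device pointer directly to
// MPI_Allreduce. A CUDA-aware MPI stages the transfer (or uses
// GPUDirect RDMA) under the hood; no explicit host copy is needed.
#include <mpi.h>
#include <cuda_runtime.h>
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    // Bind each rank to a GPU (simple round-robin over local devices).
    int ndev = 0;
    cudaGetDeviceCount(&ndev);
    cudaSetDevice(rank % ndev);

    // Allocate a buffer in device memory; size is arbitrary here.
    const int n = 1 << 20;
    float *d_buf;
    cudaMalloc(&d_buf, n * sizeof(float));
    cudaMemset(d_buf, 0, n * sizeof(float));  // stand-in for real kernel output

    // The device pointer goes straight into the MPI call.
    MPI_Allreduce(MPI_IN_PLACE, d_buf, n, MPI_FLOAT, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0) printf("allreduce across %d ranks done\n", size);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}
```

NVSHMEM and NCCL address the same scaling problem at different levels: NVSHMEM offers one-sided, GPU-initiated communication over a partitioned global address space, while NCCL provides topology-aware collectives (all-reduce, all-gather, etc.) tuned for NVLink and InfiniBand.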
Important: This session is near capacity; we highly suggest arriving early. Attendees are admitted on a first-come, first-served basis.