Accelerating Applications for NERSC's Perlmutter Supercomputer Using OpenMP and NVIDIA's HPC SDK
Lawrence Berkeley National Laboratory
NVIDIA
Learn about the collaboration between the National Energy Research Scientific Computing Center (NERSC) and NVIDIA to provide production-quality OpenMP target offload compilers on the NERSC-9 Perlmutter supercomputer with NVIDIA A100 GPUs. As a result of the two-year partnership, the NVIDIA HPC SDK now provides significant support for OpenMP target offload in C, C++ and Fortran. We'll review how the compiler was designed to meet the most essential needs of U.S. Department of Energy OpenMP applications. We'll show that the compilers not only meet these needs, but also achieve high performance that's often on par with NVIDIA's highly tuned and mature OpenACC implementation. We'll draw special attention to the minor application code changes we made, including using the OpenMP "loop" construct to obtain high performance on both CPUs and NVIDIA GPUs. We'll conclude with a list of concrete recommendations for OpenMP programmers so that they, too, can obtain high performance on NVIDIA's current GPUs and structure their code in the way most likely to perform well on NVIDIA's future GPUs.