Modern HPC data centers are key to solving some of the world’s most important scientific and engineering challenges. NVIDIA Data Center GPUs fundamentally change the economics of the data center, delivering breakthrough performance with dramatically fewer servers, lower power consumption, and reduced networking overhead, for total cost savings of 5X-10X.

The number of CPU-only servers replaced by a single GPU-accelerated server is called the node replacement factor (NRF). To calculate the NRF, we measure application performance on up to 8 CPU-only servers, then extrapolate linearly beyond 8 servers. The NRF varies by application.
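The NRF calculation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the tooling used to produce the published numbers: the function name `node_replacement_factor`, the interpolation between measured CPU points, and the multi-server CPU figures in the second example are hypothetical. Only the 11.71 ns/day CPU result and the 327 ns/day single-H200 result come from the AMBER [PME-Cellulose_NPT_4fs] data on this page. Published NRF values may be derived from unrounded measurements, so a recomputed value can differ slightly.

```python
from bisect import bisect_left


def node_replacement_factor(cpu_nodes, cpu_perf, gpu_perf):
    """Estimate how many CPU-only servers match one GPU-accelerated server.

    cpu_nodes: measured CPU-only server counts, ascending (e.g. [1, 2, 4, 8]).
    cpu_perf:  throughput measured at each count (bigger is better),
               assumed strictly increasing with server count.
    gpu_perf:  throughput of the GPU-accelerated server, in the same units.

    For time-based metrics (smaller is better), pass 1 / time instead.
    """
    if not cpu_nodes or len(cpu_nodes) != len(cpu_perf):
        raise ValueError("cpu_nodes and cpu_perf must be non-empty and equal length")

    # Outside the measured range, assume throughput scales linearly
    # with the number of servers.
    if gpu_perf >= cpu_perf[-1]:
        return cpu_nodes[-1] * gpu_perf / cpu_perf[-1]
    if gpu_perf <= cpu_perf[0]:
        return cpu_nodes[0] * gpu_perf / cpu_perf[0]

    # Inside the measured range, interpolate between neighbouring points
    # (an assumption; the source does not say how this case is handled).
    i = bisect_left(cpu_perf, gpu_perf)
    n0, n1 = cpu_nodes[i - 1], cpu_nodes[i]
    p0, p1 = cpu_perf[i - 1], cpu_perf[i]
    return n0 + (gpu_perf - p0) / (p1 - p0) * (n1 - n0)


# AMBER [PME-Cellulose_NPT_4fs]: one CPU-only server at 11.71 ns/day
# versus one H200 at 327 ns/day (figures taken from this page).
amber_nrf = node_replacement_factor([1], [11.71], 327)
print(f"{amber_nrf:.1f}x")

# Hypothetical CPU scaling data, for illustration only.
nodes = [1, 2, 4, 8]
perf = [10.0, 19.0, 36.0, 64.0]
print(node_replacement_factor(nodes, perf, 50.0))   # inside the measured range
print(node_replacement_factor(nodes, perf, 128.0))  # extrapolated beyond 8 servers
```

Rounded to the nearest integer, the AMBER call gives 28, in line with the 28x listed for that configuration.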


Detailed H200 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

24-AT_24

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

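| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 11.71 | 327 | 652 | 1,333 | 2,664 | 293 | 588 | 1,176 | 2,359 |
| AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 28x | 56x | 114x | 227x | 25x | 50x | 100x | 201x |
| AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 11.69 | 330 | 669 | 1,395 | 2,782 | 299 | 596 | 1,193 | 2,398 |
| AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 28x | 57x | 119x | 238x | 26x | 51x | 102x | 205x |
| AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 93.36 | 1,406 | 2,852 | 5,690 | 12,468 | 1,263 | 2,527 | 5,055 | 10,180 |
| AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 15x | 31x | 61x | 134x | 14x | 27x | 54x | 109x |
| AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 99.50 | 1,430 | 2,897 | 5,863 | 11,854 | 1,289 | 2,581 | 5,226 | 10,422 |
| AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 14x | 29x | 59x | 119x | 13x | 26x | 53x | 105x |
| AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 377.04 | 4,689 | 9,485 | 19,479 | 37,687 | 4,250 | 8,422 | 17,056 | 31,382 |
| AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 12x | 25x | 52x | 100x | 11x | 22x | 45x | 83x |
| AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 397.04 | 4,851 | 9,692 | 19,759 | 38,246 | 4,337 | 8,640 | 17,269 | 32,541 |
| AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 12x | 24x | 50x | 96x | 11x | 22x | 43x | 82x |
| AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.69 | 94 | 187 | 375 | 749 | 91 | 182 | 364 | 728 |
| AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 25x | 51x | 102x | 203x | 25x | 49x | 99x | 197x |
| AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.07 | 200 | 400 | 799 | 1,599 | 182 | 364 | 728 | 1,456 |
| AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 8x | 16x | 32x | 64x | 7x | 15x | 29x | 58x |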

AMBER is measured by running multiple independent instances using MPS (NVIDIA Multi-Process Service).


Chroma

Physics

Lattice Quantum Chromodynamics (LQCD)

VERSION

V2025.01

ACCELERATED FEATURES

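  • Wilson-clover fermions, Krylov solvers, Domain-decomposition

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chroma | Final Timestep Time (Sec) | HMC Medium | no | 10,037 | 153 | 88 | 53 | 35 | 160 | 93 | 59 | 46 |
| Chroma | NRF | HMC Medium | yes | 1x | 65x | 116x | 193x | 289x | 63x | 110x | 175x | 224x |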

FUN3D

Engineering

Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

VERSION

14.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

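| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 112 | 24 | 14 | 9 | 7 | 25 | 15 | 9 | 8 |
| Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 5x | 10x | 16x | 20x | 4x | 10x | 15x | 18x |
| Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 154 | 33 | 19 | 11 | 8 | 35 | 20 | 11 | 9 |
| Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 5x | 13x | 23x | 31x | 4x | 12x | 21x | 26x |
| Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 474 | 91 | 48 | 26 | 16 | 97 | 51 | 28 | 19 |
| Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 5x | 15x | 27x | 44x | 5x | 14x | 25x | 36x |
| Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 628 | - | - | 36 | 20 | - | - | 38 | 25 |
| Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 23x | 41x | - | - | 22x | 34x |
| Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 2,011 | - | - | 102 | 54 | - | - | 109 | 65 |
| Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | 29x | 55x | - | - | 27x | 45x |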

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

h-bond - 2025-rc

ACCELERATED FEATURES

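  • Implicit (5x), Explicit (2x) Solvent

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 362 | 857 | 1,627 | 2,673 | 5,330 | 773 | 1,450 | 2,700 | 5,430 |
| GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 4x | 7x | 15x | 2x | 4x | 7x | 15x |
| GROMACS [STMV] | ns/day | STMV | yes | 20 | 44 | 76 | 131 | 198 | 41 | 70 | 123 | 153 |
| GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 4x | 8x | 13x | 2x | 3x | 7x | 10x |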

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

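| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GTC | Mpush/Sec | mpi#proc.in | yes | 146 | 821 | 1,532 | 2,999 | 5,408 | 749 | 1,407 | 2,691 | 4,780 |
| GTC | NRF | mpi#proc.in | yes | 1x | 6x | 11x | 22x | 40x | 5x | 10x | 20x | 35x |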

ICON

Weather and Climate

A global unified atmosphere model for numerical weather prediction and climate modeling research

VERSION

2024.8_RC

ACCELERATED FEATURES

  • Full model of dynamics and physics

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://code.mpimet.mpg.de/projects/iconpublic

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 587 | 171 | 141 | 113 | 98 | 180 | 148 | 116 |
| ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 3x | 4x | 5x | 6x | 3x | 4x | 5x |
| ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | QUBICC 160 km resolution | no | 466 | 143 | 102 | 79 | 67 | 150 | 107 | 82 |
| ICON [QUBICC 160 km resolution] | NRF | QUBICC 160 km resolution | yes | 1x | 3x | 5x | 6x | 7x | 3x | 4x | 6x |

LAMMPS

Molecular Dynamics

Classical molecular dynamics package

VERSION

patch_4Feb2025

ACCELERATED FEATURES

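  • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.95E+08 | 1.44E+09 | 2.69E+09 | 4.72E+09 | 7.80E+09 | 1.32E+09 | 2.45E+09 | 3.78E+09 | 6.33E+09 |
| LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 4x | 7x | 13x | 21x | 3x | 6x | 10x | 17x |
| LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.44E+08 | 5.75E+08 | 1.09E+09 | 1.95E+09 | 3.17E+09 | 5.28E+08 | 1.00E+09 | 1.70E+09 | 2.52E+09 |
| LAMMPS [EAM] | NRF | EAM | yes | 1x | 4x | 8x | 14x | 23x | 4x | 7x | 12x | 18x |
| LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.93E+06 | 1.15E+07 | 2.05E+07 | 3.33E+07 | 4.96E+07 | 1.06E+07 | 1.91E+07 | 2.98E+07 | 4.32E+07 |
| LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 6x | 16x | 26x | 38x | 6x | 15x | 23x | 33x |
| LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.61E+06 | 4.24E+06 | 8.49E+06 | 1.69E+07 | 3.36E+07 | 3.88E+06 | 7.71E+06 | 1.53E+07 | 3.05E+07 |
| LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 3x | 7x | 12x | 24x | 2x | 6x | 11x | 22x |
| LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.21E+08 | 1.03E+09 | 1.91E+09 | 3.46E+09 | 5.89E+09 | 9.43E+08 | 1.75E+09 | 3.04E+09 | 4.93E+09 |
| LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 5x | 10x | 18x | 31x | 4x | 9x | 16x | 26x |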

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_cde2498

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

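| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MILC | Total Time (sec) | Apex Medium | no | 13,735 | 981 | 534 | 305 | 191 | 1,018 | 580 | 334 | 263 |
| MILC | NRF | Apex Medium | yes | 1x | 14x | 23x | 40x | 64x | 13x | 21x | 37x | 46x |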

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

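| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NAMD [LaINDY ColVars] | ns/day | LaINDY ColVars | yes | 50.56 | 89 | 177 | 352 | 698 | 84 | 164 | 327 | 651 |
| NAMD [LaINDY ColVars] | NRF | LaINDY ColVars | yes | 1x | 2x | 4x | 7x | 14x | 2x | 3x | 6x | 13x |
| NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 108.79 | 392 | 784 | 1,545 | 3,017 | 357 | 700 | 1,414 | 2,804 |
| NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 7x | 14x | 28x | 3x | 6x | 13x | 26x |
| NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 10.53 | 25 | 51 | 102 | 203 | 23 | 46 | 93 | 185 |
| NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 2x | 5x | 10x | 19x | 2x | 4x | 9x | 18x |
| NAMD [COVID-19 Spike Assembly] | ns/day | COVID-19 Spike Assembly | yes | 0.75 | 3 | 6 | 11 | 18 | 3 | 5 | 8 | - |
| NAMD [COVID-19 Spike Assembly] | NRF | COVID-19 Spike Assembly | yes | 1x | 4x | 8x | 15x | 24x | 4x | 6x | 11x | - |
| NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 10.87 | 32 | 64 | 128 | 257 | 29 | 58 | 116 | 232 |
| NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 3x | 6x | 12x | 24x | 3x | 5x | 11x | 21x |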

NAMD is measured by running multiple independent instances using MPS, except for the NAMD [COVID-19 Spike Assembly] dataset.
Trifan A, Gorgun D, Salim M, et al. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications. 2022;36(5-6):603-623. doi:10.1177/10943420221113513
D. B. Sauer, N. Trebesch, J. J. Marden, N. Cocco, J. Song, A. Koide, S. Koide, E. Tajkhorshid, and D.-N. Wang. "Structural basis for the reaction cycle of DASS dicarboxylate transporters." eLife. 9, e61350 (2020). https://doi.org/10.7554/eLife.61350


Quantum Espresso

Material Science (Quantum Chemistry)

An open-source suite of computer codes for electronic structure calculations and materials modeling at the nanoscale

VERSION

V7.4

ACCELERATED FEATURES

  • linear algebra (matrix multiply)
  • explicit computational kernels
  • 3D FFTs

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

http://www.quantum-espresso.org

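| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 2x H200 | 4x H200 | 8x H200 | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Quantum Espresso | Total CPU Time (Sec) | GRIR443 | no | 784 | 114 | 89 | 50 | 116 | 77 | 54 |
| Quantum Espresso | NRF | GRIR443 | yes | 1x | 12x | 16x | 28x | 12x | 19x | 26x |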

RELION

Microscopy

Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

VERSION

5.0.0

ACCELERATED FEATURES

  • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosome on Relion-3.0 | no | 8,981 | 2,355 | 1,231 | 1,051 | 2,355 | 1,231 | 1,051 | 977 |
| Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosome on Relion-3.0 | yes | 1x | 4x | 7x | 9x | 4x | 7x | 9x | 9x |

RTM

Geoscience

Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

VERSION

nvidia_2024_01

ACCELERATED FEATURES

  • Batch algorithm

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

http://www.tsunamidevelopment.com/assets/rtm.pdf

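| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTM [Isotropic Radius 4] | Mcell/s | Isotropic Radius 4 | yes | 21,047 | 194,141 | 385,616 | 770,672 | 1,545,937 | 184,860 | 368,249 | 736,485 | 1,476,228 |
| RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 9x | 18x | 37x | 73x | 9x | 17x | 35x | 70x |
| RTM [TTI Radius 8 1-pass] | Mcell/s | TTI Radius 8 1-pass | yes | 7,213 | 31,581 | 62,562 | 125,334 | 250,342 | 25,816 | 51,604 | 103,104 | 205,718 |
| RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 4x | 9x | 17x | 35x | 4x | 7x | 14x | 29x |
| RTM [TTI RX 2Pass mgpu] | Mcell/s | TTI RX 2Pass mgpu | yes | 7,213 | 30,527 | 59,893 | 119,536 | 238,880 | 28,738 | 57,080 | 113,564 | 227,150 |
| RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 4x | 8x | 17x | 33x | 4x | 8x | 16x | 31x |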

SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

4.1.1

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

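| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H200 | 2x H200 | 4x H200 | 8x H200 | 1x H200 NVL | 2x H200 NVL | 4x H200 NVL | 8x H200 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 186 | 38 | 21 | 12 | 9 | 41 | 22 | 12 | 8 |
| SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 5x | 10x | 18x | 24x | 4x | 9x | 17x | 25x |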


Detailed GH200 96GB application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

24-AT_24

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 10.40 | 305 | 1,296 |
| AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 29x | 125x |
| AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 10.43 | 307 | 1,302 |
| AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 29x | 125x |
| AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 82.11 | 1,339 | 5,510 |
| AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 16x | 67x |
| AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 90.62 | 1,370 | 5,642 |
| AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 15x | 62x |
| AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 358.07 | 4,827 | 18,286 |
| AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 13x | 51x |
| AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 365.31 | 4,916 | 18,673 |
| AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 13x | 51x |
| AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.28 | 101 | - |
| AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 31x | - |
| AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 23.08 | 205 | - |
| AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 9x | - |

AMBER is measured by running multiple independent instances using MPS


Chroma

Physics

Lattice Quantum Chromodynamics (LQCD)

VERSION

V2024.10

ACCELERATED FEATURES

  • Wilson-clover fermions, Krylov solvers, Domain-decomposition

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| Chroma | Final Timestep Time (Sec) | HMC Medium | no | 9,240 | 164 | 61 |
| Chroma | NRF | HMC Medium | yes | 1x | 58x | 155x |

FUN3D

Engineering

Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

VERSION

14.0.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 127 | 24 | 10 |
| Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 7x | 17x |
| Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 179 | 36 | 13 |
| Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 8x | 21x |
| Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 498 | 105 | 38 |
| Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 7x | 19x |
| Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 682 | - | 48 |
| Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | 19x |
| Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 2,155 | - | 138 |
| Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | 23x |

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

2024.3

ACCELERATED FEATURES

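  • Implicit (5x), Explicit (2x) Solvent

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 370 | 834 | 3,293 |
| GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 9x |
| GROMACS [STMV] | ns/day | STMV | yes | 19 | 47 | 120 |
| GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 8x |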

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

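| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| GTC | Mpush/Sec | mpi#proc.in | yes | 136 | 812 | 2,874 |
| GTC | NRF | mpi#proc.in | yes | 1x | 6x | 22x |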

ICON

Weather and Climate

A global unified atmosphere model for numerical weather prediction and climate modeling research

VERSION

2024.8_RC

ACCELERATED FEATURES

  • Full model of dynamics and physics

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://code.mpimet.mpg.de/projects/iconpublic

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| ICON [SLAM 191 - 160KM - no radiation] | Integrate_nh (sec) | SLAM 191 levels 160 km resolution without radiation | no | 575 | 175 | 108 |
| ICON [SLAM 191 - 160KM - no radiation] | NRF | SLAM 191 levels 160 km resolution without radiation | yes | 1x | 3x | 5x |
| ICON [QUBICC 160 km resolution] | Integrate_nh (sec) | QUBICC 160 km resolution | no | 459 | 147 | 81 |
| ICON [QUBICC 160 km resolution] | NRF | QUBICC 160 km resolution | yes | 1x | 3x | 6x |

LAMMPS

Molecular Dynamics

Classical molecular dynamics package

VERSION

stable_29Aug2024

ACCELERATED FEATURES

  • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB |
| --- | --- | --- | --- | --- | --- |
| LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.28E+08 | 1.56E+09 |
| LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 5x |
| LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.33E+08 | 6.10E+08 |
| LAMMPS [EAM] | NRF | EAM | yes | 1x | 5x |
| LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.84E+06 | 1.14E+07 |
| LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 9x |
| LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.53E+06 | 3.83E+06 |
| LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 3x |
| LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 1.99E+08 | 1.08E+09 |
| LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 6x |

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_cde2498

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

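| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| MILC | Total Time (sec) | Apex Medium | no | 16,570 | 935 | 306 |
| MILC | NRF | Apex Medium | yes | 1x | 16x | 48x |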

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| NAMD [LaINDY ColVars] | ns/day | LaINDY ColVars | yes | 44.89 | 114 | 441 |
| NAMD [LaINDY ColVars] | NRF | LaINDY ColVars | yes | 1x | 3x | 10x |
| NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 97.16 | 392 | 1,505 |
| NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 15x |
| NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 10.06 | 26 | 102 |
| NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 3x | 10x |
| NAMD [COVID-19 Spike Assembly] | ns/day | COVID-19 Spike Assembly | yes | 0.78 | 3 | 11 |
| NAMD [COVID-19 Spike Assembly] | NRF | COVID-19 Spike Assembly | yes | 1x | 4x | 14x |
| NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 10.49 | 32 | 126 |
| NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 3x | 12x |

NAMD is measured by running multiple independent instances using MPS, except for the NAMD [COVID-19 Spike Assembly] dataset.
Trifan A, Gorgun D, Salim M, et al. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications. 2022;36(5-6):603-623. doi:10.1177/10943420221113513
D. B. Sauer, N. Trebesch, J. J. Marden, N. Cocco, J. Song, A. Koide, S. Koide, E. Tajkhorshid, and D.-N. Wang. "Structural basis for the reaction cycle of DASS dicarboxylate transporters." eLife. 9, e61350 (2020). https://doi.org/10.7554/eLife.61350


RTM

Geoscience

Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

VERSION

nvidia_2024_01

ACCELERATED FEATURES

  • Batch algorithm

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

http://www.tsunamidevelopment.com/assets/rtm.pdf

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| RTM [Isotropic Radius 4] | Mcell/s | Isotropic Radius 4 | yes | 21,047 | 178,321 | 708,595 |
| RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 8x | 34x |
| RTM [TTI Radius 8 1-pass] | Mcell/s | TTI Radius 8 1-pass | yes | 7,213 | 31,584 | 124,223 |
| RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 4x | 17x |
| RTM [TTI RX 2Pass mgpu] | Mcell/s | TTI RX 2Pass mgpu | yes | 7,213 | 29,320 | 115,804 |
| RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 4x | 16x |

SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

4.1.1

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

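| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 1x GH200 96GB | 4x GH200 96GB |
| --- | --- | --- | --- | --- | --- | --- |
| SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 199 | 41 | 13 |
| SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 4x | 18x |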


Detailed H100 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

24-AT_24

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

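| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 11.71 | 308 | 616 | 1,262 | 2,476 | 281 | 555 | 1,109 | 2,456 |
| AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 26x | 53x | 108x | 211x | 24x | 47x | 95x | 210x |
| AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 11.69 | 314 | 629 | 1,269 | 2,595 | 285 | 563 | 1,125 | 2,367 |
| AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 27x | 54x | 109x | 222x | 24x | 48x | 96x | 202x |
| AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 93.36 | 1,335 | 2,664 | 5,397 | 11,295 | 1,236 | 2,454 | 4,898 | 9,766 |
| AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 14x | 29x | 58x | 121x | 13x | 26x | 52x | 105x |
| AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 99.50 | 1,365 | 2,740 | 5,606 | 11,840 | 1,254 | 2,513 | 5,246 | 9,974 |
| AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 14x | 28x | 56x | 119x | 13x | 25x | 53x | 100x |
| AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 377.04 | 4,573 | 9,286 | 18,515 | 36,090 | 4,239 | 8,453 | 17,804 | 32,754 |
| AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 12x | 25x | 49x | 96x | 11x | 22x | 47x | 87x |
| AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 397.04 | 4,729 | 9,395 | 19,265 | 38,119 | 4,293 | 8,528 | 17,029 | 33,107 |
| AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 12x | 24x | 49x | 96x | 11x | 21x | 43x | 83x |
| AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.69 | 89 | 178 | 357 | 713 | 92 | 184 | 368 | 736 |
| AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 24x | 48x | 97x | 193x | 25x | 50x | 100x | 199x |
| AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.07 | 193 | 386 | 771 | 1,543 | 181 | 362 | 723 | 1,446 |
| AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 8x | 15x | 31x | 62x | 7x | 14x | 29x | 58x |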

AMBER is measured by running multiple independent instances using MPS


Chroma

Physics

Lattice Quantum Chromodynamics (LQCD)

VERSION

V2025.01

ACCELERATED FEATURES

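  • Wilson-clover fermions, Krylov solvers, Domain-decomposition

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Chroma | Final Timestep Time (Sec) | HMC Medium | no | 10,037 | 261 | 106 | 63 | 40 | 190 | 109 | 68 | 49 |
| Chroma | NRF | HMC Medium | yes | 1x | 38x | 96x | 164x | 256x | 53x | 94x | 151x | 209x |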

FUN3D

Engineering

Suite of tools actively developed at NASA for Aeronautics and Space Technology by modeling fluid flow

VERSION

14.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

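| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Fun3D [dpw_wbt0_crs-3.6Mn_5] | Loop Time (Sec) | dpw_wbt0_crs-3.6Mn_5 | no | 112 | 27 | 16 | 10 | 8 | 29 | 17 | 10 | 10 |
| Fun3D [dpw_wbt0_crs-3.6Mn_5] | NRF | dpw_wbt0_crs-3.6Mn_5 | yes | 1x | 4x | 9x | 15x | 19x | 4x | 9x | 14x | 15x |
| Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 154 | 38 | 21 | 12 | 8 | 40 | 22 | 12 | 10 |
| Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 4x | 12x | 20x | 29x | 4x | 11x | 20x | 25x |
| Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 474 | 104 | 54 | 29 | 18 | 110 | 58 | 30 | 20 |
| Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 5x | 13x | 24x | 40x | 4x | 12x | 23x | 35x |
| Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 628 | - | - | 41 | 23 | - | - | 43 | 26 |
| Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | - | - | 20x | 37x | - | - | 19x | 33x |
| Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 2,011 | - | - | 116 | 61 | - | - | 125 | 68 |
| Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | - | 25x | 48x | - | - | 24x | 43x |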

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

h-bond - 2025-rc

ACCELERATED FEATURES

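  • Implicit (5x), Explicit (2x) Solvent

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 362 | 823 | 1,540 | 2,700 | 5,295 | 767 | 1,432 | 2,625 | 5,326 |
| GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 4x | 7x | 15x | 2x | 4x | 7x | 15x |
| GROMACS [STMV] | ns/day | STMV | yes | 20 | 44 | 75 | 130 | 200 | 41 | 70 | 121 | 144 |
| GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 4x | 8x | 13x | 2x | 3x | 7x | 9x |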

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

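| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GTC | Mpush/Sec | mpi#proc.in | yes | 146 | 769 | 1,436 | 2,805 | 5,235 | 741 | 1,396 | 2,679 | 4,819 |
| GTC | NRF | mpi#proc.in | yes | 1x | 5x | 10x | 20x | 38x | 5x | 10x | 20x | 35x |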

LAMMPS

Molecular Dynamics

Classical molecular dynamics package

VERSION

patch_4Feb2025

ACCELERATED FEATURES

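  • Lennard-Jones, Gay-Berne, Tersoff, many more potentials

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| LAMMPS [LJ 2.5] | ATOM-Time Steps/s | LJ 2.5 | yes | 3.95E+08 | 1.33E+09 | 2.47E+09 | 4.38E+09 | 7.42E+09 | 1.16E+09 | 1.90E+09 | 3.39E+09 | 6.04E+09 |
| LAMMPS [LJ 2.5] | NRF | LJ 2.5 | yes | 1x | 3x | 6x | 12x | 20x | 3x | 5x | 9x | 16x |
| LAMMPS [EAM] | ATOM-Time Steps/s | EAM | yes | 1.44E+08 | 5.34E+08 | 1.02E+09 | 1.82E+09 | 3.02E+09 | 5.10E+08 | 8.51E+08 | 1.49E+09 | 2.49E+09 |
| LAMMPS [EAM] | NRF | EAM | yes | 1x | 4x | 7x | 13x | 22x | 4x | 6x | 11x | 18x |
| LAMMPS [ReaxFF/C] | ATOM-Time Steps/s | ReaxFF/C | yes | 1.93E+06 | 1.07E+07 | 1.93E+07 | 3.15E+07 | 4.77E+07 | 9.49E+06 | 1.72E+07 | 2.89E+07 | 4.23E+07 |
| LAMMPS [ReaxFF/C] | NRF | ReaxFF/C | yes | 1x | 6x | 15x | 24x | 37x | 5x | 13x | 22x | 33x |
| LAMMPS [SNAP] | ATOM-Time Steps/s | SNAP | yes | 1.61E+06 | 4.16E+06 | 8.35E+06 | 1.65E+07 | 3.29E+07 | 3.65E+06 | 6.37E+06 | 1.20E+07 | 2.68E+07 |
| LAMMPS [SNAP] | NRF | SNAP | yes | 1x | 3x | 7x | 12x | 24x | 2x | 5x | 9x | 19x |
| LAMMPS [Tersoff] | ATOM-Time Steps/s | Tersoff | yes | 2.21E+08 | 1.00E+09 | 1.79E+09 | 3.35E+09 | 5.69E+09 | 8.68E+08 | 1.49E+09 | 2.84E+09 | - |
| LAMMPS [Tersoff] | NRF | Tersoff | yes | 1x | 5x | 10x | 18x | 30x | 4x | 7x | 15x | - |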

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_cde2498

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

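| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| MILC | Total Time (sec) | Apex Medium | no | 13,735 | 1,173 | 632 | 356 | 216 | 1,212 | 679 | 373 | 266 |
| MILC | NRF | Apex Medium | yes | 1x | 12x | 19x | 34x | 57x | 11x | 18x | 33x | 46x |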

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

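| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 100.72 | 299 | 596 | 1,181 | 2,300 | 273 | 550 | 1,106 | 2,209 |
| NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 3x | 6x | 12x | 23x | 3x | 5x | 11x | 22x |
| NAMD [LaINDY ColVars] | ns/day | LaINDY ColVars | yes | 50.56 | 87 | 174 | 346 | 689 | 84 | 162 | 325 | 646 |
| NAMD [LaINDY ColVars] | NRF | LaINDY ColVars | yes | 1x | 2x | 3x | 7x | 14x | 2x | 3x | 6x | 13x |
| NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 108.79 | 381 | 757 | 1,494 | 2,935 | 353 | 706 | 1,412 | 2,737 |
| NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 4x | 7x | 14x | 27x | 3x | 6x | 13x | 25x |
| NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 10.53 | 24 | 49 | 97 | 196 | 23 | 46 | 92 | 184 |
| NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 2x | 5x | 9x | 19x | 2x | 4x | 9x | 18x |
| NAMD [COVID-19 Spike Assembly] | ns/day | COVID-19 Spike Assembly | yes | 0.75 | 3 | 6 | 11 | 18 | 3 | 5 | 8 | - |
| NAMD [COVID-19 Spike Assembly] | NRF | COVID-19 Spike Assembly | yes | 1x | 4x | 8x | 14x | 24x | 4x | 6x | 10x | - |
| NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 10.87 | 31 | 62 | 123 | 247 | 29 | 57 | 114 | 227 |
| NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 3x | 6x | 11x | 23x | 3x | 5x | 10x | 21x |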

NAMD is measured by running multiple independent instances using MPS, except for the NAMD [COVID-19 Spike Assembly] dataset.
Trifan A, Gorgun D, Salim M, et al. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications. 2022;36(5-6):603-623. doi:10.1177/10943420221113513
D. B. Sauer, N. Trebesch, J. J. Marden, N. Cocco, J. Song, A. Koide, S. Koide, E. Tajkhorshid, and D.-N. Wang. "Structural basis for the reaction cycle of DASS dicarboxylate transporters." eLife. 9, e61350 (2020). https://doi.org/10.7554/eLife.61350


RELION

Microscopy

Stand-alone computer program that employs an empirical Bayesian approach to refinement of (multiple) 3D reconstructions or 2D class averages in electron cryo-microscopy (cryo-EM)

VERSION

5.0.0

ACCELERATED FEATURES

  • Reduced memory requirements; high-resolution cryo-EM structure determination in a matter of days on a single workstation

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Relion [Plasmodium Ribosome] | Total Wall Clock (Sec) | MB numbers Plasmodium Ribosome on Relion-3.0 | no | 8,981 | 2,137 | 1,288 | 1,059 | 1,005 | 2,458 | 1,219 | 1,035 |
| Relion [Plasmodium Ribosome] | NRF | MB numbers Plasmodium Ribosome on Relion-3.0 | yes | 1x | 4x | 7x | 8x | 9x | 4x | 7x | 9x |

RTM

Geoscience

Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

VERSION

nvidia_2024_01

ACCELERATED FEATURES

  • Batch algorithm

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

http://www.tsunamidevelopment.com/assets/rtm.pdf

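| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTM [Isotropic Radius 4] | Mcell/s | Isotropic Radius 4 | yes | 21,047 | 157,252 | 313,545 | 625,242 | 1,250,439 | 153,662 | 292,630 | 589,562 | 1,214,770 |
| RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 7x | 15x | 30x | 59x | 7x | 14x | 28x | 58x |
| RTM [TTI Radius 8 1-pass] | Mcell/s | TTI Radius 8 1-pass | yes | 7,213 | 30,824 | 61,529 | 122,504 | 244,246 | 25,597 | 49,607 | 94,039 | 197,162 |
| RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 4x | 9x | 17x | 34x | 4x | 7x | 13x | 27x |
| RTM [TTI RX 2Pass mgpu] | Mcell/s | TTI RX 2Pass mgpu | yes | 7,213 | 26,711 | 53,090 | 105,394 | 210,086 | 23,978 | 46,001 | 92,576 | 186,835 |
| RTM [TTI RX 2Pass mgpu] | NRF | TTI RX 2Pass mgpu | yes | 1x | 4x | 7x | 15x | 29x | 3x | 6x | 13x | 26x |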

SPECFEM3D

Geoscience

Simulates seismic wave propagation

VERSION

4.1.1

ACCELERATED FEATURES

  • OpenCL and CUDA hardware accelerators, based on an automatic source-to-source transformation library

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://geodynamics.org/cig/software/specfem3d/

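| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x H100 SXM | 2x H100 SXM | 4x H100 SXM | 8x H100 SXM | 1x H100 NVL | 2x H100 NVL | 4x H100 NVL | 8x H100 NVL |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| SPECFEM3D | Total Time (Sec) | four_material_simple_model | no | 186 | 46 | 24 | 14 | 10 | 50 | 26 | 14 | 9 |
| SPECFEM3D | NRF | four_material_simple_model | yes | 1x | 4x | 9x | 16x | 22x | 4x | 6x | 15x | 24x |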


Detailed L40S application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

24-AT_24

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 11.71 | 179 | 356 | 728 | 1,582 |
| AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 15x | 30x | 62x | 135x |
| AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 11.69 | 183 | 372 | 739 | 1,580 |
| AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 16x | 32x | 63x | 135x |
| AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 93.36 | 977 | 2,004 | 4,017 | 8,935 |
| AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 10x | 21x | 43x | 96x |
| AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 99.50 | 1,020 | 2,060 | 4,166 | 9,026 |
| AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 10x | 21x | 42x | 91x |
| AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 377.04 | 4,150 | 8,389 | 17,112 | 35,769 |
| AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 11x | 22x | 45x | 95x |
| AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 397.04 | 4,240 | 8,706 | 17,762 | - |
| AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 11x | 22x | 45x | - |
| AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.69 | 74 | 148 | 296 | 592 |
| AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 20x | 40x | 80x | 160x |
| AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.07 | 194 | 388 | 776 | 1,552 |
| AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 8x | 15x | 31x | 62x |

AMBER is measured by running multiple independent instances using MPS
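
Because each instance runs independently, one plausible way to obtain a single per-server figure is to sum the throughput of all concurrent instances. The page does not state the aggregation rule, so the summation below is an assumption, and the per-instance numbers are hypothetical placeholders rather than measured data.

```python
def aggregate_throughput(per_instance_ns_per_day):
    """Sum ns/day over concurrently running, independent simulation instances.

    Assumption: the reported figure is the total across instances; the
    source page does not confirm this.
    """
    if not per_instance_ns_per_day:
        raise ValueError("at least one instance is required")
    return sum(per_instance_ns_per_day)


# Hypothetical example: four independent instances sharing one GPU through MPS.
instances = [10.0, 10.5, 9.5, 10.0]
total = aggregate_throughput(instances)
print(f"aggregate throughput: {total:.1f} ns/day")
```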


Chroma

Physics

Lattice Quantum Chromodynamics (LQCD)

VERSION

V2025.01

ACCELERATED FEATURES

  • Wilson-clover fermions, Krylov solvers, Domain-decomposition

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 2x L40S | 4x L40S | 8x L40S |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Chroma | Final Timestep Time (Sec) | HMC Medium | no | 10,037 | 367 | 343 | 152 |
| Chroma | NRF | HMC Medium | yes | 1x | 28x | 30x | 67x |

FUN3D

Engineering

Suite of tools actively developed at NASA for modeling fluid flow in aeronautics and space technology applications

VERSION

14.1

ACCELERATED FEATURES

  • Full range of Mach number regimes for the Reynolds-averaged Navier Stokes (RANS) formulation

SCALABILITY

Multi-GPU and Single-Node

MORE INFORMATION

https://fun3d.larc.nasa.gov

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 4x L40S | 8x L40S |
| --- | --- | --- | --- | --- | --- | --- |
| Fun3D [waverider-5M] | Loop Time (Sec) | waverider-5M | no | 154 | 41 | 23 |
| Fun3D [waverider-5M] | NRF | waverider-5M | yes | 1x | 6x | 10x |
| Fun3D [waverider-5M w/chemistry] | Loop Time (Sec) | waverider-5M w/chemistry | no | 474 | 105 | 57 |
| Fun3D [waverider-5M w/chemistry] | NRF | waverider-5M w/chemistry | yes | 1x | 7x | 12x |
| Fun3D [waverider-20M] | Loop Time (Sec) | waverider-20M | no | 628 | 165 | 89 |
| Fun3D [waverider-20M] | NRF | waverider-20M | yes | 1x | 5x | 9x |
| Fun3D [waverider-20M w/chemistry] | Loop Time (Sec) | waverider-20M w/chemistry | no | 2,011 | - | 237 |
| Fun3D [waverider-20M w/chemistry] | NRF | waverider-20M w/chemistry | yes | 1x | - | 12x |

GROMACS

Molecular Dynamics

Simulation of biochemical molecules with complicated bond interactions

VERSION

h-bond - 2025-rc

ACCELERATED FEATURES

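  • Implicit (5x), Explicit (2x) Solvent

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GROMACS [ADH Dodec] | ns/day | ADH Dodec | yes | 362 | 640 | 1,353 | 2,712 | 5,520 |
| GROMACS [ADH Dodec] | NRF | ADH Dodec | yes | 1x | 2x | 4x | 7x | 15x |
| GROMACS [STMV] | ns/day | STMV | yes | 20 | 44 | 73 | 113 | - |
| GROMACS [STMV] | NRF | STMV | yes | 1x | 2x | 4x | 6x | - |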

GROMACS [ADH Dodec] is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

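| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GTC | Mpush/Sec | mpi#proc.in | yes | 146 | 439 | 726 | 1,583 | 3,007 |
| GTC | NRF | mpi#proc.in | yes | 1x | 3x | 5x | 12x | 22x |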

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_cde2498

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

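| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MILC | Total Time (sec) | Apex Medium | no | 13,735 | 4,046 | 2,047 | 1,438 |
| MILC | NRF | Apex Medium | yes | 1x | 3x | 6x | 8x |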

NAMD

Molecular Dynamics

Designed for high-performance simulation of large molecular systems

VERSION

3

ACCELERATED FEATURES

  • Full electrostatics with PME and most simulation features

SCALABILITY

Up to 100M atom capable, multi-GPU, single node

MORE INFORMATION

http://www.ks.uiuc.edu/Research/namd/

https://ngc.nvidia.com/catalog/containers/hpc:namd

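| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| NAMD [apoa1_npt_cuda] | ns/day | apoa1_npt_cuda | yes | 100.72 | 230 | 457 | 900 | 1,816 |
| NAMD [apoa1_npt_cuda] | NRF | apoa1_npt_cuda | yes | 1x | 2x | 5x | 9x | 18x |
| NAMD [LaINDY ColVars] | ns/day | LaINDY ColVars | yes | 50.56 | 62 | 125 | 248 | 496 |
| NAMD [LaINDY ColVars] | NRF | LaINDY ColVars | yes | 1x | 1x | 2x | 5x | 10x |
| NAMD [apoa1_nve_cuda] | ns/day | apoa1_nve_cuda | yes | 108.79 | 300 | 597 | 1,200 | 2,354 |
| NAMD [apoa1_nve_cuda] | NRF | apoa1_nve_cuda | yes | 1x | 3x | 5x | 11x | 22x |
| NAMD [stmv_npt_cuda] | ns/day | stmv_npt_cuda | yes | 10.53 | 17 | 34 | 68 | 136 |
| NAMD [stmv_npt_cuda] | NRF | stmv_npt_cuda | yes | 1x | 2x | 3x | 6x | 13x |
| NAMD [stmv_nve_cuda] | ns/day | stmv_nve_cuda | yes | 10.87 | 23 | 46 | 92 | 183 |
| NAMD [stmv_nve_cuda] | NRF | stmv_nve_cuda | yes | 1x | 2x | 4x | 8x | 17x |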

NAMD is measured by running multiple independent instances using MPS, except for the NAMD [COVID-19 Spike Assembly] dataset.
Trifan A, Gorgun D, Salim M, et al. Intelligent resolution: Integrating Cryo-EM with AI-driven multi-resolution simulations to observe the severe acute respiratory syndrome coronavirus-2 replication-transcription machinery in action. The International Journal of High Performance Computing Applications. 2022;36(5-6):603-623. doi:10.1177/10943420221113513
D. B. Sauer, N. Trebesch, J. J. Marden, N. Cocco, J. Song, A. Koide, S. Koide, E. Tajkhorshid, and D.-N. Wang. "Structural basis for the reaction cycle of DASS dicarboxylate transporters." eLife. 9, e61350 (2020). https://doi.org/10.7554/eLife.61350


RTM

Geoscience

Reverse time migration (RTM) modeling is a critical component in the seismic processing workflow of oil and gas exploration

VERSION

nvidia_2024_01

ACCELERATED FEATURES

  • Batch algorithm

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

http://www.tsunamidevelopment.com/assets/rtm.pdf

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L40S | 2x L40S | 4x L40S | 8x L40S |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RTM [Isotropic Radius 4] | Mcell/s | Isotropic Radius 4 | yes | 21,047 | 42,366 | 84,432 | 168,028 | 336,068 |
| RTM [Isotropic Radius 4] | NRF | Isotropic Radius 4 | yes | 1x | 2x | 4x | 8x | 16x |
| RTM [TTI Radius 8 1-pass] | Mcell/s | TTI Radius 8 1-pass | yes | 7,213 | 14,644 | 28,937 | 57,176 | 114,205 |
| RTM [TTI Radius 8 1-pass] | NRF | TTI Radius 8 1-pass | yes | 1x | 2x | 4x | 8x | 16x |
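
The RTM rows above scale almost linearly with GPU count. The following is a minimal sketch of one way to quantify that, assuming scaling efficiency is defined as the measured speedup over the 1-GPU result divided by the GPU count. This definition and the function name `scaling_efficiency` are illustrative and not taken from the page.

```python
def scaling_efficiency(throughput_by_gpu_count):
    """Return {gpu_count: efficiency} relative to the 1-GPU throughput.

    Efficiency of 1.0 means perfectly linear scaling (assumed definition).
    """
    base = throughput_by_gpu_count[1]
    return {
        n: (value / base) / n
        for n, value in sorted(throughput_by_gpu_count.items())
    }


# RTM [Isotropic Radius 4] on L40S, Mcell/s, taken from the table above.
rtm_isotropic = {1: 42366, 2: 84432, 4: 168028, 8: 336068}

efficiencies = scaling_efficiency(rtm_isotropic)
for gpu_count, efficiency in efficiencies.items():
    print(f"{gpu_count}x L40S: {efficiency:.1%}")
```

For this dataset every efficiency stays above 99%, which is consistent with the NRF doubling each time the GPU count doubles.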


Detailed L4 application performance data is located below in alphabetical order.

AMBER

Molecular Dynamics

Suite of programs to simulate molecular dynamics on biomolecules

VERSION

24-AT_24

ACCELERATED FEATURES

  • PMEMD Explicit Solvent and GB Implicit Solvent

SCALABILITY

Multi-GPU and Single Node

MORE INFORMATION

http://ambermd.org/GPUSupport.php

| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9684X (CPU-Only) | 1x L4 | 2x L4 | 4x L4 | 8x L4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| AMBER [PME-Cellulose_NPT_4fs] | ns/day | DC-Cellulose_NPT | yes | 11.71 | 55 | 109 | 220 | 440 |
| AMBER [PME-Cellulose_NPT_4fs] | NRF | DC-Cellulose_NPT | yes | 1x | 5x | 9x | 19x | 38x |
| AMBER [PME-Cellulose_NVE_4fs] | ns/day | DC-Cellulose_NVE | yes | 11.69 | 56 | 111 | 220 | 442 |
| AMBER [PME-Cellulose_NVE_4fs] | NRF | DC-Cellulose_NVE | yes | 1x | 5x | 10x | 19x | 38x |
| AMBER [PME-FactorIX_NPT_4fs] | ns/day | DC-FactorIX_NPT | yes | 93.36 | 266 | 536 | 1,065 | 2,145 |
| AMBER [PME-FactorIX_NPT_4fs] | NRF | DC-FactorIX_NPT | yes | 1x | 3x | 6x | 11x | 23x |
| AMBER [PME-FactorIX_NVE_4fs] | ns/day | DC-FactorIX_NVE | yes | 99.50 | 272 | 544 | 1,093 | 2,231 |
| AMBER [PME-FactorIX_NVE_4fs] | NRF | DC-FactorIX_NVE | yes | 1x | 3x | 5x | 11x | 22x |
| AMBER [PME-JAC_NPT_4fs] | ns/day | DC-JAC_NPT | yes | 377.04 | 1,281 | 2,519 | 5,144 | 10,383 |
| AMBER [PME-JAC_NPT_4fs] | NRF | DC-JAC_NPT | yes | 1x | 3x | 7x | 14x | 28x |
| AMBER [PME-JAC_NVE_4fs] | ns/day | DC-JAC_NVE | yes | 397.04 | 1,280 | 2,567 | 5,176 | 10,395 |
| AMBER [PME-JAC_NVE_4fs] | NRF | DC-JAC_NVE | yes | 1x | 3x | 6x | 13x | 26x |
| AMBER [PME-STMV_NPT_4fs] | ns/day | DC-STMV_NPT | yes | 3.69 | 21 | 41 | 83 | 166 |
| AMBER [PME-STMV_NPT_4fs] | NRF | DC-STMV_NPT | yes | 1x | 6x | 11x | 22x | 45x |
| AMBER [FEP-GTI_Complex 1fs] | ns/day | FEP-GTI_Complex | yes | 25.07 | 113 | 226 | 451 | 902 |
| AMBER [FEP-GTI_Complex 1fs] | NRF | FEP-GTI_Complex | yes | 1x | 4x | 9x | 18x | 36x |

AMBER is measured by running multiple independent instances using MPS


GTC

Physics

GTC is used for Gyrokinetic Particle Simulation of Turbulent Transport in Burning Plasmas

VERSION

V4.5 updated

ACCELERATED FEATURES

  • Push, shift, and collision

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

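| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 4x L4 | 8x L4 |
| --- | --- | --- | --- | --- | --- | --- |
| GTC | Mpush/Sec | mpi#proc.in | yes | 136 | 657 | 1,244 |
| GTC | NRF | mpi#proc.in | yes | 1x | 5x | 10x |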

MILC

Physics

Lattice Quantum Chromodynamics (LQCD) codes simulate how elementary particles are formed and bound by the “strong force” to create larger particles like protons and neutrons

VERSION

develop_cde2498

ACCELERATED FEATURES

  • Staggered fermions, Krylov solvers, Gauge-link fattening

SCALABILITY

Multi-GPU and Multi-Node

MORE INFORMATION

https://ngc.nvidia.com/catalog/containers/hpc:milc

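| Application | Metric | Test Modules | Bigger is better | AMD Dual Genoa 9654 (CPU-Only) | 2x L4 | 4x L4 | 8x L4 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| MILC | Total Time (sec) | Apex Medium | no | 16,570 | 5,873 | 3,000 | 1,618 |
| MILC | NRF | Apex Medium | yes | 1x | 3x | 5x | 9x |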