Power efficiency refers to a compute resource’s ability to convert electrical power into useful work with minimal waste or loss. It’s typically measured in tasks per watt (or watts per task) and is increasingly important for coping with power-limited data centers and achieving sustainable computing.
The more useful work a computing environment can accomplish for a given rate of electricity, the better the power efficiency. Increasing the energy efficiency of the compute equipment—so it accomplishes more work per unit of energy consumed—also improves overall power efficiency.
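The tasks-per-watt idea above can be sketched as simple arithmetic: divide useful work by the energy consumed to do it. The function and the server figures below are hypothetical illustrations, not measured NVIDIA data.

```python
def power_efficiency(tasks_completed: int, energy_joules: float) -> float:
    """Useful work per unit of energy (tasks per joule).

    Higher values mean the system converts more of its electricity
    into completed work, i.e., better power efficiency.
    """
    if energy_joules <= 0:
        raise ValueError("energy must be positive")
    return tasks_completed / energy_joules

# Hypothetical comparison: a baseline server finishing 1,200 tasks on
# 400 Wh of energy vs. an accelerated server finishing 1,500 tasks on
# 300 Wh (1 Wh = 3,600 J).
baseline = power_efficiency(1200, 400 * 3600)
accelerated = power_efficiency(1500, 300 * 3600)
print(f"{accelerated / baseline:.2f}x more efficient")  # prints "1.67x more efficient"
```

Completing more tasks on less energy compounds in the ratio, which is why accelerator offload and faster completion times both show up as efficiency gains.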
Figure: Typical U.S. data center energy use breakdown in 2014, with 57 percent of power used for IT equipment and 43 percent used for cooling, power distribution, lighting, and other purposes.
Power efficiency can be improved by decreasing the power usage effectiveness (PUE) ratio, so more of the electricity going into the data center is used for computing and less of it is used for cooling or lost in the power distribution infrastructure. It can also be improved by making servers more energy-efficient with purpose-built accelerators such as GPUs and DPUs, which accomplish specific tasks more efficiently than general-purpose CPUs.
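PUE is defined as total facility power divided by the power delivered to IT equipment, so the 2014 breakdown above implies a typical PUE of roughly 1.75. A minimal sketch of that calculation, using a hypothetical 1,000 kW facility:

```python
def pue(total_facility_kw: float, it_equipment_kw: float) -> float:
    """Power usage effectiveness: total facility power / IT equipment power.

    A PUE of 1.0 would mean every watt entering the facility reaches the
    IT equipment; higher values indicate more overhead spent on cooling,
    power distribution losses, lighting, and other non-compute loads.
    """
    if it_equipment_kw <= 0:
        raise ValueError("IT equipment power must be positive")
    return total_facility_kw / it_equipment_kw

# With 57% of power going to IT equipment, a hypothetical 1,000 kW
# facility delivers 570 kW to servers and networking:
print(round(pue(1000.0, 570.0), 2))  # prints "1.75"
```

Driving PUE toward 1.0 (better cooling and distribution) and raising server energy efficiency are independent levers; both reduce the electricity consumed per task.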
Growing computing clusters demand more electrical power to run and cool equipment, electricity that generates additional greenhouse gas (GHG) emissions, increases costs, and often exceeds what's available in data centers.
Using accelerator technology to improve server efficiency and increase power distribution and cooling efficiencies can significantly reduce power consumption for data centers. This lowers operating costs, allows more compute power in data centers, and lowers GHG emissions.
Power efficiency gains come from making servers and networking more energy efficient and from improving the PUE of data centers.
Combining these solutions greatly decreases the amount of electricity consumed for each application or computing task, increasing power efficiency.
NVIDIA GPUs can process hundreds of threads in parallel and perform many math and graphics tasks much more efficiently than general-purpose CPUs. Shifting highly parallel and/or math- and graphics-intensive workloads to GPUs lets them run an order-of-magnitude faster, completing them more quickly and with less energy. In addition, NVIDIA AI frameworks improve energy efficiency even further when shifting workloads from CPU to GPU. The combination of NVIDIA GPUs and AI, high-performance computing (HPC), or visualization software delivers huge gains in power efficiency to data centers.
The NVIDIA BlueField DPU offloads, accelerates, and isolates infrastructure workloads from the CPU, improving performance and power efficiency. BlueField shifts networking, storage, security, and management tasks to purpose-built silicon, performing them more efficiently than general-purpose CPUs and freeing up CPU cores to run business and scientific applications.
The NVIDIA Grace CPU delivers superior power efficiency for AI and scientific computing tasks, and its LPDDR5X memory delivers up to 2X more bandwidth and 10X better energy efficiency than the previous generation of server memory. For traditional computing tasks, newer x86 CPUs from AMD and Intel are more energy efficient than older x86 CPUs.
Using more efficient interconnects between CPUs, GPUs, and memory significantly improves power efficiency within the server. NVIDIA NVLink and NVSwitch connect GPUs with up to 7X higher bandwidth and several times better energy efficiency than PCIe Gen 5. NVIDIA Quantum-2 InfiniBand with in-network computing connects AI and HPC clusters with the best possible performance and efficiency by performing compute tasks in the network and reducing the number of switches required. NVIDIA Spectrum™ switches deliver the most efficient 200G/400G/800G Ethernet networks for AI. NVIDIA LinkX® cables and transceivers with ConnectX® adapters and BlueField DPUs support direct drive to reduce power consumption on each transceiver.
The NVIDIA H100 Tensor Core GPU demonstrates almost 2X the energy efficiency of the previous NVIDIA A100 Tensor Core GPU.
NVIDIA DGX™ A100 systems deliver a nearly 5X improvement in energy efficiency for AI training applications compared to the previous generation of DGX.
As of November 2022, NVIDIA GPU and networking technologies power 23 of the top 30 supercomputing systems on the Green500 list, including the #1 Green500 system.
The NVIDIA Grace CPU delivers up to 2X better energy efficiency than x86 CPUs for selected applications.
NVIDIA BlueField DPUs can help servers consume up to 30 percent less power per unit of work.
When running the Redis in-memory caching service on VMware vSphere 8, offloading networking to a BlueField DPU can reduce power consumption per task by up to 34 percent.
NVIDIA GeForce RTX™ 40 series laptops, with the NVIDIA Ada Lovelace GPU architecture and fifth-generation Max-Q technology, are up to 3X more power efficient than the previous generation.
Here are some ways you can start improving power efficiency in your data center: