NVIDIA Vera Rubin Platform

Shaping the next generation of AI.

Overview

Driving the Era of Agentic AI

The NVIDIA Vera Rubin platform is built for the age of agentic AI and reasoning, engineered to master multi-step problem-solving and massive long-context workflows at scale. By eliminating critical bottlenecks in communication and memory movement, the platform supercharges inference to deliver more tokens per watt and lower cost per token versus the NVIDIA Blackwell architecture generation.

NVIDIA Kicks Off the Next Generation of AI With Rubin

Introducing the NVIDIA Vera Rubin platform. Seven new chips, one incredible AI supercomputer.

NVIDIA Vera Rubin Opens Agentic AI Frontier

The NVIDIA Vera Rubin platform includes seven new chips in full production to scale the world’s largest AI factories.

Look Inside the Technological Breakthroughs

Transformer Engine

The Rubin GPU features a new Transformer Engine (TE) with hardware-accelerated adaptive compression to boost NVFP4 performance while preserving accuracy. This enables up to 50 petaFLOPS of NVFP4 inference. The Transformer Engine is fully compatible with NVIDIA Blackwell, so code previously optimized for Blackwell carries over to the Vera Rubin platform.

Third-Generation Confidential Computing

The third generation of NVIDIA Confidential Computing expands security to full-rack scale with NVIDIA Vera Rubin NVL72. The platform creates a unified, trusted execution environment across all 36 NVIDIA Vera CPUs, 72 NVIDIA Rubin GPUs, and the NVIDIA NVLink™ fabric that connects them, maintaining data security across the CPU, GPU, and NVLink domains. With attestation services that provide cryptographic proof of compliance, it combines massive scale with uncompromised protection for the world’s largest proprietary models, training data, and inference workloads.

Sixth-Generation NVLink and NVLink Switch

Sixth-generation NVLink is a major leap for NVIDIA's high-speed GPU interconnect fabric, unifying 72 NVIDIA Rubin GPUs into a single performance domain. It doubles NVIDIA Blackwell’s performance, delivering 3.6 terabytes per second (TB/s) of bandwidth per GPU and 260 TB/s of connectivity across the domain, with low latency for faster communication. Combined with NVIDIA® Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™, which reduces network congestion by up to 50 percent for collective operations, this next-generation interconnect accelerates training and inference for the world’s largest models, at scale and without compromise.
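
As a quick sanity check, the quoted NVLink figures can be related to one another with simple arithmetic. The sketch below uses only the numbers stated in the paragraph above; the helper name is illustrative, and reading "doubling Blackwell" as a per-GPU bandwidth ratio is an assumption.

```python
# Back-of-the-envelope check of the NVLink figures quoted above.
# All constants come from the text; the helper name is illustrative.

GPUS_PER_NVLINK_DOMAIN = 72      # Rubin GPUs in one NVLink domain
NVLINK6_TB_S_PER_GPU = 3.6       # sixth-generation NVLink, per GPU
QUOTED_AGGREGATE_TB_S = 260      # aggregate figure quoted in the text


def aggregate_bandwidth_tb_s(gpu_count: int, per_gpu_tb_s: float) -> float:
    """Sum of per-GPU NVLink bandwidth across the whole domain."""
    return gpu_count * per_gpu_tb_s


aggregate = aggregate_bandwidth_tb_s(GPUS_PER_NVLINK_DOMAIN, NVLINK6_TB_S_PER_GPU)

# Assumption: "doubling Blackwell" refers to per-GPU NVLink bandwidth,
# so the previous generation's figure is half of the Rubin figure.
implied_blackwell_tb_s_per_gpu = NVLINK6_TB_S_PER_GPU / 2

print(f"Aggregate NVLink bandwidth: {aggregate:.1f} TB/s "
      f"(quoted as {QUOTED_AGGREGATE_TB_S} TB/s)")
print(f"Implied Blackwell per-GPU bandwidth: {implied_blackwell_tb_s_per_gpu:.1f} TB/s")
```

The product of 72 GPUs and 3.6 TB/s is 259.2 TB/s, which rounds to the quoted 260 TB/s.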

Second-Generation Reliability, Availability, and Serviceability (RAS) Engine

The NVIDIA Vera Rubin platform delivers rack-scale resiliency with advanced reliability features. NVIDIA Rubin GPUs feature a dedicated second-generation RAS engine for proactive maintenance and real-time health checks without downtime. NVIDIA Vera CPUs add enhanced serviceability with small-outline compression-attached memory modules (SOCAMM) LPDDR5X and in-system tests for the CPU cores. The rack introduces modular, cable-free tray designs for 18x faster assembly and servicing than NVIDIA Blackwell. Combined with intelligent resiliency and software-defined NVLink routing, these designs ensure continuous operation and reduce maintenance overhead.

NVIDIA Vera CPU

The NVIDIA Vera CPU is engineered for data movement and agentic reasoning across accelerated systems, with full confidential computing support. It pairs seamlessly with NVIDIA GPUs or operates independently for analytics, cloud, orchestration, storage, and high-performance computing (HPC) workloads. Vera combines 88 NVIDIA-designed cores, up to 1.2 TB/s of LPDDR5X memory bandwidth, and NVIDIA Scalable Coherency Fabric to deliver predictable, energy-efficient performance for data- and memory-intensive workloads with full Arm® compatibility. Integrated NVIDIA NVLink-C2C connectivity enables high-bandwidth, coherent CPU–GPU memory access to maximize system utilization and efficiency.

Explore NVIDIA Vera Rubin Products

NVIDIA Vera Rubin NVL72

NVIDIA Vera Rubin NVL72 unifies 72 NVIDIA Rubin GPUs, 36 NVIDIA Vera CPUs, NVIDIA ConnectX®-9 SuperNIC™ cards, and NVIDIA BlueField®-4 DPUs, and sits alongside NVIDIA LPX racks in a data center for fast, low-latency inference. It scales up intelligence in a rack-scale platform with the sixth-generation NVLink and NVLink Switch and scales out with NVIDIA Quantum-X800 InfiniBand and NVIDIA Spectrum-X™ Ethernet to power the AI industrial revolution at scale.
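
The rack composition lends itself to a rough worked example. The sketch below derives the GPU-to-CPU ratio from the counts above and a peak NVFP4 inference figure for the rack. It assumes the "up to 50 petaFLOPS" quoted in the Transformer Engine section is a per-GPU number that scales linearly across the rack, so the result is a theoretical upper bound, not a measured or published rack specification.

```python
# Rough rack-level arithmetic from figures quoted on this page.
# The linear-scaling step is an assumption, not a published spec.

GPU_COUNT = 72               # NVIDIA Rubin GPUs per NVL72 rack
CPU_COUNT = 36               # NVIDIA Vera CPUs per NVL72 rack
NVFP4_PFLOPS_PER_GPU = 50    # assumption: the "up to 50 petaFLOPS" is per GPU

# GPUs attached to each CPU, assuming an even split across the rack.
gpus_per_cpu = GPU_COUNT // CPU_COUNT

# Peak NVFP4 inference throughput if every GPU ran at its quoted maximum.
rack_peak_pflops = GPU_COUNT * NVFP4_PFLOPS_PER_GPU
rack_peak_eflops = rack_peak_pflops / 1000  # 1 exaFLOPS = 1000 petaFLOPS

print(f"GPUs per CPU: {gpus_per_cpu}")
print(f"Peak NVFP4 inference: {rack_peak_pflops} petaFLOPS "
      f"({rack_peak_eflops:.1f} exaFLOPS)")
```

Under these assumptions each Vera CPU pairs with two Rubin GPUs, and the rack peaks at 3,600 petaFLOPS of NVFP4 inference.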

NVIDIA Groq 3 LPX

NVIDIA Groq 3 LPX is the inference accelerator for NVIDIA Vera Rubin, designed to meet the low-latency and large-context demands of agentic systems. Vera Rubin and LPX unite the extreme performance of NVIDIA Rubin GPUs and LPUs through a co-designed architecture. LPX features 256 LPUs with 128 GB SRAM, 40 PB/s memory bandwidth, and 640 TB/s scale-up bandwidth per rack.
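
To put the LPX figures in perspective, the sketch below divides them evenly across the 256 LPUs. It assumes all three quoted numbers are rack totals and uses decimal units (1 GB = 1,000 MB, 1 PB = 1,000 TB). The per-LPU values are derived averages for illustration, not published per-chip specifications.

```python
# Per-LPU averages derived from the LPX figures quoted above.
# Assumption: every figure is a rack total shared evenly by all LPUs.

LPUS_PER_RACK = 256
SRAM_GB_PER_RACK = 128
MEMORY_BW_PB_S_PER_RACK = 40
SCALE_UP_BW_TB_S_PER_RACK = 640

# Decimal unit conversions: 1 GB = 1000 MB, 1 PB = 1000 TB.
sram_mb_per_lpu = SRAM_GB_PER_RACK * 1000 / LPUS_PER_RACK
memory_bw_tb_s_per_lpu = MEMORY_BW_PB_S_PER_RACK * 1000 / LPUS_PER_RACK
scale_up_bw_tb_s_per_lpu = SCALE_UP_BW_TB_S_PER_RACK / LPUS_PER_RACK

print(f"SRAM per LPU: {sram_mb_per_lpu:.0f} MB")
print(f"Memory bandwidth per LPU: {memory_bw_tb_s_per_lpu:.2f} TB/s")
print(f"Scale-up bandwidth per LPU: {scale_up_bw_tb_s_per_lpu:.1f} TB/s")
```

Under that reading, each LPU averages 500 MB of SRAM, 156.25 TB/s of memory bandwidth, and 2.5 TB/s of scale-up bandwidth.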

NVIDIA DGX Vera Rubin NVL72

NVIDIA DGX Vera Rubin NVL72 provides enterprises with a turnkey, ready-to-deploy AI infrastructure solution built upon the NVIDIA Vera Rubin platform. It’s purpose-built to be deployed at scale to accelerate the most complex AI models.

NVIDIA HGX Rubin NVL8

The NVIDIA HGX™ Rubin NVL8 integrates eight NVIDIA Rubin GPUs with sixth-generation high-speed NVLink interconnects to propel the data center into a new era of accelerated computing and generative AI. NVIDIA HGX Rubin NVL8 can be paired with either NVIDIA Vera CPUs or x86-based CPU baseboards.

NVIDIA DGX Rubin NVL8

NVIDIA DGX Rubin NVL8 is a liquid-cooled AI system powered by eight NVIDIA Rubin GPUs and sixth-generation NVLink. It’s purpose-built to accelerate training, inference, and post-training for every AI workload.

Inside the NVIDIA Vera Rubin Platform

Read this technical deep dive to learn how NVIDIA Vera Rubin treats the data center as the unit of compute, not the chip, establishing a new foundation for producing intelligence efficiently, securely, and predictably at scale.