Explore the groundbreaking advancements the NVIDIA Blackwell architecture brings to generative AI and accelerated computing. Building upon generations of NVIDIA technologies, Blackwell defines the next chapter in generative AI with unparalleled performance, efficiency, and scale.
Blackwell-architecture GPUs pack 208 billion transistors and are manufactured using a custom-built TSMC 4NP process. All Blackwell products feature two reticle-limited dies connected by a 10 terabytes per second (TB/s) chip-to-chip interconnect into a single, unified GPU.
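As a rough illustration of the "single, unified GPU" point: the dual-die package is presented to software as one device, so an ordinary device query reports a single GPU. The sketch below uses PyTorch's CUDA utilities purely as an example toolchain; the figures it prints depend on the specific product.

```python
# Minimal sketch: querying the GPU as software sees it.
# Assumes PyTorch with CUDA support and a CUDA-capable GPU in the system.
import torch

if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    # A dual-die Blackwell package is exposed to applications as one device,
    # so these fields describe the unified GPU, not the individual dies.
    print(f"Device name:        {props.name}")
    print(f"Total memory (GiB): {props.total_memory / 2**30:.1f}")
    print(f"SM count:           {props.multi_processor_count}")
    print(f"Compute capability: {props.major}.{props.minor}")
else:
    print("No CUDA device visible to this process.")
```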
The second-generation Transformer Engine uses custom Blackwell Tensor Core technology combined with NVIDIA® TensorRT™-LLM and NeMo™ Framework innovations to accelerate inference and training for large language models (LLMs) and Mixture-of-Experts (MoE) models.
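At the framework level, this capability is typically reached through NVIDIA's open-source Transformer Engine library. The following is a minimal sketch of running a layer under FP8 autocast, assuming the transformer_engine Python package, PyTorch, and an FP8-capable GPU; the layer sizes are illustrative only.

```python
# Minimal sketch: running a layer under FP8 autocast with the Transformer Engine library.
# Assumes the `transformer_engine` package, PyTorch, and an FP8-capable GPU.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# Illustrative sizes, not tied to any particular model.
hidden, batch = 4096, 16
layer = te.Linear(hidden, hidden, bias=True).cuda()
x = torch.randn(batch, hidden, device="cuda")

# DelayedScaling is one of the library's FP8 scaling recipes.
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.HYBRID)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)

print(y.shape)  # torch.Size([16, 4096])
```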
Blackwell includes NVIDIA Confidential Computing, which protects sensitive data and AI models from unauthorized access with strong hardware-based security. Blackwell is the industry's first TEE-I/O capable GPU, providing the most performant confidential computing solution with TEE-I/O capable hosts and inline protection over NVIDIA® NVLink™.
The fifth generation of NVIDIA NVLink interconnect can scale up to 576 GPUs to unleash accelerated performance for trillion- and multi-trillion-parameter AI models.
The NVIDIA NVLink Switch Chip enables 130 TB/s of GPU bandwidth in one 72-GPU NVLink domain (NVL72). It delivers 4X bandwidth efficiency with NVIDIA Scalable Hierarchical Aggregation and Reduction Protocol (SHARP)™ FP8 support.
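Applications generally reach this fabric through a communication library such as NCCL rather than programming NVLink directly; the library selects the transport and can use switch-offloaded (SHARP-style) reduction where the platform supports it, without changes to the script. A minimal all-reduce sketch with PyTorch's NCCL backend, assuming it is launched with torchrun on a multi-GPU node:

```python
# Minimal sketch: an all-reduce over the NCCL backend.
# NCCL chooses the transport (NVLink, NVSwitch, network); the script stays the same.
# Assumes launch via: torchrun --nproc-per-node=<num_gpus> this_script.py
import torch
import torch.distributed as dist

def main():
    dist.init_process_group(backend="nccl")
    rank = dist.get_rank()
    torch.cuda.set_device(rank % torch.cuda.device_count())

    # Each rank contributes a tensor; all_reduce sums them across all GPUs.
    t = torch.full((1024,), float(rank), device="cuda")
    dist.all_reduce(t, op=dist.ReduceOp.SUM)

    if rank == 0:
        expected = sum(range(dist.get_world_size()))
        print(f"all_reduce result per element: {t[0].item()} (expected {expected})")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```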
Blackwell’s Decompression Engine accesses massive amounts of memory in the NVIDIA Grace™ CPU over a high-speed link—900 gigabytes per second (GB/s) of bidirectional bandwidth—and accelerates the full pipeline of database queries. It delivers the highest performance in data analytics and data science with support for the latest compression formats, such as LZ4, Snappy, and Deflate.
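In practice this path is reached through GPU-accelerated data libraries rather than programmed directly. The sketch below uses RAPIDS cuDF to read a compressed Parquet file, where decompression and decoding happen on the GPU; the file name and column names are hypothetical placeholders.

```python
# Minimal sketch: reading compressed columnar data on the GPU with RAPIDS cuDF.
# The Parquet file and its columns are hypothetical placeholders.
# Assumes a RAPIDS installation (cudf) and an NVIDIA GPU.
import cudf

# Parquet pages are commonly Snappy-compressed; cuDF decompresses and decodes
# them on the GPU as part of the read.
df = cudf.read_parquet("events.parquet")

# A simple query over the decoded columns, executed on the GPU.
summary = df.groupby("user_id").agg({"value": "sum"})
print(summary.head())
```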
Blackwell adds intelligent resiliency with a dedicated Reliability, Availability, and Serviceability (RAS) Engine that identifies potential faults early to minimize downtime, saving time, energy, and computing costs.
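The RAS Engine operates below the application layer, but related health telemetry can be read through standard NVIDIA management interfaces. The sketch below uses the NVML Python bindings (pynvml) to read ECC error counters and temperature as examples of such signals, assuming an ECC-enabled GPU; it is not the RAS Engine's own interface.

```python
# Minimal sketch: reading GPU health telemetry through NVML (pynvml).
# This is ordinary management-library usage, not the RAS Engine's interface.
# Assumes the `pynvml` package and an ECC-enabled NVIDIA GPU.
import pynvml

pynvml.nvmlInit()
try:
    handle = pynvml.nvmlDeviceGetHandleByIndex(0)
    name = pynvml.nvmlDeviceGetName(handle)

    # Corrected (single-bit) ECC errors since the last driver reload.
    corrected = pynvml.nvmlDeviceGetTotalEccErrors(
        handle,
        pynvml.NVML_MEMORY_ERROR_TYPE_CORRECTED,
        pynvml.NVML_VOLATILE_ECC,
    )
    temp = pynvml.nvmlDeviceGetTemperature(handle, pynvml.NVML_TEMPERATURE_GPU)

    print(f"{name}: corrected ECC errors={corrected}, temperature={temp} C")
finally:
    pynvml.nvmlShutdown()
```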