At CES 2026, NVIDIA CEO Jensen Huang delivered a keynote with major implications for the entire AI industry. Instead of a consumer graphics card, as in previous years, he introduced a comprehensive computing platform called Vera Rubin, named after the astronomer whose galaxy-rotation measurements provided key evidence for dark matter. The platform marks a strategic shift for NVIDIA: from merely selling chips to building an entire supercomputing ecosystem.
Huang brought a 2.5-ton AI server onstage to showcase this vision. The change is significant: instead of selling individual components, NVIDIA now focuses on solving large-scale AI computing problems holistically.
Six Integrated Chip Designs: Vera Rubin Breaks Traditional Design Rules
According to Huang, Vera Rubin breaks an internal NVIDIA rule: each product generation typically changes only one or two chips. This time, the company redesigned six different chips, all now in mass production.
The reason is clear: Moore’s Law is slowing down, but AI computing demand is increasing tenfold every year. To keep pace, NVIDIA adopted an “extreme co-design” strategy—optimizing every level of the chips and the entire platform simultaneously.
These six chips include:
Vera CPU: An 88-core Olympus custom processor supporting 176 threads via NVIDIA’s multi-threading technology. System memory reaches 1.5 TB (3x Grace), with LPDDR5X bandwidth of 1.2 TB/s. Total transistor count: 227 billion.
Rubin GPU: NVFP4 inference power reaches 50 PFLOPS—5 times the previous Blackwell architecture. The chip contains 336 billion transistors (1.6x increase), with a third-generation Transformer engine capable of dynamic precision adjustment.
ConnectX-9 Network Card: Supports 800 Gb/s Ethernet with programmable RDMA. Contains 23 billion transistors and is certified for CNSA/FIPS security.
BlueField-4 DPU: A dedicated data processing engine for AI storage, integrating 64 Grace CPU cores with ConnectX-9 connectivity. Contains 126 billion transistors.
NVLink-6 Switch Chip: Connects 18 compute nodes, supporting up to 72 Rubin GPUs as a single unit. Each GPU gets 3.6 TB/s all-to-all communication bandwidth.
Spectrum-6 Optical Ethernet Switch: 512 channels of 200 Gbps each, built with TSMC’s COUPE silicon photonics technology. Contains 352 billion transistors.
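The NVFP4 format behind the Rubin GPU’s headline inference number is a block-scaled 4-bit floating-point format. The sketch below illustrates the general idea only; the block size and the scale encoding are simplifying assumptions for this toy example, not NVIDIA’s actual implementation.

```python
import numpy as np

# Magnitudes representable by a 4-bit E2M1 float (1 sign, 2 exponent, 1 mantissa bit).
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4_block(x, block_size=16):
    """Toy block-scaled FP4 quantizer: each block gets one scale that maps
    its largest magnitude onto the top FP4 value (6.0); every element then
    snaps to the nearest representable magnitude, keeping its sign."""
    x = np.asarray(x, dtype=np.float64)
    out = np.empty_like(x)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        scale = np.abs(block).max() / FP4_GRID[-1]
        if scale == 0.0:
            scale = 1.0  # all-zero block: any scale works
        scaled = np.abs(block) / scale
        nearest = FP4_GRID[np.abs(scaled[:, None] - FP4_GRID).argmin(axis=1)]
        out[start:start + block_size] = np.sign(block) * nearest * scale
    return out

weights = np.array([0.02, -0.75, 0.31, 1.00])
print(quantize_fp4_block(weights, block_size=4))
```

Sharing one scale per small block (rather than per tensor) is what keeps a 4-bit format usable for inference: outliers in one block no longer destroy the precision of every other block.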
Explosive Performance Gains, 90% Cost Reduction: Vera Rubin Changes the Economics of AI
Integrating these six chips, the Vera Rubin NVL72 system delivers impressive numbers. For inference tasks with NVFP4, compute power reaches 3.6 EFLOPS—5x that of Blackwell. For training, performance hits 2.5 EFLOPS, a 3.5x increase.
Memory capacity includes 54 TB of LPDDR5X (3x previous generation) and 20.7 TB of HBM, 1.5x more. HBM4 bandwidth hits 1.6 PB/s (2.8x increase), while Scale-Up bandwidth doubles to 260 TB/s.
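The per-GPU NVLink figure and the rack-level scale-up figure quoted above are consistent with each other, as a one-line check shows:

```python
gpus = 72                  # Rubin GPUs per NVL72 rack
per_gpu_tb_s = 3.6         # NVLink-6 all-to-all bandwidth per GPU (TB/s)
aggregate_tb_s = gpus * per_gpu_tb_s
print(f"{aggregate_tb_s:.1f} TB/s scale-up bandwidth")  # ~260 TB/s, as quoted
```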
Huang highlighted a particularly striking figure: training a model at the 100-trillion-parameter scale with Rubin requires only one-quarter of the hardware a Blackwell system would need, and the cost to generate a single token drops to one-tenth of Blackwell’s.
Although Rubin consumes twice the power of Grace Blackwell, its performance far exceeds the increase in power consumption. More importantly, throughput (tokens completed per watt-dollar) increases tenfold. For a gigawatt data center costing $50 billion, this effectively doubles revenue-generating capacity.
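The power-limited economics Huang describes can be made concrete with a toy calculation. All absolute numbers below are illustrative assumptions; only the 10x efficiency ratio comes from the keynote claims.

```python
# Toy model of a power-limited data center; absolute numbers are illustrative.
site_power_watts = 1e9            # a "gigawatt" facility

# Normalized efficiency: tokens generated per joule of energy (arbitrary units).
blackwell_tokens_per_joule = 1.0  # baseline
rubin_tokens_per_joule = 10.0     # keynote claim: 10x throughput per watt-dollar

def tokens_per_second(power_watts, tokens_per_joule):
    # A watt is a joule per second, so throughput scales linearly with site power.
    return power_watts * tokens_per_joule

gain = (tokens_per_second(site_power_watts, rubin_tokens_per_joule)
        / tokens_per_second(site_power_watts, blackwell_tokens_per_joule))
print(f"Same power envelope, {gain:.0f}x the tokens served")
```

Because the facility’s power budget, not the chip count, is the binding constraint, every gain in tokens per joule converts directly into serving capacity, which is why Huang frames efficiency in revenue terms.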
A major industry challenge—insufficient context memory for KV Cache (key-value cache)—is solved by deploying BlueField-4 to manage KV Cache separately. Each node has 4 BlueField-4s, each with an additional 150 TB of context memory, providing 16 TB per GPU—while maintaining 200 Gbps bandwidth, with no speed reduction.
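The standard transformer KV-cache footprint formula shows why 16 TB per GPU matters. The model configuration below is hypothetical, chosen only to illustrate the scale; it is not a real NVIDIA model.

```python
def kv_cache_bytes_per_token(layers, kv_heads, head_dim, bytes_per_value=2):
    """Keys plus values (factor of 2) for every layer and KV head per token,
    at the given precision (2 bytes = FP16/BF16)."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value

# Hypothetical large-model configuration, for illustration only.
per_token = kv_cache_bytes_per_token(layers=120, kv_heads=8, head_dim=128)
context_bytes = 16e12                       # 16 TB per GPU, as quoted above
tokens = context_bytes / per_token
print(f"{per_token} bytes/token -> roughly {tokens / 1e6:.0f} million tokens of context")
```

Even at a half-megabyte of cache per token, offloading the KV cache to dedicated BlueField-4 memory buys tens of millions of tokens of usable context per GPU.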
To support this architecture, the network must be large, fast, and reliable. Spectrum-X was developed to meet this need—an end-to-end Ethernet platform “dedicated to AI creation.” With 512 channels × 200 Gbps, it boosts throughput by 25%, saving $5 billion on a $50 billion data center. “This network system is almost free,” Huang said.
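The per-switch figure implied by 512 channels at 200 Gbps works out as follows:

```python
channels = 512
gbps_per_channel = 200
switch_tbps = channels * gbps_per_channel / 1000  # convert Gb/s to Tb/s
print(f"{switch_tbps:.1f} Tb/s aggregate per switch")
```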
From Language Models to the Physical World: Huang Expands NVIDIA’s Vision
If large language models address the “digital world,” NVIDIA’s next goal under Huang’s leadership is to conquer the “physical world.” He emphasized that teaching AI to understand real-world physics requires real-world data, which is extremely scarce; simulation is how NVIDIA intends to fill that gap.
Huang proposed a “three-computer core” architecture for Physical AI:
Training Computer: Built from familiar training GPUs (like Vera Rubin), responsible for training models.
Inference Computer: The “cerebellum” placed at the edge of robots or cars, responsible for real-time execution.
Simulation Computer: Comprising Omniverse and Cosmos, providing virtual training environments that help AI learn physical responses through simulation.
Cosmos can generate vast numbers of training environments for physical world AI, opening up entirely new possibilities.
Alpamayo and Artificial Reasoning: NVIDIA Conquers the Real World
Building on this three-computer architecture, Huang officially announced Alpamayo, which he described as the world’s first autonomous driving model with genuine reasoning capabilities.
Unlike traditional autonomous driving relying on rigid code, Alpamayo is an end-to-end trained system. Its breakthrough is solving the “long tail” problem—handling complex, previously unseen traffic scenarios. Alpamayo not only executes but also reasons like a human driver.
“It will tell you what it’s about to do and why,” Huang explained. In demos, the autonomous vehicle demonstrated the ability to break down extremely complex situations into basic common-sense knowledge for processing.
The Mercedes CLA equipped with Alpamayo technology is slated, on the current timeline, to launch in the US in Q1 2026, then expand to Europe and Asia. Huang called it the safest car in the world under NCAP ratings, thanks to NVIDIA’s “dual safety stack” design: when the AI model lacks confidence, the system falls back to a traditional rules-based safety mode.
Robots and Intelligent Infrastructure: NVIDIA’s Comprehensive Strategy
During the event, Huang also showcased NVIDIA’s comprehensive robotics strategy. All robots will be equipped with Jetson edge computers and trained in Isaac Sim on the Omniverse platform. NVIDIA is integrating this technology into industrial ecosystems with partners such as Synopsys, Cadence, and Siemens.
NVIDIA’s vision runs bottom-up: in the future, chip design, system design, and even factory simulation will be accelerated by NVIDIA’s Physical AI. Huang told Disney’s adorable robots onstage: “You will be designed in the computer, manufactured in the computer, and even tested and verified in the computer before facing gravity.”
Open-Source Acceleration Models: NVIDIA Embraces Industry-Wide AI Trends
A key part of Huang’s strategy is valuing the open-source community. He noted that DeepSeek R1 surprised the entire industry: as the first open-source reasoning model, it directly spurred a wave of development across the sector.
While open models may lag the top proprietary models by about six months, a new generation arrives on roughly that same cadence. This rapid iteration keeps startups, giants, and researchers alike racing not to fall behind.
NVIDIA is no exception. The company built the multi-billion-dollar DGX Cloud supercomputer and is developing advanced models such as La Proteina (protein structure generation) and OpenFold 3. Its open-source ecosystem spans biomedicine, physical AI, agent models, robotics, and autonomous driving.
NVIDIA’s Nemotron series, which includes speech, multimodal, and retrieval-augmented generation models, has achieved top rankings and is widely adopted by enterprises.
Huang’s Strategy: From “Shovel” to Ecosystem
Amidst debates over an AI bubble, Huang must demonstrate what AI can truly do to reinforce confidence. Besides unveiling Vera Rubin’s computational power to address the compute shortage, he heavily invests in applications and software, making AI’s real-world impact clear.
Previously, NVIDIA made chips for virtual worlds; now the company is squarely focused on Physical AI, with autonomous vehicles and humanoid robots carrying its platform into the physical realm. This is a crucial strategic shift: from selling tools (chips) to building the entire computing platform for the next generation of AI.
Ultimately, only when the battle moves into the physical world can the weapons continue to be sold.