Nvidia has kickstarted the next generation of AI with the launch of the Nvidia Rubin platform, comprising six new chips designed to work together as one AI supercomputer.

The Rubin platform uses extreme codesign across the six chips — the Nvidia Vera CPU, Nvidia Rubin GPU, Nvidia NVLink 6 Switch, Nvidia ConnectX-9 SuperNIC, Nvidia BlueField-4 DPU and Nvidia Spectrum-6 Ethernet Switch — to slash training time and inference token costs.

“Rubin arrives at exactly the right moment, as AI computing demand for both training and inference is going through the roof,” says Jensen Huang, founder and CEO of Nvidia. “With our annual cadence of delivering a new generation of AI supercomputers — and extreme codesign across six new chips — Rubin takes a giant leap toward the next frontier of AI.”

Named for Vera Florence Cooper Rubin — the American astronomer whose discoveries transformed humanity’s understanding of the universe — the Rubin platform features the Nvidia Vera Rubin NVL72 rack-scale solution and the Nvidia HGX Rubin NVL8 system.

The Rubin platform introduces five innovations, including the latest generations of Nvidia NVLink interconnect technology, Transformer Engine, Confidential Computing and RAS Engine, as well as the Nvidia Vera CPU. These breakthroughs will accelerate agentic AI, advanced reasoning and massive-scale mixture-of-experts (MoE) model inference at up to 10x lower cost per token than the Nvidia Blackwell platform. Compared with its predecessor, the Nvidia Rubin platform trains MoE models with 4x fewer GPUs to accelerate AI adoption.


Broad ecosystem support
Among the AI labs, cloud service providers, computer makers and startups expected to adopt Rubin are Amazon Web Services (AWS), Anthropic, Black Forest Labs, Cisco, Cohere, CoreWeave, Cursor, Dell Technologies, Google, Harvey, HPE, Lambda, Lenovo, Meta, Microsoft, Mistral AI, Nebius, Nscale, OpenAI, OpenEvidence, Oracle Cloud Infrastructure (OCI), Perplexity, Runway, Supermicro, Thinking Machines Lab and xAI.

Sam Altman, CEO of OpenAI, says: “Intelligence scales with compute. When we add more compute, models get more capable, solve harder problems and make a bigger impact for people. The Nvidia Rubin platform helps us keep scaling this progress so advanced intelligence benefits everyone.”

Dario Amodei, cofounder and CEO of Anthropic, comments: “The efficiency gains in the Nvidia Rubin platform represent the kind of infrastructure progress that enables longer memory, better reasoning and more reliable outputs. Our collaboration with Nvidia helps power our safety research and our frontier models.”

According to Mark Zuckerberg, founder and CEO of Meta: “Nvidia’s Rubin platform promises to deliver the step-change in performance and efficiency required to deploy the most advanced models to billions of people.”

Elon Musk, founder and CEO of xAI, adds: “Nvidia Rubin will be a rocket engine for AI. If you want to train and deploy frontier models at scale, this is the infrastructure you use — and Rubin will remind the world that Nvidia is the gold standard.”

Satya Nadella, executive chairman and CEO of Microsoft, says: “We are building the world’s most powerful AI superfactories to serve any workload, anywhere, with maximum performance and efficiency. With the addition of Nvidia Vera Rubin GPUs, we will empower developers and organizations to create, reason and scale in entirely new ways.”

Mike Intrator, cofounder and CEO of CoreWeave, comments: “We built CoreWeave to help pioneers accelerate their innovations with the unmatched performance of our purpose-built AI platform, matching the right technology to the right workloads as they evolve. The Nvidia Rubin platform represents an important advancement for reasoning, agentic and large-scale inference workloads, and we’re excited to add it to our platform.

“With CoreWeave Mission Control as the operating standard, we can integrate new capabilities quickly and run them reliably at production scale, working in close partnership with Nvidia.”

Matt Garman, CEO of AWS, says: “AWS and Nvidia have been driving cloud AI innovation together for more than 15 years. The Nvidia Rubin platform on AWS represents our continued commitment to delivering cutting-edge AI infrastructure that gives customers unmatched choice and flexibility.

“By combining Nvidia’s advanced AI technology with AWS’s proven scale, security and comprehensive AI services, customers can build, train and deploy their most demanding AI applications faster and more cost effectively — accelerating their path from experimentation to production at any scale.”

Sundar Pichai, CEO of Google and Alphabet, says: “We are proud of our deep and long-standing relationship with Nvidia. To meet the substantial customer demand we see for Nvidia GPUs, we are focused on providing the best possible environment for their hardware on Google Cloud. Our collaboration will continue as we bring the impressive capabilities of the Rubin platform to our customers, offering them the scale and performance needed to advance the boundaries of AI.”

According to Clay Magouyrk, CEO of Oracle: “Oracle Cloud Infrastructure is a hyperscale cloud built for the highest performance, and together with Nvidia, we’re pushing the boundaries of what customers can build and scale with AI. With gigascale AI factories powered by the Nvidia Vera Rubin architecture, OCI is giving customers the infrastructure foundation they need to push the limits of model training, inference and real-world AI impact.”

Michael Dell, chairman and CEO of Dell Technologies, adds: “The Nvidia Rubin platform represents a major leap forward in AI infrastructure. By integrating Rubin into the Dell AI Factory with Nvidia, we’re building infrastructure that can handle massive token volumes and multistep reasoning while delivering the performance and resiliency that enterprises and neoclouds need to deploy AI at scale.”

Antonio Neri, president and CEO of HPE, says: “AI is reshaping not just workloads but the very foundations of IT, requiring us to reimagine every layer of infrastructure from the network to the compute. With the Nvidia Vera Rubin platform, HPE is building the next generation of secure, AI-native infrastructure, turning data into intelligence and enabling enterprises to become true AI factories.”

Yuanqing Yang, chairman and CEO of Lenovo, adds: “Lenovo is embracing the next-generation Nvidia Rubin platform, leveraging our Neptune liquid-cooling solution as well as our global scale, manufacturing efficiency and service reach, to help enterprises build AI factories that serve as intelligent, accelerated engines for insight and innovation. Together, we’re architecting an AI-driven future where efficient, secure AI becomes the standard for every organization.”


Engineered to scale intelligence
Agentic AI and reasoning models, along with state-of-the-art video generation workloads, are redefining the limits of computation. Multistep problem-solving requires models to process, reason and act across long sequences of tokens. Designed to serve the demands of complex AI workloads, the Rubin platform’s five groundbreaking technologies include:

  • Sixth-Generation Nvidia NVLink: Delivers the fast, seamless GPU-to-GPU communication required for today’s massive MoE models. Each GPU offers 3.6TB/s of bandwidth, while the Vera Rubin NVL72 rack provides 260TB/s — more bandwidth than the entire internet. With built-in, in-network compute to speed collective operations, as well as new features for enhanced serviceability and resiliency, the Nvidia NVLink 6 Switch enables faster, more efficient AI training and inference at scale.
  • Nvidia Vera CPU: Designed for agentic reasoning, Nvidia Vera is the most power‑efficient CPU for large-scale AI factories. The CPU is built with 88 custom Nvidia Olympus cores, full Armv9.2 compatibility and ultrafast NVLink-C2C connectivity. Vera delivers exceptional performance, bandwidth and industry‑leading efficiency to support a full range of modern data center workloads.
  • Nvidia Rubin GPU: Featuring a third-generation Transformer Engine with hardware-accelerated adaptive compression, Rubin GPU delivers 50 petaflops of NVFP4 compute for AI inference.
  • Third-Generation Nvidia Confidential Computing: Vera Rubin NVL72 is the first rack-scale platform to deliver Nvidia Confidential Computing — which maintains data security across CPU, GPU and NVLink domains — protecting the world’s largest proprietary models, training and inference workloads.
  • Second-Generation RAS Engine: The Rubin platform — spanning GPU, CPU and NVLink — features real-time health checks, fault tolerance and proactive maintenance to maximize system productivity. The rack’s modular, cable-free tray design enables up to 18x faster assembly and servicing than Blackwell.
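The per-GPU and per-rack NVLink figures above can be sanity-checked with simple arithmetic. A minimal sketch, assuming the quoted 3.6TB/s is each GPU's aggregate NVLink bandwidth and that the rack figure is simply the sum across all 72 GPUs:

```python
# Back-of-the-envelope check of the Vera Rubin NVL72 NVLink figures.
# Assumption: 3.6 TB/s is each GPU's aggregate NVLink 6 bandwidth.
GPUS_PER_RACK = 72     # Rubin GPUs in one Vera Rubin NVL72 rack
PER_GPU_TBPS = 3.6     # NVLink 6 bandwidth per GPU, in TB/s

rack_bandwidth = GPUS_PER_RACK * PER_GPU_TBPS
print(f"Aggregate rack bandwidth: {rack_bandwidth:.1f} TB/s")
```

This yields 259.2TB/s, consistent with the rounded 260TB/s figure quoted for the rack.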


AI-native storage, SDI
Nvidia Rubin introduces Nvidia Inference Context Memory Storage Platform, a new class of AI-native storage infrastructure designed to scale inference context at gigascale.

Powered by Nvidia BlueField-4, the platform enables efficient sharing and reuse of key-value cache data across AI infrastructure, improving responsiveness and throughput while enabling predictable, power-efficient scaling of agentic AI.
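The key-value cache reuse idea can be sketched in miniature. The toy store below is a hypothetical illustration, not Nvidia's API: real inference stacks cache per-layer attention tensors, but the principle is the same — keying cached context by a hash of the prompt prefix lets a second session with the same prefix skip recomputation.

```python
import hashlib

# Hypothetical sketch of prefix-keyed KV-cache reuse (illustrative only;
# a real system would store attention key/value tensors, not strings).
class ContextMemoryStore:
    def __init__(self):
        self._store = {}   # prefix hash -> cached "KV" payload
        self.hits = 0
        self.misses = 0

    def _key(self, prompt_prefix: str) -> str:
        return hashlib.sha256(prompt_prefix.encode()).hexdigest()

    def get_or_compute(self, prompt_prefix: str):
        key = self._key(prompt_prefix)
        if key in self._store:
            self.hits += 1      # reuse: prefill for this prefix is skipped
        else:
            self.misses += 1    # first sight: compute and cache
            self._store[key] = f"kv-cache[{len(prompt_prefix)} chars]"
        return self._store[key]

store = ContextMemoryStore()
store.get_or_compute("You are a helpful assistant.")  # first session: miss
store.get_or_compute("You are a helpful assistant.")  # second session: hit
print(store.hits, store.misses)
```

The payoff grows with scale: the more users and agent sessions share common prefixes, the more prefill work the shared cache absorbs.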

As AI factories increasingly adopt bare-metal and multi-tenant deployment models, maintaining strong infrastructure control and isolation becomes essential.

BlueField-4 also introduces Advanced Secure Trusted Resource Architecture, or ASTRA, a system-level trust architecture that gives AI infrastructure builders a single, trusted control point to securely provision, isolate and operate large-scale AI environments without compromising performance.

With AI applications evolving toward multi-turn agentic reasoning, AI-native organizations must manage and share far larger volumes of inference context across users, sessions and services.


Different forms for different workloads
Nvidia Vera Rubin NVL72 offers a unified, secure system that combines 72 Nvidia Rubin GPUs, 36 Nvidia Vera CPUs, Nvidia NVLink 6, Nvidia ConnectX-9 SuperNICs and Nvidia BlueField-4 DPUs.

Nvidia will also offer the Nvidia HGX Rubin NVL8 platform, a server board that links eight Rubin GPUs through NVLink to support x86-based generative AI platforms. The HGX Rubin NVL8 platform accelerates training, inference and scientific computing for AI and high-performance computing workloads.

Nvidia DGX SuperPOD serves as a reference for deploying Rubin-based systems at scale, integrating either Nvidia DGX Vera Rubin NVL72 or DGX Rubin NVL8 systems with Nvidia BlueField-4 DPUs, Nvidia ConnectX-9 SuperNICs, Nvidia InfiniBand networking and Nvidia Mission Control software.


Next-generation Ethernet networking
Advanced Ethernet networking and storage are critical components of AI infrastructure, keeping data centers running at full speed, improving performance and efficiency, and lowering costs.

Nvidia Spectrum-6 Ethernet is the next generation of Ethernet for AI networking, built to scale Rubin-based AI factories with higher efficiency and greater resilience, and enabled by 200G SerDes communication circuitry, co-packaged optics and AI-optimized fabrics.

Built on the Spectrum-6 architecture, Spectrum-X Ethernet Photonics co-packaged optical switch systems deliver 10x greater reliability and 5x longer uptime for AI applications while achieving 5x better power efficiency, maximizing performance per watt compared with traditional methods.

Spectrum-XGS Ethernet technology, part of the Spectrum-X Ethernet platform, enables facilities separated by hundreds of kilometers or more to function as a single AI environment.

Together, these innovations define the next generation of the Nvidia Spectrum-X Ethernet platform, engineered with extreme codesign for Rubin to enable massive-scale AI factories and pave the way for future million-GPU environments.


Rubin readiness
Nvidia Rubin is in full production, and Rubin-based products will be available from partners in the second half of 2026.

Among the first cloud providers to deploy Vera Rubin-based instances in 2026 will be AWS, Google Cloud, Microsoft and OCI, as well as Nvidia Cloud Partners CoreWeave, Lambda, Nebius and Nscale.