Nvidia and Google Cloud are expanding Google Cloud AI Hypercomputer for the AI factories that they believe will power the next frontier of agentic and physical AI.

These include the new Nvidia Vera Rubin-powered A5X bare-metal instances; a preview of Google Gemini on Google Distributed Cloud running on Nvidia Blackwell and Nvidia Blackwell Ultra GPUs; confidential VMs with Nvidia Blackwell GPUs; and agentic AI on Gemini Enterprise Agent Platform with Nvidia Nemotron open models and the Nvidia NeMo framework.

Google has announced A5X powered by Nvidia Vera Rubin NVL72 rack-scale systems, which — through extreme codesign across chips, systems and software — deliver up to 10x lower inference cost per token and 10x higher token throughput per megawatt than the prior generation.

A5X will use Nvidia ConnectX-9 SuperNICs, combined with next-generation Google Virgo networking, scaling to up to 80,000 Nvidia Rubin GPUs within a single-site cluster and up to 960,000 Nvidia Rubin GPUs in a multisite cluster, enabling customers to run their largest AI workloads on Nvidia-optimised infrastructure.

“At Google Cloud, we believe the next decade of AI will be shaped by customers’ ability to run their most demanding workloads on a truly integrated, AI‑optimized infrastructure stack,” says Mark Lohmeyer, vice-president and GM of AI and computing infrastructure at Google Cloud.

“By combining Google Cloud’s scalable infrastructure and managed AI services with Nvidia’s industry‑leading platforms, systems and software, we’re giving customers flexibility to train, tune and serve everything from frontier and open models to agentic and physical AI workloads — while optimizing for performance, cost and sustainability.”

Google Cloud’s broad Nvidia Blackwell portfolio ranges from A4 VMs with Nvidia HGX B200 systems to rack-scale A4X VMs with Nvidia GB200 NVL72 systems and A4X Max VMs with Nvidia GB300 NVL72 systems, all the way to fractional G4 VMs with Nvidia RTX PRO 6000 Blackwell Server Edition GPUs.

Customers can right-size their acceleration capabilities, whether using multiple interconnected NVL72 racks that scale out to tens of thousands of Nvidia Blackwell GPUs, a single rack that can scale up to 72 Blackwell GPUs with fifth-generation Nvidia NVLink and NVLink 5 Switch, or just one-eighth of a GPU.

The platform helps teams optimise every workload, from mixture-of-experts reasoning, multimodal inference and data processing to complex simulations for the next frontier of physical AI and robotics.

Google Gemini models running on Nvidia Blackwell and Blackwell Ultra GPUs are now in preview on Google Distributed Cloud, so customers can bring Google’s frontier models wherever their most sensitive data resides.

Nvidia Confidential Computing with the Nvidia Blackwell platform enables Gemini models to run in a protected environment where prompts and fine-tuning data stay encrypted and can’t be seen or altered by unauthorised parties, including the infrastructure operators.

In the public cloud, the preview of Confidential G4 VMs with Nvidia RTX PRO 6000 Blackwell GPUs brings these protections to multi‑tenant environments — helping safeguard prompts, AI models and data so customers in regulated industries can access the power of AI without compromising on security or performance.

The Nvidia platform on Google Cloud is optimised to run every kind of model — from Google’s frontier Gemini and Gemma families to Nvidia Nemotron open models and the broader open-weight ecosystem — equipping developers to build agentic AI systems that reason, plan and act.

Nvidia Nemotron 3 Super is available on Gemini Enterprise Agent Platform, giving developers a direct path to discovering, customising and deploying Nvidia-optimised reasoning and multimodal models for agentic workflows.

Google Cloud and Nvidia are also making it easier to train and customise open models at scale. Managed Training Clusters on Gemini Enterprise Agent Platform introduce a new managed reinforcement learning (RL) API, built with Nvidia NeMo RL, that accelerates RL training at scale while automating cluster sizing, failure recovery and job execution, so teams can focus on agent behaviour and model quality instead of infrastructure management.

Nvidia AI infrastructure, open models and physical AI libraries available on Google Cloud are mainstreaming industrial and physical AI applications, enabling customers to simulate, optimise and automate real-world workflows.

Solutions from leading industrial software providers, including Cadence and Siemens Digital Industries Software, are now available on Google Cloud, accelerated on Nvidia AI infrastructure.

With Nvidia Omniverse libraries and the open-source Nvidia Isaac Sim robotics simulation framework available on Google Cloud Marketplace, developers can build physically accurate digital twins and develop custom robotics simulation pipelines to train, simulate and validate robots before real-world deployment.

Nvidia NIM microservices for models like Nvidia Cosmos Reason 2 can be deployed to Google Vertex AI and Google Kubernetes Engine. This enables robots and vision AI agents to see, reason and act in the physical world like humans, powering use cases such as automated data curation and annotation, advanced robot planning and reasoning, and intelligent video analytics agents for real-time insights and decision-making.
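As a rough sketch of what a GKE deployment of a NIM microservice can look like, the manifest below runs a NIM container on a GPU node behind a cluster-local service. The resource names, replica count and GPU request are illustrative assumptions (the actual container image path and model-specific settings come from Nvidia's NIM catalogue, and are left as placeholders here rather than guessed).

```shell
# Hedged sketch: deploying an Nvidia NIM container on Google Kubernetes Engine.
# Names, labels and the GPU count are assumptions for illustration; the image
# path is a placeholder — consult Nvidia's NIM documentation for real values.

# NGC credentials the pod uses to pull the model at startup.
kubectl create secret generic ngc-api-key \
  --from-literal=NGC_API_KEY="${NGC_API_KEY}"

cat <<'EOF' | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nim-cosmos-reason            # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels: { app: nim-cosmos-reason }
  template:
    metadata:
      labels: { app: nim-cosmos-reason }
    spec:
      containers:
      - name: nim
        image: nvcr.io/nim/...       # placeholder: actual NIM image path
        ports:
        - containerPort: 8000        # NIM serves an OpenAI-compatible API here
        envFrom:
        - secretRef: { name: ngc-api-key }
        resources:
          limits:
            nvidia.com/gpu: 1        # schedule onto a GPU node pool
---
apiVersion: v1
kind: Service
metadata:
  name: nim-cosmos-reason
spec:
  selector: { app: nim-cosmos-reason }
  ports:
  - port: 8000
    targetPort: 8000
EOF
```

Once the pod is running, in-cluster clients would reach the model at the service's port 8000; exposing it externally (or fronting it with Vertex AI) is a separate step beyond this sketch.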