As generative AI and large language models (LLMs) continue to drive innovations, compute requirements for training and inference have grown at an astonishing pace.
To meet that need, Google Cloud today announced the general availability of its new A3 instances, powered by Nvidia H100 Tensor Core GPUs. These GPUs bring unprecedented performance to all kinds of AI applications with their Transformer Engine — purpose-built to accelerate LLMs.
Availability of the A3 instances comes on the heels of Nvidia being named Google Cloud’s Generative AI Partner of the Year — an award that recognizes the companies’ deep and ongoing collaboration to accelerate generative AI on Google Cloud.
The joint effort takes multiple forms, from infrastructure design to extensive software enablement, to make it easier to build and deploy AI applications on the Google Cloud platform.
At the Google Cloud Next conference, Nvidia founder and CEO Jensen Huang joined Google Cloud CEO Thomas Kurian for the event keynote to celebrate the general availability of Nvidia H100 GPU-powered A3 instances. The two also discussed how Google uses Nvidia H100 and A100 GPUs for internal research and inference at DeepMind and in other divisions.
During the discussion, Huang pointed to the deeper levels of collaboration that enabled Nvidia GPU acceleration for the PaxML framework for creating massive LLMs. This JAX-based machine learning framework is purpose-built to train large-scale models, allowing advanced and fully configurable experimentation and parallelization.
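The data-parallel pattern that frameworks like PaxML build on can be illustrated with plain JAX. The sketch below is not PaxML itself; it uses `jax.pmap` to replicate a single hypothetical SGD step across all available devices, with the model, learning rate, and batch shapes chosen purely for illustration.

```python
# Minimal JAX data-parallel sketch (illustrative, not PaxML):
# jax.pmap replicates a training step across available devices.
import jax
import jax.numpy as jnp

def loss_fn(w, x, y):
    # Simple linear model with mean-squared-error loss.
    pred = x @ w
    return jnp.mean((pred - y) ** 2)

@jax.pmap
def train_step(w, x, y):
    # One SGD step, run in parallel on each device's shard of the batch.
    grads = jax.grad(loss_fn)(w, x, y)
    return w - 0.1 * grads

n_dev = jax.device_count()
w = jnp.zeros((4, 1))
w_repl = jnp.broadcast_to(w, (n_dev, *w.shape))  # replicate params per device
x = jnp.ones((n_dev, 8, 4))                      # one batch shard per device
y = jnp.ones((n_dev, 8, 1))
w_repl = train_step(w_repl, x, y)
```

On a CPU-only machine this runs with a single device; on an A3 instance the same code would fan the batch out across the attached H100 GPUs. PaxML layers configuration, checkpointing, and more sophisticated sharding on top of primitives like these.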
Google has used PaxML to build internal models, including at DeepMind, as well as for research projects, and the framework will now use Nvidia GPUs. The companies also announced that PaxML is available immediately on the Nvidia NGC container registry.
There are currently over 1,000 generative AI startups building next-generation applications, many using Nvidia technology on Google Cloud.
Google Cloud was the first cloud service provider to bring the Nvidia L4 GPU to the cloud. In addition, the companies have collaborated to enable Google's Dataproc service to leverage the RAPIDS Accelerator for Apache Spark, providing significant performance boosts for extract, transform and load (ETL) workloads.
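In practice, the RAPIDS Accelerator is enabled through Spark configuration rather than code changes. The sketch below is a hedged illustration of the general pattern; the jar name, job script, and resource amounts are placeholders, not exact values for a given Dataproc image or accelerator release.

```shell
# Illustrative only: enabling the RAPIDS Accelerator for Apache Spark
# on a GPU cluster. Jar path, job script, and values are placeholders.
spark-submit \
  --jars rapids-4-spark.jar \
  --conf spark.plugins=com.nvidia.spark.SQLPlugin \
  --conf spark.rapids.sql.enabled=true \
  --conf spark.executor.resource.gpu.amount=1 \
  etl_job.py
```

Because the plugin hooks into Spark's SQL engine, existing DataFrame and SQL ETL jobs can pick up GPU acceleration without being rewritten.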
The companies have also made Nvidia AI Enterprise available on Google Cloud Marketplace and integrated Nvidia acceleration software into the Vertex AI development environment.