Nvidia continues to collaborate with Microsoft to build AI infrastructure, with Microsoft introducing additional H100-based virtual machines to Microsoft Azure to accelerate demanding AI workloads.
At its Ignite conference in Seattle yesterday, Microsoft announced its new NC H100 v5 VM series for Azure, the industry’s first cloud instances featuring Nvidia H100 NVL GPUs.
The offering pairs two PCIe-based H100 GPUs connected via Nvidia NVLink, delivering nearly 4 petaflops of AI compute and 188GB of fast HBM3 memory. The Nvidia H100 NVL GPU can deliver up to 12x higher performance on GPT-3 175B than the previous generation and is ideal for inference and mainstream training workloads.
In addition, Microsoft announced plans to add the Nvidia H200 Tensor Core GPU to its Azure fleet next year to support inference of larger models with no increase in latency. This new offering is purpose-built to accelerate the largest AI workloads, including large language models (LLMs) and generative AI (GenAI) models.
The H200 GPU brings dramatic increases in both memory capacity and bandwidth using the latest-generation HBM3e memory. Compared with the H100, the new GPU will offer 141GB of HBM3e memory (1.8x more) and 4.8 TB/s of peak memory bandwidth (a 1.4x increase).
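Those multipliers can be sanity-checked against a baseline. Assuming the commonly cited H100 SXM figures of 80GB HBM3 and roughly 3.35 TB/s peak bandwidth (numbers not stated in this article), a quick back-of-the-envelope check in Python:

```python
# Sanity-check the quoted H200-vs-H100 ratios.
# Baseline H100 SXM figures (80 GB HBM3, ~3.35 TB/s) are an assumption,
# not stated in the article above.
h100_mem_gb, h100_bw_tbs = 80, 3.35   # assumed H100 baseline
h200_mem_gb, h200_bw_tbs = 141, 4.8   # figures quoted in the article

mem_ratio = h200_mem_gb / h100_mem_gb  # ~1.76, rounds to the quoted ~1.8x
bw_ratio = h200_bw_tbs / h100_bw_tbs   # ~1.43, rounds to the quoted ~1.4x
print(f"memory: {mem_ratio:.2f}x, bandwidth: {bw_ratio:.2f}x")
```

Both ratios line up with the announced figures once rounded to one decimal place.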
Cloud computing gets confidential
Further expanding availability of Nvidia-accelerated generative AI computing for Azure customers, Microsoft announced another Nvidia-powered instance: the NCC H100 v5.
These Azure confidential VMs with Nvidia H100 Tensor Core GPUs let customers protect the confidentiality and integrity of their data and applications while in use, in memory, without giving up the unsurpassed acceleration of H100 GPUs.