In their debut on the MLPerf industry-standard AI benchmarks, Nvidia H100 Tensor Core GPUs set world records in inference on all workloads, delivering up to 4.5x more performance than previous-generation GPUs.
Additionally, Nvidia A100 Tensor Core GPUs and the Nvidia Jetson AGX Orin module for AI-powered robotics continued to deliver overall leadership inference performance across all MLPerf tests: image and speech recognition, natural language processing and recommender systems.
The H100, also known as Hopper, raised the bar in per-accelerator performance across all six neural networks in the round. It demonstrated leadership in both throughput and speed in separate server and offline scenarios.
The Nvidia Hopper architecture delivered up to 4.5x more performance than Nvidia Ampere architecture GPUs, which continue to provide overall leadership in MLPerf results.
Thanks in part to its Transformer Engine, Hopper excelled on the popular BERT model for natural language processing. It’s among the largest and most performance-hungry of the MLPerf AI models.
These inference benchmarks mark the first public demonstration of H100 GPUs, which will be available later this year. The H100 GPUs will participate in future MLPerf rounds for training.
Nvidia A100 GPUs continued to show overall leadership in mainstream performance on AI inference in the latest tests. A100 GPUs won more tests than any submission in data centre and edge computing categories and scenarios. In June, the A100 also delivered overall leadership in MLPerf training benchmarks, demonstrating its abilities across the AI workflow.
Since their July 2020 debut on MLPerf, A100 GPUs have advanced their performance by 6x.
In edge computing, Nvidia Orin ran every MLPerf benchmark, winning more tests than any other low-power system-on-a-chip. And it showed up to a 50% gain in energy efficiency compared to its debut on MLPerf in April.
In the previous round, Orin ran up to 5x faster than the prior-generation Jetson AGX Xavier module, while delivering an average of 2x better energy efficiency.
Orin integrates into a single chip an Nvidia Ampere architecture GPU and a cluster of powerful Arm CPU cores. It’s available in the Nvidia Jetson AGX Orin developer kit and production modules for robotics and autonomous systems, and supports the full Nvidia AI software stack, including platforms for autonomous vehicles (Nvidia Hyperion), medical devices (Clara Holoscan) and robotics (Isaac).