Intel, in collaboration with Argonne National Laboratory and Hewlett Packard Enterprise (HPE), has announced that the Aurora supercomputer has broken the exascale barrier at 1.012 exaflops and is the fastest AI system in the world dedicated to open science, achieving 10.6 AI exaflops.

“The Aurora supercomputer surpassing exascale will allow it to pave the road to tomorrow’s discoveries,” says Ogi Brkic, Intel vice-president and general manager of Data Center AI Solutions. “From understanding climate patterns to unravelling the mysteries of the universe, supercomputers serve as a compass guiding us toward solving truly difficult scientific challenges that may improve humanity.”

Designed as an AI-centric system from its inception, Aurora will allow researchers to harness generative AI models to accelerate scientific discovery. Significant progress has already been made in Argonne’s early AI-driven research. Success stories include mapping the human brain’s 80 billion neurons, high-energy particle physics enhanced by deep learning, and drug design and discovery accelerated by machine learning, among others.

The Aurora supercomputer is an expansive system with 166 racks, 10 624 compute blades, 21 248 Intel Xeon CPU Max Series processors and 63 744 Intel Data Center GPU Max Series units, making it one of the world’s largest GPU clusters. Aurora also includes the largest open, Ethernet-based supercomputing interconnect on a single system, with 84 992 HPE Slingshot fabric endpoints.
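
As a quick sanity check on those counts, the short C++ snippet below works out the per-blade composition they imply. The per-blade ratios are our own arithmetic from the published totals, not an Intel specification.

#include <iostream>

int main() {
    // Published Aurora component counts (from the announcement above).
    const double blades    = 10'624;   // compute blades
    const double cpus      = 21'248;   // Intel Xeon CPU Max Series processors
    const double gpus      = 63'744;   // Intel Data Center GPU Max Series units
    const double endpoints = 84'992;   // HPE Slingshot fabric endpoints

    // Implied per-blade composition (our arithmetic, not an official spec sheet).
    std::cout << "CPUs per blade:      " << cpus / blades      << '\n';  // 2
    std::cout << "GPUs per blade:      " << gpus / blades      << '\n';  // 6
    std::cout << "Endpoints per blade: " << endpoints / blades << '\n';  // 8
    return 0;
}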

The Aurora supercomputer came in second on the high-performance LINPACK (HPL) benchmark, but broke the exascale barrier at 1.012 exaflops using 9 234 nodes, only 87% of the system. It also secured the third spot on the high-performance conjugate gradient (HPCG) benchmark at 5 612 teraflops (TFlop/s) with 39% of the machine.
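
The arithmetic behind those utilisation figures is straightforward. The sketch below reproduces the quoted 87% and, as a back-of-envelope estimate of our own rather than a published figure, the implied HPL throughput per node, taking one node per compute blade.

#include <iostream>

int main() {
    const double total_nodes  = 10'624;   // one node per compute blade
    const double hpl_nodes    = 9'234;    // nodes used in the HPL submission
    const double hpl_exaflops = 1.012;    // sustained HPL result, in exaflops

    // Fraction of the system used for the HPL run (~87%, as quoted above).
    std::cout << "System utilisation: "
              << 100.0 * hpl_nodes / total_nodes << " %\n";

    // Back-of-envelope per-node throughput (our estimate, not a reported result).
    const double teraflops_per_exaflop = 1.0e6;
    std::cout << "Approx. HPL throughput per node: "
              << hpl_exaflops * teraflops_per_exaflop / hpl_nodes << " TFlop/s\n";
    return 0;
}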

The HPCG benchmark assesses more realistic workloads, providing insight into the communication and memory-access patterns that matter in real-world HPC applications. It complements benchmarks like LINPACK by offering a more rounded view of a system’s capabilities.

At the heart of the Aurora supercomputer is the Intel Data Center GPU Max Series. The Intel Xe GPU architecture is foundational to the Max Series, featuring specialised hardware such as matrix and vector compute blocks optimised for both AI and HPC tasks. The Xe architecture’s compute performance is the reason the Aurora supercomputer secured the top spot on the mixed-precision high-performance LINPACK (HPL-MxP) benchmark, which best highlights the importance of AI workloads in HPC.
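
HPL-MxP rewards exactly this kind of mixed-precision hardware: the benchmark solves a dense linear system mostly in low precision and then recovers double-precision accuracy through iterative refinement. The C++ sketch below illustrates that idea on a toy 3 × 3 system, using single precision for the factorisation and double precision for the refinement loop. It is a conceptual illustration only, not the benchmark code or Aurora’s implementation.

#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

// Solve A * x = b in single precision via Gaussian elimination (no pivoting;
// adequate for the small, diagonally dominant demo matrix used below).
std::vector<float> solve_single(std::vector<float> A, std::vector<float> b) {
    const int n = static_cast<int>(b.size());
    for (int k = 0; k < n; ++k) {
        for (int i = k + 1; i < n; ++i) {
            const float m = A[i * n + k] / A[k * n + k];
            for (int j = k; j < n; ++j) A[i * n + j] -= m * A[k * n + j];
            b[i] -= m * b[k];
        }
    }
    std::vector<float> x(n);
    for (int i = n - 1; i >= 0; --i) {
        float s = b[i];
        for (int j = i + 1; j < n; ++j) s -= A[i * n + j] * x[j];
        x[i] = s / A[i * n + i];
    }
    return x;
}

int main() {
    // Toy 3x3 system with known solution x = (1, 2, 3).
    const int n = 3;
    std::vector<double> A = {10, 1, 2,  1, 12, 3,  2, 3, 14};
    std::vector<double> x_true = {1, 2, 3};
    std::vector<double> b(n, 0.0);
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < n; ++j) b[i] += A[i * n + j] * x_true[j];

    // Low-precision copies for the "fast" factorisation and solve step.
    std::vector<float> Af(A.begin(), A.end()), bf(b.begin(), b.end());
    std::vector<float> x0 = solve_single(Af, bf);
    std::vector<double> x(x0.begin(), x0.end());

    // Iterative refinement: residual in double precision, correction from
    // another cheap single-precision solve.
    for (int iter = 0; iter < 3; ++iter) {
        std::vector<double> r(n);
        for (int i = 0; i < n; ++i) {
            r[i] = b[i];
            for (int j = 0; j < n; ++j) r[i] -= A[i * n + j] * x[j];
        }
        std::vector<float> rf(r.begin(), r.end());
        std::vector<float> d = solve_single(Af, rf);
        double err = 0.0;
        for (int i = 0; i < n; ++i) {
            x[i] += d[i];
            err = std::max(err, std::fabs(x[i] - x_true[i]));
        }
        std::cout << "refinement " << iter << ": max error " << err << '\n';
    }
    return 0;
}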

The Xe architecture’s parallel processing capabilities excel at the matrix-vector operations inherent in neural-network computation.

These compute blocks are pivotal in accelerating the matrix operations crucial to deep learning models. Complemented by Intel’s suite of software tools, including the Intel oneAPI DPC++/C++ Compiler, a rich set of performance libraries, and optimised AI frameworks and tools, the Xe architecture fosters an open ecosystem for developers characterised by flexibility and scalability across devices and form factors.
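
As a concrete taste of that ecosystem, the sketch below uses standard SYCL, the programming model behind the Intel oneAPI DPC++/C++ Compiler, to offload a small matrix-vector multiply, the core operation in the neural-network workloads described above, to whichever accelerator is available. It is a minimal illustration of the programming model rather than Aurora production code; real frameworks typically call into tuned libraries such as oneMKL instead of hand-written kernels.

#include <sycl/sycl.hpp>
#include <iostream>

int main() {
    constexpr size_t rows = 4, cols = 3;

    // The default queue targets a GPU when one is present, otherwise the CPU.
    sycl::queue q;
    std::cout << "Running on: "
              << q.get_device().get_info<sycl::info::device::name>() << '\n';

    // Shared (USM) allocations are visible to both host and device.
    float* A = sycl::malloc_shared<float>(rows * cols, q);
    float* x = sycl::malloc_shared<float>(cols, q);
    float* y = sycl::malloc_shared<float>(rows, q);

    for (size_t j = 0; j < cols; ++j) x[j] = 1.0f;
    for (size_t i = 0; i < rows * cols; ++i) A[i] = static_cast<float>(i);

    // Offload the matrix-vector product: one work-item per output row.
    q.parallel_for(sycl::range<1>{rows}, [=](sycl::id<1> i) {
        const size_t row = i[0];
        float sum = 0.0f;
        for (size_t j = 0; j < cols; ++j) sum += A[row * cols + j] * x[j];
        y[row] = sum;
    }).wait();

    for (size_t i = 0; i < rows; ++i)
        std::cout << "y[" << i << "] = " << y[i] << '\n';

    sycl::free(A, q);
    sycl::free(x, q);
    sycl::free(y, q);
    return 0;
}

With the oneAPI toolchain installed, a file like this builds with icpx -fsycl and runs unchanged on CPUs or GPUs, which is the flexibility across devices and form factors referred to above.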