High-performance computing (HPC) systems, including supercomputers, outpace every other class of computing in calculation speed by parallelising work across many processors.
HPC has long been an integral tool across critical industries, from facilitating engineering modeling to predicting the weather, writes Sam Dale, senior technology analyst at IDTechEx. The AI boom has intensified development in the sector, growing the capabilities of hardware technologies, including accelerators, interconnects, and memory, at astounding rates.
A new report from IDTechEx, “Hardware for HPC and AI 2025-2035: Technologies, Markets, Forecasts”, predicts that the growing number of use cases for AI and machine learning, together with the scale of the largest AI models, will drive significant growth in this already-large market. The report forecasts an HPC hardware market of $581-billion by 2035, growing at a CAGR of 13.5% from this year.
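To make the forecast arithmetic concrete, the short Python sketch below backs out the starting market size implied by the two figures quoted above. It is an illustration only: the 10-year compounding period (2025 to 2035) is an assumption taken from the report title, the function names are hypothetical, and the implied starting value is not a figure published by IDTechEx.

```python
def implied_start_value(end_value: float, cagr: float, years: int) -> float:
    """Back out the starting value implied by an end value and a CAGR."""
    return end_value / (1.0 + cagr) ** years


def project(start_value: float, cagr: float, years: int) -> list[float]:
    """Compound start_value forward; returns one value per year, year 0 included."""
    return [start_value * (1.0 + cagr) ** n for n in range(years + 1)]


END_VALUE_BN = 581.0  # from the article: $581-billion by 2035
CAGR = 0.135          # from the article: 13.5% per year
YEARS = 10            # assumption: 2025 to 2035, as in the report title

start_bn = implied_start_value(END_VALUE_BN, CAGR, YEARS)
trajectory = project(start_bn, CAGR, YEARS)

# Print the implied year-by-year path from the assumed base year.
for offset, value in enumerate(trajectory):
    print(f"{2025 + offset}: ${value:,.0f}bn")
```

Under these assumptions the implied starting market is roughly $164-billion, which means the forecast amounts to about a 3.5-fold increase over the decade.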
The largest exascale HPC systems in the world, which can perform over one quintillion high-precision floating-point operations per second, are the preserve of government-operated national labs of global superpowers. However, private corporations are not far behind, with cloud providers like Microsoft Azure possessing supercomputers in the upper echelons of performance. Growth here is only expected to accelerate as the deployment of AI widens.
IDTechEx’s report examines the key hardware challenges in HPC, which revolve around processors, memory and storage, advanced packaging, interconnects, and thermal management.
However, substantial progress has also taken place in the CPU market, with core counts approaching 200 in chips like AMD’s Turin processors. Integrated heterogeneous systems, in which GPUs and CPUs are integrated into the same package, are also being deployed in the world’s most powerful supercomputers, including the US’s El Capitan and China’s Tianhe-3.
In the memory space, the adoption of high-bandwidth memory (HBM) has been crucial to the growing capabilities of accelerators, with approximately 95% of accelerators in HPC now employing the technology. However, the high energy consumption of today’s memory choices is driving research into next-generation memory technologies such as selector-only memory (SOM), phase-change memory (PCRAM) and magnetoresistive RAM (MRAM).
The quest for high-capacity storage that meets the increasing performance requirements of HPC workloads is seeing the emergence of high-density QLC SSDs at a lower cost point than the typical TLC SSDs on the market.
Advanced packaging is a key enabler of all these chip technologies, with the 3D stacking of memory on top of logic being the best approach to achieve ultra-high bandwidths by shrinking on-device interconnect distances.
Advanced packaging techniques are also highly important for the roadmaps of the interconnects between HPC nodes, as future iterations of low-latency networks like Nvidia’s InfiniBand and HPE’s Slingshot move towards co-packaged optics to further reduce the distances data must travel at the chip scale and speed up throughput overall.
Of course, these advances generate large amounts of heat, so thermal management is key. Despite immersion cooling’s strong showing in both operational expense and cooling performance, cold plate cooling is expected to remain the technology of choice in most cases as HPC builders seek to reduce the massive capital expenses that come with building these enormous interconnected machines.