IBM has announced the imminent general availability of IBM Spyre Accelerator – an AI accelerator enabling low-latency inferencing to support generative and agentic AI use cases while prioritising the security and resilience of core workloads.

Earlier this year, IBM announced the Spyre Accelerator would be available in IBM z17, LinuxONE 5, and Power11 systems. The company says Spyre will now be generally available on 28 October for IBM z17 and LinuxONE 5 systems, and in early December for Power11 servers.

Today’s IT landscape is shifting from traditional logic workflows to agentic AI inferencing.

AI agents require low-latency inference and real-time system responsiveness. IBM recognised the need for mainframes and servers to run AI models alongside the most demanding enterprise workloads without compromising on throughput. Meeting that demand requires AI inferencing hardware that supports generative and agentic AI while maintaining the security and resilience of core data, transactions, and applications.

The accelerator is also built to enable clients to keep mission-critical data on-prem to mitigate risk while improving operational and energy efficiency.

Initially introduced as a prototype chip, Spyre was refined through rapid iteration, including cluster deployments at IBM’s Yorktown Heights campus, and with collaborators like the University at Albany’s Center for Emerging Artificial Intelligence Systems.

The IBM Research prototype has evolved into an enterprise-grade product for use in IBM Z, LinuxONE, and Power systems.

Today, the Spyre Accelerator is a commercial system-on-a-chip with 32 individual accelerator cores and 25.6 billion transistors. Produced using 5nm node technology, each Spyre is mounted on a 75-watt PCIe card, making it possible to cluster up to 48 cards in an IBM Z or LinuxONE system, or 16 cards in an IBM Power system, to scale AI capabilities.

“One of our key priorities has been advancing infrastructure to meet the demands of new and emerging AI workloads,” says Barry Baker, COO, IBM Infrastructure & GM, IBM Systems. “With the Spyre Accelerator, we’re extending the capabilities of our systems to support multi-model AI – including generative and agentic AI. This innovation positions clients to scale their AI-enabled mission-critical workloads with uncompromising security, resilience, and efficiency while unlocking the value of their enterprise data.”

Mukesh Khare, GM of IBM Semiconductors and vice-president of Hybrid Cloud at IBM, adds: “We launched the IBM Research AI Hardware Centre in 2019 with a mission to meet the rising computational demands of AI, even before the surge in LLMs and AI models we’ve recently seen. Now, amid increasing demand for advanced AI capabilities, we’re proud to see the first chip from the centre enter commercialisation, designed to deliver improved performance and productivity to IBM’s mainframe and server clients.”