At its annual Intel Vision 2024 conference, the chipmaker introduced the Intel Gaudi 3 accelerator to bring performance, openness and choice to enterprise generative AI (GenAI) – along with a suite of new open scalable systems, next-gen products, and strategic collaborations to accelerate GenAI adoption.
With only 10% of enterprises successfully moving GenAI projects into production last year, Intel says its latest offerings address the challenges businesses face in scaling AI initiatives.
“Innovation is advancing at an unprecedented pace, all enabled by silicon – and every company is quickly becoming an AI company,” says Intel CEO Pat Gelsinger. “Intel is bringing AI everywhere across the enterprise – from the PC to the data centre to the edge. Our latest Gaudi, Xeon and Core Ultra platforms are delivering a cohesive set of flexible solutions tailored to meet the changing needs of our customers and partners and capitalize on the immense opportunities ahead.”
Enterprises are looking to scale GenAI from pilot to production. To do so, they need readily available solutions, built on performant and cost- and energy-efficient processors like the Intel Gaudi 3 AI accelerator, that also address complexity, fragmentation, data security, and compliance requirements.
The Intel Gaudi 3 AI accelerator will power AI systems with up to tens of thousands of accelerators connected through the common standard of Ethernet. Intel Gaudi 3 promises 4x more AI compute for BF16 and a 1.5x increase in memory bandwidth over its predecessor. The accelerator will deliver a significant leap in AI training and inference for global enterprises looking to deploy GenAI at scale.
Compared with the Nvidia H100, Intel Gaudi 3 is projected to deliver 50% faster time-to-train on average across the 7B and 13B parameter Llama 2 models and the 175B parameter GPT-3 model. Intel Gaudi 3 inference throughput is also projected to outperform the H100 by 50% on average, with 40% better inference power efficiency, averaged across the 7B and 70B parameter Llama models and the 180B parameter Falcon model.
Intel Gaudi 3 provides open, community-based software and industry-standard Ethernet networking. And it allows enterprises to scale flexibly from a single node to clusters, super-clusters and mega-clusters with thousands of nodes, supporting inference, fine-tuning and training at the largest scale.
Intel Gaudi 3 will be available to OEMs – including Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro – in the second quarter of 2024.
At Vision 2024, Intel also outlined its strategy for open scalable AI systems including hardware, software, frameworks and tools. Intel’s approach enables a broad, open ecosystem of AI players to offer solutions that satisfy enterprise-specific GenAI needs. This includes equipment manufacturers, database providers, systems integrators, software and service providers, and others. It also allows enterprises to use the ecosystem partners and solutions that they already know and trust.
Intel also announced collaborations with Google Cloud, Thales and Cohesity to leverage Intel’s confidential computing capabilities in their cloud instances. This includes Intel Trust Domain Extensions (Intel TDX), Intel Software Guard Extensions (Intel SGX) and Intel’s attestation service. Customers can run their AI models and algorithms in a trusted execution environment (TEE) and leverage Intel’s trust services for independently verifying the trustworthiness of these TEEs.
In collaboration with Anyscale, Articul8, DataStax, Domino, Hugging Face, KX Systems, MariaDB, MinIO, Qdrant, Red Hat, Redis, SAP, VMware, Yellowbrick, and Zilliz, Intel announced its intention to create an open platform for enterprise AI. The industrywide effort aims to develop open, multivendor GenAI systems that deliver best-in-class ease of deployment, performance and value, enabled by retrieval-augmented generation (RAG). RAG allows enterprises’ vast existing proprietary data sources, running on standard cloud infrastructure, to be augmented with open LLM capabilities, accelerating GenAI use in enterprises.
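The RAG pattern described above can be sketched in a few lines. This is a minimal, framework-agnostic illustration, not any vendor's implementation: the relevance scoring here is a toy word-overlap measure standing in for the vector-embedding search a production system would use, and the assembled prompt would normally be passed to an LLM rather than returned.

```python
# Minimal sketch of the retrieval-augmented generation (RAG) pattern:
# retrieve the documents most relevant to a query from a proprietary
# corpus, then prepend them as context to the prompt sent to an LLM.
# The scoring below is a toy bag-of-words overlap, not a real embedding.

def score(query, doc):
    """Toy relevance score: fraction of query words present in the doc."""
    q_words = set(query.lower().split())
    d_words = set(doc.lower().split())
    return len(q_words & d_words) / max(len(q_words), 1)

def retrieve(query, corpus, k=2):
    """Return the top-k documents by relevance score."""
    return sorted(corpus, key=lambda d: score(query, d), reverse=True)[:k]

def build_prompt(query, corpus):
    """Augment the user query with retrieved context before generation."""
    context = "\n".join(retrieve(query, corpus))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical in-house documents standing in for enterprise data.
corpus = [
    "Gaudi 3 connects accelerators over standard Ethernet.",
    "Quarterly expense reports are filed through the finance portal.",
    "Xeon 6 with E-cores targets efficiency workloads.",
]
prompt = build_prompt("How does Gaudi 3 connect accelerators?", corpus)
```

The value of the pattern is that the model answers from retrieved enterprise data rather than from its training set alone, which is why it pairs naturally with data already sitting on standard cloud infrastructure.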
As initial steps in this effort, Intel will release reference implementations for GenAI pipelines on secure Intel Xeon and Gaudi-based solutions, publish a technical conceptual framework, and continue to add infrastructure capacity in the Intel Tiber Developer Cloud for ecosystem development and validation of RAG and future pipelines.
Intel is encouraging additional ecosystem partners to join this open effort to facilitate enterprise adoption, broaden solution coverage and accelerate business results.
Intel also provided updates on its next-generation products and services across all segments of enterprise AI.
New Intel Xeon 6 Processors: Intel Xeon processors offer performance-efficient solutions to run current GenAI solutions, including RAG, that produce business-specific results using proprietary data. Intel introduced the new brand for its next-generation processors for data centres, cloud and edge: Intel Xeon 6. Intel Xeon 6 processors with new Efficient-cores (E-cores) will deliver exceptional efficiency and launch this quarter, while Intel Xeon 6 with Performance-cores (P-cores) will offer increased AI performance and launch soon after the E-core processors.
Intel Xeon 6 processors with E-cores (codenamed Sierra Forest): 2.4x performance-per-watt improvement and 2.7x better rack density compared with 2nd Gen Intel Xeon processors. Customers can replace older systems at a ratio of nearly three-to-one, drastically lowering energy consumption and helping meet sustainability goals.
Intel Xeon 6 processors with P-cores (codenamed Granite Rapids): Incorporate software support for the MXFP4 data format, which reduces next-token latency by up to 6.5x versus 4th Gen Intel Xeon processors using FP16, with the ability to run 70-billion-parameter Llama 2 models.
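To see why a 4-bit format like MXFP4 can cut latency versus FP16, it helps to look at the idea behind block-scaled microscaling formats: a block of values shares one power-of-two scale, and each value is stored as a 4-bit FP4 (E2M1) number, quartering the memory traffic per weight. The sketch below is an illustrative simplification (block handling, rounding mode and saturation details are assumptions, not Intel's implementation):

```python
# Illustrative sketch of block-scaled 4-bit quantization in the spirit
# of MXFP4: each block shares a power-of-two scale, and each value is
# rounded to the nearest FP4 (E2M1) representable magnitude.
import math

# The eight non-negative magnitudes representable in FP4 E2M1.
FP4_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(block):
    """Quantize one block of floats to (shared_scale, fp4_values)."""
    amax = max(abs(x) for x in block)
    if amax == 0:
        return 1.0, [0.0] * len(block)
    # Choose a power-of-two scale aligning amax with FP4's max exponent.
    scale = 2.0 ** (math.floor(math.log2(amax)) - 2)
    q = []
    for x in block:
        mag = min(abs(x) / scale, 6.0)  # saturate at the FP4 maximum
        nearest = min(FP4_MAGNITUDES, key=lambda m: abs(m - mag))
        q.append(math.copysign(nearest, x))
    return scale, q

def dequantize_block(scale, q):
    """Reconstruct approximate floats from the shared scale and FP4 values."""
    return [scale * v for v in q]
```

Because each stored value is only 4 bits plus a small shared scale per block, memory bandwidth per token drops sharply relative to 16-bit weights, which is the main lever behind the latency claim.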
Intel also announced momentum for client and updates to its roadmap for edge and connectivity including:
- Intel Core Ultra processors are powering new capabilities for productivity, security and content creation, giving businesses strong motivation to refresh their PC fleets. Intel expects to ship 40 million AI PCs in 2024, with more than 230 designs, from ultra-thin PCs to handheld gaming devices.
- Next-generation Intel Core Ultra client processor family (codenamed Lunar Lake), launching in 2024, will have more than 100 platform tera operations per second (TOPS) and more than 45 neural processing unit (NPU) TOPS for next-generation AI PCs.
Intel announced new edge silicon across the Intel Core Ultra, Intel Core and Intel Atom processor and Intel Arc graphics processing unit (GPU) families of products, targeting key markets including retail, industrial manufacturing, and healthcare. All new additions to Intel’s edge AI portfolio will be available this quarter and will be supported by the Intel Tiber Edge Platform this year.
Through the Ultra Ethernet Consortium (UEC), Intel is leading open Ethernet networking for AI fabrics, introducing an array of AI-optimised Ethernet solutions.
Designed to transform large scale-up and scale-out AI fabrics, these innovations enable training and inferencing for increasingly vast models, with sizes expanding by an order of magnitude in each generation.
The lineup includes the Intel AI NIC, AI connectivity chiplets for integration into XPUs, Gaudi-based systems, and a range of soft and hard reference AI interconnect designs for Intel Foundry.
At Vision 2024, the company also unveiled the Intel Tiber portfolio of business solutions to streamline the deployment of enterprise software and services, including for GenAI.
A unified experience makes it easier for enterprise customers and developers to find solutions that fit their needs, accelerate innovation, and unlock value without compromising on security, compliance, or performance.
Customers can begin exploring the Intel Tiber portfolio now, with a full rollout planned for the third quarter of 2024.