Amazon Web Services has announced the next generation of two AWS-designed chip families – AWS Graviton4 and AWS Trainium2 – delivering advancements in price performance and energy efficiency for a broad range of customer workloads, including machine learning (ML) training and generative artificial intelligence (AI) applications.
Graviton4 provides up to 30% better compute performance, 50% more cores, and 75% more memory bandwidth than current generation Graviton3 processors, delivering the best price performance and energy efficiency for a broad range of workloads running on Amazon EC2.
Trainium2 is designed to deliver up to 4x faster training than first generation Trainium chips, while improving energy efficiency up to 2x. It can be deployed in EC2 UltraClusters of up to 100,000 chips, making it possible to train foundation models (FMs) and large language models (LLMs) in a fraction of the time.
“Silicon underpins every customer workload, making it a critical area of innovation for AWS,” says David Brown, vice-president of compute and networking at AWS. “By focusing our chip designs on real workloads that matter to customers, we’re able to deliver the most advanced cloud infrastructure to them.
“Graviton4 marks the fourth generation we’ve delivered in just five years, and is the most powerful and energy-efficient chip we have ever built for a broad range of workloads. And with the surge of interest in generative AI, Trainium2 will help customers train their ML models faster, at a lower cost, and with better energy efficiency.”
Currently, AWS offers more than 150 different Graviton-powered Amazon EC2 instance types globally at scale, has built more than two million Graviton processors, and has more than 50 000 customers – including the top 100 EC2 customers – using Graviton-based instances.
Graviton is supported by AWS managed services including Amazon Aurora, Amazon ElastiCache, Amazon EMR, Amazon MemoryDB, Amazon OpenSearch, Amazon Relational Database Service (Amazon RDS), AWS Fargate, and AWS Lambda, bringing Graviton’s price performance benefits to users of those services.
Beyond the compute, core-count and memory-bandwidth gains over Graviton3, Graviton4 also fully encrypts all high-speed physical hardware interfaces. Graviton4 will be available in memory-optimised Amazon EC2 R8g instances, enabling customers to get more out of their high-performance databases, in-memory caches, and big data analytics workloads.
R8g instances offer larger instance sizes with up to 3x more vCPUs and 3x more memory than current generation R7g instances. This allows customers to process larger amounts of data, scale their workloads, improve time-to-results, and lower their total cost of ownership. Graviton4-powered R8g instances are available today in preview, with general availability planned in the coming months.
The FMs and LLMs behind today’s emerging generative AI applications are trained on massive datasets. The most advanced FMs and LLMs today range from hundreds of billions to trillions of parameters, requiring reliable high-performance compute capacity capable of scaling across tens of thousands of ML chips.
Trainium2 chips are purpose-built for high performance training of FMs and LLMs with up to trillions of parameters. Trainium2 is designed to deliver up to 4x faster training performance and 3x more memory capacity compared to first generation Trainium chips, while improving energy efficiency (performance/watt) up to 2x.
Trainium2 will be available in Amazon EC2 Trn2 instances, containing 16 Trainium chips in a single instance.
Trn2 instances are intended to let customers scale up to 100 000 Trainium2 chips in next generation EC2 UltraClusters, interconnected with AWS Elastic Fabric Adapter (EFA) petabit-scale networking. The result is up to 65 exaflops of compute, giving customers on-demand access to supercomputer-class performance.
With this level of scale, customers can train a 300-billion parameter LLM in weeks versus months.
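The figures above can be sanity-checked with a rough back-of-envelope sketch. The cluster size (100 000 chips), aggregate compute (65 exaflops) and model size (300 billion parameters) come from the article; the training-token count and sustained utilization below are illustrative assumptions, and the widely used ~6·N·D FLOPs rule of thumb for transformer training is likewise an outside approximation, not an AWS figure:

```python
# Back-of-envelope training-time estimate. Cluster and model figures are
# from the article; TOKENS and UTILIZATION are hypothetical assumptions.

CLUSTER_PEAK_FLOPS = 65e18   # 65 exaflops across one UltraCluster (article)
NUM_CHIPS = 100_000          # Trainium2 chips per UltraCluster (article)
PARAMS = 300e9               # 300-billion-parameter LLM (article)

TOKENS = 10e12               # assumed 10-trillion-token training dataset
UTILIZATION = 0.30           # assumed 30% sustained hardware utilization

# Peak throughput implied per chip.
per_chip_flops = CLUSTER_PEAK_FLOPS / NUM_CHIPS

# ~6 FLOPs per parameter per token is a common transformer-training estimate.
total_flops = 6 * PARAMS * TOKENS

seconds = total_flops / (CLUSTER_PEAK_FLOPS * UTILIZATION)
days = seconds / 86_400

print(f"peak per chip: {per_chip_flops / 1e12:.0f} TFLOP/s")
print(f"estimated wall-clock: {days:.0f} days")
```

Under these assumptions the implied per-chip peak is 650 TFLOP/s and the run completes in roughly a week and a half, which is consistent with the article's "weeks versus months" claim; different token counts or utilization figures would shift the estimate accordingly.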