New cloud services to support AI workflows and the launch of a new generation of GeForce RTX GPUs featured yesterday in Nvidia CEO Jensen Huang’s GTC keynote.
“Computing is advancing at incredible speeds. The engine propelling this rocket is accelerated computing, and its fuel is AI,” Huang said during a virtual presentation.
Gamers and creators will get the first GPUs based on the new Nvidia Ada Lovelace architecture.
Enterprises will get powerful new tools for high-performance computing applications with systems based on the Grace CPU and Grace Hopper Superchip. Those building the 3D internet will get new OVX servers powered by Ada Lovelace L40 data centre GPUs. Researchers and computer scientists get new large language model capabilities with the Nvidia NeMo LLM Service. And the auto industry gets Thor, a new brain with an astonishing 2 000 teraflops of performance.
Huang highlighted how Nvidia’s technologies are being put to work by a sweep of major partners and customers across a breadth of industries.
To speed adoption, he announced Deloitte, the world’s largest professional services firm, is bringing new services built on Nvidia AI and Nvidia Omniverse to the world’s enterprises.
GeForce RTX 40 Series GPUs
First out of the blocks at the keynote was the launch of next-generation GeForce RTX 40 Series GPUs powered by Ada, which Huang called a “quantum leap” that paves the way for creators of fully simulated worlds.
Huang gave the audience a taste of what that makes possible by offering up a look at Racer RTX, a fully interactive simulation that’s entirely ray traced, with all the action physically modeled.
Ada’s advancements include a new Streaming Multiprocessor, a new RT Core with twice the ray-triangle intersection throughput, and a new Tensor Core with the Hopper FP8 Transformer Engine and 1.4 petaflops of tensor processing power.
Ada also introduces the latest version of Nvidia DLSS technology, DLSS 3, which uses AI to generate entirely new frames by comparing each new frame with prior frames to understand how a scene is changing. The result: game performance boosted by up to 4x over brute-force rendering.
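DLSS 3’s frame-generation pipeline is proprietary and runs on dedicated optical-flow hardware, but the underlying idea, predicting an in-between frame by warping a rendered frame along estimated per-pixel motion, can be sketched in a few lines of Python (the frames and motion field below are invented for illustration, not Nvidia’s algorithm):

```python
import numpy as np

def generate_intermediate_frame(prev_frame, next_frame, flow):
    """Warp prev_frame halfway along the per-pixel motion field, then
    blend with next_frame. A toy stand-in for flow-based frame
    generation; DLSS 3 uses an optical flow accelerator and a neural
    network instead of this fixed warp-and-blend."""
    h, w = prev_frame.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Move each pixel half of its estimated motion (dy, dx).
    src_y = np.clip(ys - flow[..., 0] // 2, 0, h - 1)
    src_x = np.clip(xs - flow[..., 1] // 2, 0, w - 1)
    warped = prev_frame[src_y, src_x]
    # Blend the warped prediction with the next rendered frame.
    return (warped.astype(np.float32) + next_frame) / 2

# Toy example: a bright 'object' moving 4 pixels right between frames.
prev_frame = np.zeros((8, 8), dtype=np.float32)
next_frame = np.zeros((8, 8), dtype=np.float32)
prev_frame[3, 1] = 1.0
next_frame[3, 5] = 1.0
flow = np.zeros((8, 8, 2), dtype=int)
flow[..., 1] = 4  # every pixel estimated to move 4 px to the right
mid = generate_intermediate_frame(prev_frame, next_frame, flow)
print(mid[3, 3])  # the object shows up near the halfway position
```

A real implementation also has to resolve occlusions and disocclusions, which is where the neural network, rather than a fixed blend, earns its keep.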
DLSS 3 has received support from many of the world’s leading game developers, with more than 35 games and applications announcing support. “DLSS 3 is one of our greatest neural rendering inventions,” Huang said.
Together, Huang said, these innovations help deliver 4x more processing throughput with the new GeForce RTX 4090 versus its forerunner, the RTX 3090 Ti.
Additionally, the new GeForce RTX 4080 is launching in November with two configurations.
The GeForce RTX 4080 16GB has 9 728 CUDA cores and 16GB of Micron GDDR6X memory, and is priced at $1 199. The GeForce RTX 4080 12GB has 7 680 CUDA cores and 12GB of GDDR6X memory, and with DLSS 3 is faster than the RTX 3090 Ti, the previous-generation flagship GPU. It’s priced at $899.
Huang also announced that Nvidia Lightspeed Studios used Omniverse to reimagine Portal. With Nvidia RTX Remix, an AI-assisted toolset, users can mod their favourite games, enabling them to up-res textures and assets, and give materials physically accurate properties.
H100 GPU in full production
Once more tying systems and software to broad technology trends, Huang explained that large language models, or LLMs, and recommender systems are the two most important AI models today.
Recommenders “run the digital economy,” powering everything from e-commerce to entertainment to advertising, he said. “They’re the engines behind social media, digital advertising, e-commerce and search.”
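Under the hood, many recommenders rank items by comparing learned user and item embeddings. A minimal dot-product sketch of that scoring step (the embeddings and item labels here are made up for illustration, not from any production system):

```python
import numpy as np

# Hypothetical learned embeddings: 3 users and 4 items in a 2-D latent space.
user_embeddings = np.array([[1.0, 0.0],
                            [0.0, 1.0],
                            [0.6, 0.8]])
item_embeddings = np.array([[0.9, 0.1],   # item 0: action film
                            [0.1, 0.9],   # item 1: documentary
                            [0.5, 0.5],   # item 2: dramedy
                            [0.8, 0.0]])  # item 3: thriller

# Score every item for every user with a dot product, then rank.
scores = user_embeddings @ item_embeddings.T      # shape (3, 4)
top_item_per_user = np.argmax(scores, axis=1)
print(top_item_per_user)                          # each user's best match
```

Production systems add candidate retrieval, feature crosses and deep ranking networks on top, but the embedding-similarity core is why recommenders are so hungry for fast memory and matrix throughput.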
And large language models, based on the transformer deep-learning model first introduced in 2017, are now among the most vibrant areas of AI research, able to learn to understand human language without supervision or labeled datasets.
“A single pre-trained model can perform multiple tasks, like question answering, document summarization, text generation, translation and even software programming,” Huang said.
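The mechanism behind these models is self-attention, in which every token weighs every other token when building its representation. A minimal numpy sketch of the scaled dot-product attention at the heart of the 2017 transformer paper (toy random matrices, not a trained model):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V,
    the core operation of the transformer architecture."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over each row: how strongly each token attends to the others.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
tokens, d_k = 4, 8                  # 4 toy 'tokens', 8-dim embeddings
Q = rng.standard_normal((tokens, d_k))
K = rng.standard_normal((tokens, d_k))
V = rng.standard_normal((tokens, d_k))
output, weights = scaled_dot_product_attention(Q, K, V)
print(output.shape)                 # one context-mixed vector per token
print(weights.sum(axis=-1))         # each attention row sums to 1
```

It is exactly this matrix-multiply-heavy operation, repeated across dozens of layers and heads, that hardware like the Hopper Transformer Engine is built to accelerate.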
Delivering the computing muscle needed to power these enormous models, Huang said the Nvidia H100 Tensor Core GPU, with Hopper’s next-generation Transformer Engine, is in full production, with systems shipping in the coming weeks.
“Hopper is in full production and coming soon to power the world’s AI factories,” Huang said.
Partners building systems include Atos, Cisco, Dell Technologies, Fujitsu, Gigabyte, Hewlett Packard Enterprise, Lenovo and Supermicro. And Amazon Web Services, Google Cloud, Microsoft Azure and Oracle Cloud Infrastructure will be among the first to deploy H100-based instances in the cloud starting next year.
Grace Hopper, which combines Nvidia’s Arm-based Grace data centre CPU with Hopper GPUs and offers a seven-fold increase in fast-memory capacity, will deliver a “giant leap” for recommender systems, Huang said. Systems incorporating Grace Hopper will be available in the first half of 2023.
L40 data centre GPUs in full production
The next evolution of the internet, called the metaverse, will extend today’s web into 3D, Huang explained. Omniverse is Nvidia’s platform for building and running metaverse applications.
Here, too, Huang explained how connecting and simulating these worlds will require powerful, flexible new computers. And Nvidia OVX servers are built for scaling out metaverse applications.
Nvidia’s second-generation OVX systems will be powered by Ada Lovelace L40 data centre GPUs, which are now in full production, Huang announced.
Thor for autonomous vehicles, robotics, medical instruments and more
In today’s vehicles, active safety, parking, driver monitoring, camera mirrors, cluster and infotainment are driven by different computers. In the future, they’ll be delivered by software that improves over time, running on a centralized computer, Huang said.
To power this, Huang introduced Drive Thor, which combines the transformer engine of Hopper, the GPU of Ada, and the amazing CPU of Grace.
The new Thor superchip delivers 2 000 teraflops of performance, replacing Atlan on the DRIVE roadmap, and providing a seamless transition from DRIVE Orin, which has 254 TOPS of performance and is currently in production vehicles. Thor will be the processor for robotics, medical instruments, industrial automation and edge AI systems, Huang said.
3,5m developers, 3 000 accelerated applications
Bringing Nvidia’s systems and silicon, and the benefits of accelerated computing, to industries around the world, is a software ecosystem with more than 3,5-million developers creating some 3 000 accelerated apps using Nvidia’s 550 software development kits, or SDKs, and AI models, Huang announced.
And it’s growing fast. Over the past 12 months, Nvidia has updated more than 100 SDKs and introduced 25 new ones.
“New SDKs increase the capability and performance of systems our customers already own, while opening new markets for accelerated computing,” Huang said.
New services for AI, virtual worlds
Large language models “are the most important AI models today,” Huang said. Based on the transformer architecture, these giant models can learn to understand meanings and languages without supervision or labeled datasets, unlocking remarkable new capabilities.
To make it easier for researchers to apply this “incredible” technology to their work, Huang announced the NeMo LLM Service, an Nvidia-managed cloud service to adapt pretrained LLMs to perform specific tasks.
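Services like this typically adapt a frozen pretrained model with techniques such as prompt learning, where a small set of “soft prompt” embeddings is trained while the base weights stay untouched. A toy numpy illustration of the idea (the tiny “model” and all values below are invented for the example, not NeMo’s implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8                                     # embedding dimension
frozen_W = rng.standard_normal((d, d))    # stand-in for frozen LLM weights

def frozen_model(token_embeddings):
    """Pretend pretrained model: a fixed linear map plus mean pooling."""
    return np.tanh(token_embeddings @ frozen_W).mean(axis=0)

task_input = rng.standard_normal((5, d))  # 5 'token' embeddings
soft_prompt = np.zeros((2, d))            # 2 trainable prompt vectors

# Adapting means tuning only soft_prompt (one illustrative nudge here),
# then prepending it to the input; frozen_W is never modified.
soft_prompt += 0.1
adapted_out = frozen_model(np.vstack([soft_prompt, task_input]))
print(adapted_out.shape)  # task-adapted output, base weights untouched
```

The appeal for a managed service is that each customer task needs only a few kilobytes of prompt parameters, while a single copy of the giant base model is shared.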
To accelerate the work of drug and bioscience researchers, Huang also announced BioNeMo LLM, a service to create LLMs that understand chemicals, proteins, DNA and RNA sequences.
Huang announced that Nvidia is working with The Broad Institute, the world’s largest producer of human genomic information, to make Nvidia Clara libraries, such as Nvidia Parabricks, the Genome Analysis Toolkit, and BioNeMo, available on Broad’s Terra Cloud Platform.
Huang also detailed Nvidia Omniverse Cloud, an infrastructure-as-a-service that connects Omniverse applications running in the cloud, on premises or on a device.
New Omniverse containers – Replicator for synthetic data generation, Farm for scaling render farms, and Isaac Sim for building and training AI robots – are now available for cloud deployment, Huang announced.
New Jetson Orin Nano for robotics
Shifting from virtual worlds to machines that will move through their world, robotic computers “are the newest types of computers,” Huang said, describing Nvidia’s second-generation processor for robotics, Orin, as a home run.
To bring Orin to more markets, he announced the Jetson Orin Nano, a tiny robotics computer that is 80x faster than the previous super-popular Jetson Nano.
Jetson Orin Nano runs the Nvidia Isaac robotics stack and features the GPU-accelerated ROS 2 framework. Nvidia Isaac Sim, a robotics simulation platform, is available in the cloud.
And for robotics developers using AWS RoboMaker, Huang announced that containers for the Nvidia Isaac platform for robotics development are in the AWS marketplace.
New tools for video, image services
Most of the world’s internet traffic is video, and user-generated video streams will be increasingly augmented by AI special effects and computer graphics, Huang explained.
“Avatars will do computer vision, speech AI, language understanding and computer graphics in real time and at cloud scale,” Huang said.
To make new innovations at the intersection of real-time graphics, AI and communications possible, Huang announced Nvidia has been building acceleration libraries such as CV-CUDA, a cloud runtime engine called the Unified Computing Framework (UCF), the Omniverse ACE Avatar Cloud Engine, and a sample application called Tokkio for customer service avatars.
Deloitte to bring AI, omniverse services to enterprises
And to speed adoption of all these technologies, Deloitte, the world’s largest professional services firm, is bringing new services built on Nvidia AI and Nvidia Omniverse to the world’s enterprises, Huang announced.
He said that Deloitte’s professionals will help the world’s enterprises use Nvidia application frameworks to build modern multi-cloud applications for customer service, cybersecurity, industrial automation, warehouse and retail automation and more.
You can watch Huang’s keynote here: