Enterprises are rapidly adopting generative AI (GenAI), large language models (LLMs), advanced graphics, and digital twins to increase operational efficiencies, reduce costs, and drive innovation.

However, to adopt these technologies effectively, enterprises need access to state-of-the-art, full-stack accelerated computing platforms.

To meet this demand, Oracle Cloud Infrastructure (OCI) has launched Nvidia L40S GPU bare-metal instances, which are available to order, and announced the upcoming availability of a new virtual machine accelerated by a single Nvidia H100 Tensor Core GPU. This new VM expands OCI’s existing H100 portfolio, which includes an Nvidia HGX H100 8-GPU bare-metal instance.

Paired with Nvidia networking and running the Nvidia software stack, these platforms deliver powerful performance and efficiency, enabling enterprises to advance GenAI.

The Nvidia L40S is a universal data centre GPU designed to deliver breakthrough multi-workload acceleration for generative AI, graphics and video applications. Equipped with fourth-generation Tensor Cores and support for the FP8 data format, the L40S GPU excels in training and fine-tuning small- to mid-size LLMs and in inference across a wide range of generative AI use cases.

For example, a single L40S GPU (FP8) can generate up to 1.4x more tokens per second than a single Nvidia A100 Tensor Core GPU (FP16) for Llama 3 8B with Nvidia TensorRT-LLM at an input and output sequence length of 128.

The L40S GPU also has best-in-class graphics and media acceleration. Its third-generation Nvidia Ray Tracing Cores (RT Cores) and multiple encode/decode engines make it ideal for advanced visualisation and digital twin applications.

The L40S GPU delivers up to 3.8x the realtime ray-tracing performance of its predecessor and supports Nvidia DLSS 3 for faster rendering and smoother frame rates. This makes the GPU ideal for developing applications on the Nvidia Omniverse platform, enabling realtime, photo-realistic 3D simulations and AI-enabled digital twins.

With Omniverse on the L40S GPU, enterprises can develop advanced 3D applications and workflows for industrial digitalisation that will allow them to design, simulate and optimise products, processes, and facilities in realtime before going into production.

OCI will offer the L40S GPU in its BM.GPU.L40S.4 bare-metal compute shape, featuring four Nvidia L40S GPUs, each with 48GB of GDDR6 memory. This shape includes local NVMe drives with 7.38TB capacity, 4th Generation Intel Xeon CPUs with 112 cores and 1TB of system memory.
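
As a rough illustration of what these figures mean in practice, the sketch below adds up the GPU memory in the shape and estimates how much of it a model's weights alone would occupy. It is a back-of-the-envelope calculation, not a sizing tool from Oracle or Nvidia: the `Shape` class and `weights_gb` helper are hypothetical names introduced here, the 8-billion-parameter figure is a round-number assumption for Llama 3 8B, and the estimate ignores activations, KV cache and framework overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Shape:
    """Hypothetical container for the GPU figures quoted in the article."""
    name: str
    gpus: int
    gpu_mem_gb: int  # memory per GPU, in GB

    @property
    def total_gpu_mem_gb(self) -> int:
        # Aggregate GPU memory across all GPUs in the shape.
        return self.gpus * self.gpu_mem_gb


def weights_gb(params_billions: float, bytes_per_param: int) -> float:
    """Approximate size of model weights only, in GB.

    1e9 parameters at N bytes each is N GB (taking 1 GB as 1e9 bytes).
    """
    return params_billions * bytes_per_param


# Figures from the article: four L40S GPUs with 48GB of GDDR6 each.
L40S_SHAPE = Shape(name="BM.GPU.L40S.4", gpus=4, gpu_mem_gb=48)

if __name__ == "__main__":
    params_b = 8  # assumption: roughly 8 billion parameters
    for label, nbytes in (("FP8", 1), ("FP16", 2)):
        size = weights_gb(params_b, nbytes)
        fits_one_gpu = size <= L40S_SHAPE.gpu_mem_gb
        print(f"{label}: ~{size:.0f} GB of weights, "
              f"fits on one 48GB GPU: {fits_one_gpu}")
    print(f"Total GPU memory in {L40S_SHAPE.name}: "
          f"{L40S_SHAPE.total_gpu_mem_gb} GB")
```

Under these assumptions, the weights of an 8-billion-parameter model take about 8GB at FP8 or 16GB at FP16, well within a single 48GB L40S, while the shape as a whole offers 192GB of GPU memory.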

With OCI’s bare-metal compute architecture, these shapes eliminate virtualisation overhead for high-throughput and latency-sensitive AI or machine learning workloads. The accelerated compute shape features the Nvidia BlueField-3 DPU, which improves server efficiency by offloading data centre tasks from CPUs to accelerate networking, storage, and security workloads. The use of BlueField-3 DPUs furthers OCI’s strategy of off-box virtualisation across its entire fleet.