The latest AI models developed by Microsoft, including the Phi-3 family of small language models, are being optimized to run on Nvidia GPUs and made available as Nvidia NIM inference microservices.

Other microservices developed by Nvidia, such as the cuOpt route optimization AI, are regularly added to Microsoft Azure Marketplace as part of the Nvidia AI Enterprise software platform.

In addition to these AI technologies, Nvidia and Microsoft are delivering a growing set of optimizations and integrations for developers creating high-performance AI apps for PCs powered by Nvidia GeForce RTX and Nvidia RTX GPUs.

Building on the progress shared at Nvidia GTC, the two companies are furthering this ongoing collaboration at Microsoft Build, an annual developer event, taking place this year in Seattle through May 23.

Microsoft is expanding its family of Phi-3 open small language models, adding Phi-3-small (7 billion parameters) and Phi-3-medium (14 billion parameters) alongside Phi-3-mini, which has 3.8 billion parameters. It’s also introducing a new 4.2-billion-parameter multimodal model, Phi-3-vision, that supports both images and text.

All of these models are GPU-optimized with Nvidia TensorRT-LLM and available as Nvidia NIMs, which are accelerated inference microservices with a standard application programming interface (API) that can be deployed anywhere.
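To make the "standard API" concrete: NIM endpoints expose an OpenAI-compatible chat-completions interface, so a request can be assembled the same way regardless of where the microservice is deployed. A minimal sketch follows; the endpoint URL, model identifier, and API key are placeholders, not real values.

```python
import json

# Hypothetical NIM endpoint -- substitute the URL of your own deployment.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_nim_request(model: str, prompt: str, api_key: str):
    """Assemble an OpenAI-style chat-completion request for a NIM endpoint."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }
    return NIM_URL, headers, json.dumps(body).encode()

# Example: target a Phi-3 model (identifier shown is illustrative).
url, headers, payload = build_nim_request(
    "microsoft/phi-3-mini-4k-instruct",
    "Summarize NIM microservices in one sentence.",
    "NOT-A-REAL-KEY",
)
```

Because the interface is OpenAI-compatible, the same payload works whether the NIM runs on a local RTX workstation or in Azure; only the URL and credentials change.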

Nvidia cuOpt, a GPU-accelerated AI microservice for route optimization, is now available in Azure Marketplace via Nvidia AI Enterprise.

cuOpt features massively parallel algorithms that enable real-time logistics management for shipping services, railway systems, warehouses and factories.

The model has set two dozen world records on major routing benchmarks, delivering the best accuracy and fastest times. It could save billions of dollars for the logistics and supply chain industries by optimizing vehicle routes, reducing travel time and minimizing idle periods.

Through Azure Marketplace, developers can easily integrate the cuOpt microservice with Azure Maps to support real-time logistics management and other cloud-based workflows, backed by enterprise-grade management tools and security.
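As a rough sketch of what such an integration involves, a client typically builds a travel-cost matrix between stops (in production, from a routing service such as Azure Maps) and submits it to the solver alongside fleet and task data. The payload field names below are assumptions for illustration only, not the documented cuOpt schema; the cost matrix here uses straight-line distance as a stand-in for road travel times.

```python
import math

def build_cost_matrix(stops):
    """Pairwise straight-line travel costs between stops (x, y pairs).

    A real deployment would use road distances or travel times,
    e.g. obtained from Azure Maps, rather than Euclidean distance.
    """
    n = len(stops)
    return [
        [math.dist(stops[i], stops[j]) for j in range(n)]
        for i in range(n)
    ]

# Illustrative payload shape (assumed, not the official schema):
stops = [(0.0, 0.0), (0.0, 3.0), (4.0, 0.0)]  # stop 0 is the depot
payload = {
    "cost_matrix_data": {"data": {"0": build_cost_matrix(stops)}},
    "fleet_data": {"vehicle_locations": [[0, 0]]},  # one vehicle, starts/ends at the depot
    "task_data": {"task_locations": [1, 2]},        # deliveries at stops 1 and 2
}
```

The solver's job is to return a visiting order that minimizes total cost under the fleet's constraints; the client's job, as sketched here, is mostly assembling accurate cost data.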

The Nvidia accelerated computing platform is the backbone of modern AI, helping developers build solutions for more than 100 million GeForce RTX-powered Windows PCs and Nvidia RTX-powered workstations worldwide.

Nvidia and Microsoft are delivering new optimizations and integrations to Windows developers to accelerate AI in next-generation PC and workstation applications. These include:

* Faster inference performance for large language models via the Nvidia DirectX driver, the Generative AI ONNX Runtime extension and DirectML. These optimizations, available now in the GeForce Game Ready, Nvidia Studio and Nvidia RTX Enterprise Drivers, deliver up to 3x faster performance on GeForce RTX and Nvidia RTX GPUs.

* Optimized performance on RTX GPUs for AI models like Stable Diffusion and Whisper via WebNN, an API that enables developers to accelerate AI models in web applications using on-device hardware.

* With Windows set to support PyTorch through DirectML, thousands of Hugging Face models will run natively on Windows. Nvidia and Microsoft are collaborating to scale performance across more than 100 million RTX GPUs.
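In practice, PyTorch-on-DirectML is reached today through the `torch-directml` preview package, where a DirectML device is requested explicitly. A hedged sketch of device selection with a CPU fallback is below; the package name and `device()` entry point reflect the current preview and may change as native support lands in Windows.

```python
def pick_device():
    """Prefer a DirectML device when torch-directml is installed; otherwise CPU.

    torch_directml.device() is the entry point exposed by the torch-directml
    preview package (Windows only); availability may vary by release.
    """
    try:
        import torch_directml  # pip install torch-directml
        return torch_directml.device()
    except ImportError:
        return "cpu"

device = pick_device()
# Tensors and models placed on this device execute through DirectML
# when it is available, e.g.:
#   x = torch.ones(2, 2, device=device)
```

Because standard PyTorch code only needs a device handle, this is what lets existing Hugging Face models run unmodified once a DirectML backend is present.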