NVIDIA Grows Collaboration With Microsoft
NVIDIA is showcasing integrated solutions with Microsoft Azure and Windows PCs, simplifying AI model deployment and optimizing route mapping.
May 24, 2024
The latest artificial intelligence (AI) models developed by Microsoft, including the Phi-3 family of small language models, are being optimized to run on NVIDIA graphics processing units (GPUs) and made available as NVIDIA NIM inference microservices. Other microservices developed by NVIDIA, such as the cuOpt route optimization AI, are regularly added to Microsoft Azure Marketplace as part of the NVIDIA AI Enterprise software platform.
In addition to these AI technologies, NVIDIA and Microsoft are delivering a set of optimizations and integrations for developers creating high-performance AI apps for PCs powered by NVIDIA GeForce RTX and NVIDIA RTX GPUs.
Accelerating Microsoft’s Phi-3 Models
Microsoft is expanding its family of Phi-3 open small language models, adding small (7-billion-parameter) and medium (14-billion-parameter) models similar to its Phi-3-mini, which has 3.8 billion parameters. It’s also introducing a 4.2-billion-parameter multimodal model, Phi-3-vision, that supports images and text.
All of these models are GPU-optimized with NVIDIA TensorRT-LLM and available as NVIDIA NIMs, which are accelerated inference microservices with a standard application programming interface (API) that can be deployed anywhere.
APIs for the NIM-powered Phi-3 models are available at ai.nvidia.com and through NVIDIA AI Enterprise on the Azure Marketplace.
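Because NIMs expose a standard HTTP API, invoking a hosted Phi-3 model amounts to posting a JSON chat request. The following sketch uses only the Python standard library; the endpoint URL and model identifier are assumptions based on NVIDIA's hosted catalog and may differ from your deployment, so check ai.nvidia.com for current values.

```python
import json
import os
import urllib.request

# Assumed values for the hosted Phi-3-mini NIM; verify against ai.nvidia.com.
NIM_URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL_ID = "microsoft/phi-3-mini-4k-instruct"

def build_chat_request(prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for a NIM endpoint."""
    body = json.dumps({
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        # The hosted catalog requires an API key; a self-hosted NIM may not.
        "Authorization": f"Bearer {os.environ.get('NVIDIA_API_KEY', '')}",
    }
    return urllib.request.Request(NIM_URL, data=body, headers=headers, method="POST")

req = build_chat_request("Summarize route optimization in one sentence.")
# urllib.request.urlopen(req) would send the request once an API key is set.
```

The same request shape works against a self-hosted NIM container by swapping the URL, which is the point of the standard API: the application code does not change with the deployment target.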
NVIDIA cuOpt Now Available on Azure Marketplace
NVIDIA cuOpt, a GPU-accelerated AI microservice for route optimization, is now available in Azure Marketplace via NVIDIA AI Enterprise. cuOpt features parallel algorithms that enable real-time logistics management for shipping services, railway systems, warehouses and factories.
cuOpt has set two dozen world records on major routing benchmarks.
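The problem class cuOpt accelerates comes down to minimizing total travel cost over possible visit orders, a search space that grows factorially with the number of stops. As a toy illustration of the underlying problem (this is not cuOpt's API), a brute-force round trip over a small, hypothetical cost matrix:

```python
from itertools import permutations

# Symmetric travel costs between a depot (index 0) and three stops.
COSTS = [
    [0, 10, 15, 20],
    [10, 0, 35, 25],
    [15, 35, 0, 30],
    [20, 25, 30, 0],
]

def best_round_trip(costs):
    """Exhaustively find the cheapest tour that starts and ends at the depot."""
    n = len(costs)
    best_route, best_cost = None, float("inf")
    for order in permutations(range(1, n)):
        route = (0, *order, 0)
        cost = sum(costs[a][b] for a, b in zip(route, route[1:]))
        if cost < best_cost:
            best_route, best_cost = route, cost
    return best_route, best_cost

route, cost = best_round_trip(COSTS)  # cheapest tour costs 80 here
```

Brute force is O(n!) and only viable for a handful of stops; cuOpt's contribution is delivering near-optimal answers in real time on industrial-scale fleets and task lists, where exhaustive search is impossible.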
Through Azure Marketplace, developers can easily integrate the cuOpt microservice with Azure Maps to support real-time logistics management and other cloud-based workflows, backed by enterprise-grade management tools and security.
Optimizing AI Performance on PCs With NVIDIA RTX
NVIDIA and Microsoft are delivering new optimizations and integrations to Windows developers to accelerate AI in next-generation PC and workstation applications. These include:
Faster inference performance for large language models via the NVIDIA DirectX driver, the Generative AI ONNX Runtime extension and DirectML. These optimizations, available in the GeForce Game Ready, NVIDIA Studio and NVIDIA RTX Enterprise Drivers, deliver faster performance on NVIDIA RTX and GeForce RTX GPUs.
Optimized performance on RTX GPUs for AI models like Stable Diffusion and Whisper via WebNN, an API that enables developers to accelerate AI models in web applications using on-device hardware.
With Windows set to support PyTorch through DirectML, thousands of Hugging Face models will work in Windows natively. NVIDIA and Microsoft are collaborating to scale performance on more than 100 million RTX GPUs.
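In ONNX Runtime, the DirectML path is selected through an execution provider; "DmlExecutionProvider" is ONNX Runtime's name for it, with the CPU provider as the usual fallback. A minimal sketch of choosing providers in that order (the helper function and its fallback policy are illustrative, not an official API):

```python
# Preference order for ONNX Runtime execution providers on RTX-equipped
# Windows machines: DirectML first, CPU as fallback. The provider names are
# ONNX Runtime's own; the helper itself is a hypothetical convenience.
PREFERRED = ["DmlExecutionProvider", "CPUExecutionProvider"]

def pick_providers(available):
    """Return the preferred providers that are actually available, in order."""
    chosen = [p for p in PREFERRED if p in available]
    return chosen or ["CPUExecutionProvider"]

# With onnxruntime installed, this would typically be used as:
#   import onnxruntime as ort
#   session = ort.InferenceSession(
#       "model.onnx", providers=pick_providers(ort.get_available_providers()))
```

Listing the CPU provider last keeps the session usable on machines without a DirectML-capable GPU while letting RTX hardware take over when present.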
Sources: Press materials received from the company and additional information gleaned from the company’s website.
About the Author
DE Editors
DE’s editors contribute news and new product announcements to Digital Engineering. Press releases may be sent to them via [email protected].