Deploy Models Now with NVIDIA NIM

Optimized inference for the world’s leading models
Free serverless APIs for development, accelerated by DGX Cloud
Self-host on your GPU infrastructure
Continuous vulnerability fixes

Discover

Build with gpt-oss: OpenAI's Latest Open-Weight Reasoning Model

Achieves near-parity with o4-mini on core reasoning benchmarks, while running efficiently on a single 80 GB GPU.

Featured Models

The leading open models built by the community, optimized and accelerated by NVIDIA's enterprise-ready inference runtime.

Customize a Blueprint

Get started with workflows and code samples to build AI applications from the ground up.