Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

ibm/granite-3.3-8b-instruct

Small language model fine-tuned for improved reasoning, coding, and instruction-following

coding, Reasoning, chat, Instruction Following

mistralai/mistral-small-3.1-24b-instruct-2503

Efficient multimodal model excelling at multilingual tasks, image understanding, and fast responses.

language generation, chat, multimodal, image understanding

mistralai/mistral-small-24b-instruct

Latency-optimized language model excelling in code, math, general knowledge, and instruction-following.

code, chat, reasoning, agent-centric, multilingual

meta/llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat, Code Generation, Text-to-Text, Language Generation

meta/llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat, Code Generation, Text-to-Text, Language Generation

nvidia/mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

language generation, text-to-text, chat, small language model

google/gemma-2-2b-it

Advanced small generative AI language model for edge applications.

chat, Code Generation, Text-to-Text, Language Generation

microsoft/phi-3-small-8k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat, Code Generation, Text-to-Text, Language Generation, Large Language Models

microsoft/phi-3-small-128k-instruct

Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.

chat, Code Generation, Text-to-Text, Language Generation, Large Language Models