Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
7 models
Mistral AI
API Endpoint
mistral-small-3.1-24b-instruct-2503
Efficient multimodal model excelling at multilingual tasks, image understanding, and fast responses.
chat
+3
1.8M
9mo
Meta
Downloadable
llama-3.2-3b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
chat
+4
15.84K
690K
9mo
Meta
Downloadable
llama-3.2-1b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
chat
+4
16K
330K
9mo
NVIDIA
API Endpoint
mistral-nemo-minitron-8b-base
State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.
language generation
+3
4.75K
1y
Google
API Endpoint
gemma-2-2b-it
Advanced small generative AI language model for edge applications.
chat
+4
564K
9mo
Microsoft
API Endpoint
phi-3-small-8k-instruct
Cutting-edge lightweight open language model excelling in high-quality reasoning.
chat
+5
541K
9mo
Microsoft
API Endpoint
phi-3-small-128k-instruct
Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.
chat
+5
643K
9mo