Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
9 models
IBM
Free Endpoint
granite-3.3-8b-instruct
Small language model fine-tuned for improved reasoning, coding, and instruction-following
coding
+3
113K
8mo
Mistral AI
Free Endpoint
mistral-small-3.1-24b-instruct-2503
Efficient multimodal model excelling at multilingual tasks, image understanding, and fast responses
chat
+3
2.32M
10mo
Mistral AI
Downloadable
mistral-small-24b-instruct
Latency-optimized language model excelling in code, math, general knowledge, and instruction-following.
chat
+4
577K
9mo
Meta
Downloadable
llama-3.2-3b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
chat
+4
30.73K
977K
10mo
Meta
Downloadable
llama-3.2-1b-instruct
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
chat
+4
16K
324K
10mo
NVIDIA
Free Endpoint
mistral-nemo-minitron-8b-base
State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.
language generation
+3
4.16K
1y
Google
Free Endpoint
gemma-2-2b-it
Advanced small language generative AI model for edge applications
chat
+4
555K
10mo
Microsoft
Free Endpoint
phi-3-small-8k-instruct
Cutting-edge lightweight open language model excelling in high-quality reasoning.
chat
+5
532K
10mo
Microsoft
Free Endpoint
phi-3-small-128k-instruct
Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.
chat
+5
571K
10mo