Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
60 models
Mistral AI
API Endpoint
mistral-large-3-675b-instruct-2512
A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.
chat
+4
6.69M
3mo
Mistral AI
Downloadable
ministral-14b-instruct-2512
A general-purpose VLM ideal for chat and instruction-based use cases.
chat
+4
4.67M
3mo
NVIDIA
Downloadable
nemotron-nano-12b-v2-vl
Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.
chat
+4
1.4M
4mo
Speakleash
API Endpoint
bielik-11b-v2.6-instruct
State-of-the-art model for Polish language processing tasks such as text generation, Q&A, and chatbots.
chat
+4
582K
5mo
Google
API Endpoint
gemma-3n-e4b-it
An edge-computing AI model that accepts text, audio, and image input, ideal for resource-constrained environments.
chat
+4
746K
8mo
Google
API Endpoint
gemma-3n-e2b-it
An edge-computing AI model that accepts text, audio, and image input, ideal for resource-constrained environments.
chat
+4
696K
8mo
Mistral AI
API Endpoint
mistral-nemotron
Built for agentic workflows, this model excels in coding, instruction following, and function calling
chat
+3
790K
9mo
Utter-project
Downloadable
eurollm-9b-instruct
State-of-the-art, multilingual model tailored to all 24 official European Union languages.
chat
+6
4.72K
530K
8mo
Gotocompany
Downloadable
gemma-2-9b-cpt-sahabatai-instruct
SOTA LLM pre-trained for instruction following and proficiency in the Indonesian language and its dialects.
chat
+5
531K
8mo
Mistral AI
API Endpoint
mistral-small-3.1-24b-instruct-2503
Efficient multimodal model excelling at multilingual tasks, image understanding, and fast responses.
chat
+3
1.8M
9mo
Mistral AI
API Endpoint
mistral-medium-3-instruct
Powerful, multimodal language model designed for enterprise applications, including software development, data analysis, and reasoning.
chat
+4
5.28M
8mo
Meta
API Endpoint
llama-4-maverick-17b-128e-instruct
A general-purpose multimodal, multilingual MoE model with 128 experts and 17B active parameters.
chat
+4
3.25M
7mo
Meta
Downloadable
API Endpoint
llama-4-scout-17b-16e-instruct
A multimodal, multilingual MoE model with 16 experts and 17B active parameters.
chat
+4
156K
8mo
Google
API Endpoint
gemma-3-27b-it
Cutting-edge open multimodal model excelling in high-quality reasoning from images.
chat
+4
5.79M
9mo
Google
Downloadable
gemma-3-1b-it
A lightweight, multilingual, advanced SLM text model for edge computing and resource-constrained applications.
chat
+4
4.34K
554K
9mo
Microsoft
Downloadable
phi-4-mini-instruct
Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments.
chat
+4
2.79M
9mo
Microsoft
API Endpoint
phi-4-multimodal-instruct
Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.
Speech Recognition
+5
532K
9mo
Tiiuae
API Endpoint
falcon3-7b-instruct
Instruction-tuned LLM achieving SoTA performance on reasoning, math, and general knowledge.
chat
+6
1.83M
9mo
Qwen
Downloadable
qwen2.5-7b-instruct
Chinese and English LLM targeting language, coding, mathematics, and reasoning tasks.
Chinese Language Generation
+4
1.77M
9mo
Qwen
Downloadable
qwen2.5-coder-32b-instruct
Advanced LLM for code generation, reasoning, and fixing across popular programming languages.
chat
+3
5.82M
8mo
Qwen
API Endpoint
qwen2.5-coder-7b-instruct
Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.
chat
+3
588K
9mo
NVIDIA
API Endpoint
nemotron-4-mini-hindi-4b-instruct
A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.
Indic
+4
536K
9mo
Institute of Science Tokyo
Downloadable
llama-3.1-swallow-70b-instruct-v0.1
Sovereign AI model trained on the Japanese language that understands regional nuances.
chat
+4
526K
9mo
Institute of Science Tokyo
Downloadable
llama-3.1-swallow-8b-instruct-v0.1
Sovereign AI model trained on the Japanese language that understands regional nuances.
chat
+4
536K
9mo