Explore
Models
Blueprints
GPUs
Docs
⌘K
Ctrl+K
?
Login
Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
Launch from Hugging Face
Beta
Filters
15 models
Sort By
dateCreated:DESC
Most Recent
Z.ai
Free Endpoint
glm-4.7
GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.
Tool Calling
+3
Items per page
24
1
1
of 1 pages
5.5M
1w
NVIDIA
Free Endpoint
nemotron-3-content-safety
Multilingual, multimodal model for detecting unsafe and toxic content.
llm safety
+3
29.12K
1w
NVIDIA
Downloadable
llama-nemotron-embed-1b-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Text-to-Embedding
+2
2.01M
1mo
NVIDIA
Free Endpoint
llama-3.1-nemotron-safety-guard-8b-v3
Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs
content moderation
+4
115K
6mo
NVIDIA
Downloadable
llama-3_2-nemoretriever-300m-embed-v2
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Text-to-Embedding
+2
86
6mo
Sarvamai
Downloadable
sarvam-m
Multilingual, hybrid-reasoning model optimized for Indian language tasks, programming, mathematical reasoning capabilities.
coding
+5
144K
9mo
NVIDIA
Free Endpoint
llama-3_2-nemoretriever-300m-embed-v1
Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
Text-to-Embedding
+2
313K
9mo
Mistral AI
Deprecation in 12d
Free Endpoint
magistral-small-2506
High performance reasoning model optimized for efficiency and edge deployment
coding
+3
1.17M
9mo
NVIDIA
Downloadable
parakeet-1.1b-rnnt-multilingual-asr
High accuracy and optimized performance for transcription in 25 languages
Automatic Speech Recognition
+3
12.1K
1y
Meta
Free Endpoint
llama-4-maverick-17b-128e-instruct
A general purpose multimodal, multilingual 128 MoE model with 17B parameters.
language generation
+3
12.54M
9mo
NVIDIA
Downloadable
magpie-tts-multilingual
Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.
TTS
+4
44.73K
10mo
Microsoft
Downloadable
phi-4-mini-instruct
Lightweight multilingual LLM powering AI applications in latency bound, memory/compute constrained environments
Chat
+3
532K
11mo
OpenAI
Downloadable
whisper-large-v3
Robust Speech Recognition via Large-Scale Weak Supervision.
ASR
+8
44.5K
1y
NVIDIA
Downloadable
llama-3.2-nv-embedqa-1b-v2
Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
nemo retriever
+3
2.4M
9mo
NVIDIA
Downloadable
llama-3.2-nv-rerankqa-1b-v2
Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
nemo retriever
+2
96.93K
9mo