Models
Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices
Optimized by NVIDIA
Launch from Hugging Face
Beta
24 models
NVIDIA
gliner-pii
GLiNER PII detects Personally Identifiable Information in text.
PII Detection
+1
108
1d
NVIDIA
riva-translate-4b-instruct-v1_1
Translation model for 12 languages with support for few-shot example prompts.
nvidia nim
+2
407K
2mo
NVIDIA
parakeet-ctc-0.6b-zh-tw
Record-setting accuracy and performance for Taiwanese Mandarin-English transcriptions.
ASR
+4
284
4mo
NVIDIA
parakeet-ctc-0.6b-zh-cn
Record-setting accuracy and performance for Mandarin-English transcriptions.
ASR
+4
5.82K
5mo
NVIDIA
parakeet-ctc-0.6b-es
Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.
ASR
+4
5mo
NVIDIA
parakeet-ctc-0.6b-vi
Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.
ASR
+4
116
5mo
NVIDIA
parakeet-tdt-0.6b-v2
Accurate and optimized English transcriptions with punctuation and word timestamps.
ASR
+4
2.37K
7mo
NVIDIA
magpie-tts-flow
Expressive and engaging text-to-speech, generated from a short audio sample.
TTS
+3
784
7mo
NVIDIA
riva-translate-1.6b
Enable smooth global interactions in 36 languages.
Text Translation
+2
632K
8mo
NVIDIA
magpie-tts-zeroshot
Expressive and engaging text-to-speech, generated from a short audio sample.
TTS
+3
1.1K
8mo
NVIDIA
parakeet-1.1b-rnnt-multilingual-asr
High accuracy and optimized performance for transcription in 25 languages.
Automatic Speech Recognition
+3
40.98K
10mo
NVIDIA
magpie-tts-multilingual
Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.
TTS
+4
27.53K
8mo
OpenAI
whisper-large-v3
Robust Speech Recognition via Large-Scale Weak Supervision.
ASR
+8
43.49K
10mo
NVIDIA
canary-1b-asr
Multilingual model supporting speech-to-text recognition and translation.
Automatic Speech Recognition
+3
969
10mo
NVIDIA
audio2face-3d
Converts streamed audio to facial blendshapes for real-time lip-syncing and facial performances.
Speech-to-Animation
+3
8mo
NVIDIA
nv-dinov2
NV-DINOv2 is a visual foundation model that generates vector embeddings for the input image.
Image-to-Embedding
+4
951K
11mo
NVIDIA
nv-grounding-dino
Grounding DINO is an open-vocabulary, zero-shot object detection model.
Object Detection
+3
3.15K
11mo
NVIDIA
megatron-1b-nmt
Enable smooth global interactions in 36 languages.
Text Translation
+2
10mo
NVIDIA
parakeet-ctc-1.1b-asr
Record-setting accuracy and performance for English transcription.
ASR
+5
21.1K
8mo
NVIDIA
parakeet-ctc-0.6b-asr
State-of-the-art accuracy and speed for English transcriptions.
ASR
+7
7.39K
8mo
NVIDIA
maisi
MAISI is a pre-trained volumetric (3D) CT Latent Diffusion Generative Model.
Image Generation
+2
631
11mo
NVIDIA
visual-changenet
Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask.
image
+8
592
1y
NVIDIA
retail-object-detection
EfficientDet-based object detection network to detect 100 specific retail objects from an input video.
Object Detection
+7
778
1y
Mistral AI
mistral-7b-instruct-v0.2
This LLM follows instructions, completes requests, and generates creative text.
chat
+3
467K
9mo