NVIDIA
Copyright © 2025 NVIDIA Corporation


Deploy Models Now with NVIDIA NIM

Optimized inference for the world’s leading models
Free serverless APIs for development, accelerated by DGX Cloud
Self-Host on your GPU infrastructure
Continuous vulnerability fixes

Speech

Automatic Speech Recognition (ASR)

Connect generative AI models to speech by transcribing spoken audio to text.

nvidia / parakeet-1.1b-rnnt-multilingual-asr (Run Anywhere)

High accuracy and optimized performance for transcription in 25 languages.

Tags: asr, multilingual, nvidia nim, streaming, speech-to-text

nvidia / parakeet-ctc-1.1b-asr (Run Anywhere)

Record-setting accuracy and performance for English transcription.

Tags: asr, english, nvidia nim, streaming, batch, speech-to-text

nvidia / parakeet-ctc-0.6b-asr (Run Anywhere)

State-of-the-art accuracy and speed for English transcriptions.

Tags: asr, batch, english, fast, nvidia nim, run on rtx, streaming, speech-to-text

nvidia / canary-1b-asr (Run Anywhere)

Multilingual model supporting speech-to-text recognition and translation.

Tags: asr, ast, multilingual, nvidia nim, nvidia riva, spanish, batch, streaming, speech-to-text

openai / whisper-large-v3 (Run Anywhere)

Robust Speech Recognition via Large-Scale Weak Supervision.

Tags: asr, ast, multilingual, nvidia nim, nvidia riva, openai, batch, speech-to-text, whisper

nvidia / canary-0.6b-turbo-asr (Run Anywhere)

Multilingual model supporting speech-to-text recognition and translation.

Tags: asr, ast, multilingual, nvidia nim, nvidia riva, batch, fast, speech-to-text

nvidia / conformer-ctc-asr (Run Anywhere)

Automatic speech recognition model that transcribes speech in lowercase English with record-setting accuracy and performance.

Tags: asr, nvidia nim, nvidia riva, spanish, streaming, speech-to-text

Convert Text to Speech (TTS)

Give generative AI models a voice by converting written text to spoken audio.

nvidia / magpie-tts-zeroshot (Preview)

Expressive and engaging text-to-speech, generated from a short audio sample.

Tags: nvidia nim, nvidia riva, tts, text-to-speech

nvidia / magpie-tts-multilingual (Run Anywhere)

Natural and expressive voices in multiple languages, for voice agents and brand ambassadors.

Tags: nvidia nim, nvidia riva, tts, multilingual, text-to-speech

nvidia / fastpitch-hifigan-tts (Run Anywhere)

Expressive and engaging English voices for Q&A assistants, brand ambassadors, and service robots.

Tags: nvidia nim, text-to-speech

Neural Machine Translation (NMT) & Audio Speech Translation (AST)

Create multilingual generative AI models by translating speech and text between languages.

nvidia / megatron-1b-nmt (Run Anywhere)

Enable smooth global interactions in 36 languages.

Tags: nvidia nim, neural machine translation, text translation

nvidia / canary-1b-asr (Run Anywhere)

Multilingual model supporting speech-to-text recognition and translation.

Tags: asr, ast, multilingual, nvidia nim, nvidia riva, spanish, batch, streaming, speech-to-text

nvidia / canary-0.6b-turbo-asr (Run Anywhere)

Multilingual model supporting speech-to-text recognition and translation.

Tags: asr, ast, multilingual, nvidia nim, nvidia riva, batch, fast, speech-to-text

openai / whisper-large-v3 (Run Anywhere)

Robust Speech Recognition via Large-Scale Weak Supervision.

Tags: asr, ast, multilingual, nvidia nim, nvidia riva, openai, batch, speech-to-text, whisper

Speech Enhancement

AI models that enhance speech by correcting common voice degradations.

nvidia / studiovoice (Run Anywhere)

Enhance speech by correcting common audio degradations to create studio-quality speech output.

Tags: digital human, nvidia maxine, run on rtx, speech enhancement, speech-to-speech