Copyright © 2025 NVIDIA Corporation


Speech

Automatic Speech Recognition (ASR)

Connect generative AI models to speech by transcribing spoken audio to text.

Run Anywhere

nvidia/parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages.

asr, multilingual, nvidia nim, streaming, speech-to-text

Run Anywhere

nvidia/parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

asr, english, nvidia nim, streaming, batch, speech-to-text

Run Anywhere

nvidia/parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

asr, batch, english, fast, nvidia nim, run-on-rtx, streaming, speech-to-text

Run Anywhere

nvidia/canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr, ast, multilingual, nvidia nim, nvidia riva, spanish, batch, streaming, speech-to-text

Run Anywhere

openai/whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

asr, ast, multilingual, nvidia nim, nvidia riva, openai, batch, speech-to-text, whisper

Run Anywhere

nvidia/canary-0.6b-turbo-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr, ast, multilingual, nvidia nim, nvidia riva, batch, fast, speech-to-text

Run Anywhere

nvidia/conformer-ctc-asr

Automatic speech recognition model that transcribes speech in lowercase English with record-setting accuracy and performance.

asr, nvidia nim, nvidia riva, spanish, streaming, speech-to-text

Convert Text to Speech (TTS)

Give generative AI models a voice by converting written text to spoken audio.

PREVIEW

nvidia/magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

nvidia nim, nvidia riva, tts, text-to-speech

Run Anywhere

nvidia/magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

nvidia nim, nvidia riva, tts, multilingual, text-to-speech

Run Anywhere

nvidia/fastpitch-hifigan-tts

Expressive and engaging English voices for Q&A assistants, brand ambassadors, and service robots.

nvidia nim, text-to-speech

Neural Machine Translation (NMT) & Audio Speech Translation (AST)

Create multilingual generative AI models by translating speech and text between languages.

Run Anywhere

nvidia/megatron-1b-nmt

Enable smooth global interactions in 36 languages.

nvidia nim, neural machine translation, text translation

Run Anywhere

nvidia/canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr, ast, multilingual, nvidia nim, nvidia riva, spanish, batch, streaming, speech-to-text

Run Anywhere

nvidia/canary-0.6b-turbo-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr, ast, multilingual, nvidia nim, nvidia riva, batch, fast, speech-to-text

Run Anywhere

openai/whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

asr, ast, multilingual, nvidia nim, nvidia riva, openai, batch, speech-to-text, whisper

Speech Enhancement

AI models that enhance speech by correcting common voice degradations.

Run Anywhere

nvidia/studiovoice

Enhance speech by correcting common audio degradations to create studio quality speech output.

digital human, nvidia maxine, run-on-rtx, speech enhancement, speech-to-speech