Speech

Copyright © 2025 NVIDIA Corporation

Speech Enhancement

AI models that enhance speech by correcting common voice degradations.

Run Anywhere

nvidia / studiovoice

Enhance speech by correcting common audio degradations to create studio-quality speech output.

Digital Human, NVIDIA Maxine, Run-on-RTX, Speech Enhancement, Speech-to-speech

Run Anywhere

nvidia / Background Noise Removal

Removes unwanted noise from audio, improving speech intelligibility.

Digital Human, NVIDIA Maxine, Speech Enhancement, Speech-to-speech

Convert Text to Speech (TTS)

Give generative AI models a voice by converting written text to spoken audio.

Run Anywhere

nvidia / magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

NVIDIA NIM, NVIDIA Riva, TTS, multilingual, Text-to-Speech

nvidia / magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

NVIDIA NIM, NVIDIA Riva, TTS, Text-to-Speech

nvidia / magpie-tts-flow

Expressive and engaging text-to-speech, generated from a short audio sample.

NVIDIA NIM, NVIDIA Riva, TTS, Text-to-Speech

Automatic Speech Recognition (ASR)

Connect generative AI models to speech by transcribing spoken audio to text.

Run Anywhere

nvidia / parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Mandarin Taiwanese English transcriptions.

ASR, NVIDIA NIM, Streaming, Taiwanese, Speech-to-Text

Run Anywhere

nvidia / parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps.

ASR, English, NVIDIA NIM, NVIDIA Riva, speech-to-text

Run Anywhere

nvidia / parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

ASR, English, NVIDIA NIM, Streaming, batch, Speech-to-Text

Run Anywhere

nvidia / parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

ASR, Batch, English, Fast, NVIDIA NIM, Run-on-RTX, Streaming, Speech-to-Text

Run Anywhere

nvidia / canary-1b-asr

Multilingual model supporting speech-to-text recognition and translation.

Automatic Speech Recognition, Automatic Speech Translation, NVIDIA NIM, NVIDIA Riva

Run Anywhere

nvidia / parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages.

Automatic Speech Recognition, NVIDIA NIM, NVIDIA Riva, Speech-to-Text

Run Anywhere

openai / whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

ASR, AST, Multilingual, NVIDIA NIM, NVIDIA Riva, OpenAI, batch, Speech-to-Text, whisper

Run Anywhere

nvidia / parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

ASR, NVIDIA NIM, Streaming, Vietnamese, Speech-to-Text

Run Anywhere

nvidia / parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

ASR, Mandarin, NVIDIA NIM, Streaming, Speech-to-Text

Run Anywhere

nvidia / parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

ASR, NVIDIA NIM, Spanish, Streaming, Speech-to-Text

Neural Machine Translation (NMT) & Automatic Speech Translation (AST)

Create multilingual generative AI models by translating speech and text between languages.

nvidia / riva-translate-4b-instruct-v1_1

Translation model for 12 languages with support for few-shot example prompts.

neural machine translation, nvidia nim, Text Translation

Run Anywhere

nvidia / riva-translate-1.6b

Enable smooth global interactions in 36 languages.

NVIDIA NIM, Neural machine translation, Text Translation

Run Anywhere

nvidia / canary-1b-asr

Multilingual model supporting speech-to-text recognition and translation.

Automatic Speech Recognition, Automatic Speech Translation, NVIDIA NIM, NVIDIA Riva

Run Anywhere

openai / whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

ASR, AST, Multilingual, NVIDIA NIM, NVIDIA Riva, OpenAI, batch, Speech-to-Text, whisper