NVIDIA

Speech

Copyright © 2026 NVIDIA Corporation

Speech Enhancement

Speech-enhancing AI models for common voice degradations.

Download Available

nvidia / studiovoice

Enhance speech by correcting common audio degradations to create studio-quality speech output.

Digital Human, Nvidia Maxine, Run-on-RTX, Speech Enhancement, Speech-to-speech

Download Available

nvidia / Background Noise Removal

Removes unwanted noise from audio, improving speech intelligibility.

Digital Human, Nvidia Maxine, Speech Enhancement, Speech-to-speech

Automatic Speech Recognition (ASR)

Low-latency NVIDIA Nemotron Speech transcription models for your agentic AI workflows.

Download Available

nvidia / parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Taiwanese Mandarin-English transcriptions.

ASR, NVIDIA NIM, Streaming, Taiwanese, Speech-to-Text

Download Available

nvidia / parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps.

ASR, English, NVIDIA NIM, NVIDIA Riva, speech-to-text

Download Available

nvidia / parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

ASR, English, NVIDIA NIM, Streaming, batch, Speech-to-Text

Download Available

nvidia / parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

ASR, Batch, English, Fast, NVIDIA NIM, Run-on-RTX, Streaming, Speech-to-Text

Download Available

nvidia / canary-1b-asr

Multilingual model supporting speech-to-text recognition and translation.

Automatic Speech Recognition, Automatic Speech Translation, NVIDIA NIM, NVIDIA Riva

Download Available

nvidia / parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages.

Automatic Speech Recognition, NVIDIA NIM, NVIDIA Riva, Speech-to-Text

Download Available

openai / whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

ASR, AST, Multilingual, NVIDIA NIM, NVIDIA Riva, OpenAI, batch, Speech-to-Text, whisper

Download Available

nvidia / parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

ASR, NVIDIA NIM, Streaming, Vietnamese, Speech-to-Text

Download Available

nvidia / parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

ASR, Mandarin, NVIDIA NIM, Streaming, Speech-to-Text

Download Available

nvidia / parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

ASR, NVIDIA NIM, Spanish, Streaming, Speech-to-Text

Neural Machine Translation (NMT) & Audio Speech Translation (AST)

Enable seamless multilingual global communication across dozens of languages with NVIDIA Nemotron Speech models.

API Endpoint

nvidia / riva-translate-4b-instruct-v1_1

Translation model for 12 languages with support for few-shot example prompts.

neural machine translation, nvidia nim, Text Translation

Download Available

nvidia / riva-translate-1.6b

Enable smooth global interactions in 36 languages.

NVIDIA NIM, Neural machine translation, Text Translation

Download Available

nvidia / canary-1b-asr

Multilingual model supporting speech-to-text recognition and translation.

Automatic Speech Recognition, Automatic Speech Translation, NVIDIA NIM, NVIDIA Riva

Download Available

openai / whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

ASR, AST, Multilingual, NVIDIA NIM, NVIDIA Riva, OpenAI, batch, Speech-to-Text, whisper

Convert Text to Speech (TTS)

Convert written text to spoken audio in multiple languages with NVIDIA Nemotron Speech models.

Download Available

nvidia / magpie-tts-multilingual

Natural and expressive voices in multiple languages, for voice agents and brand ambassadors.

NVIDIA NIM, NVIDIA Riva, TTS, multilingual, Text-to-Speech

API Endpoint

nvidia / magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

NVIDIA NIM, NVIDIA Riva, TTS, Text-to-Speech

API Endpoint

nvidia / magpie-tts-flow

Expressive and engaging text-to-speech, generated from a short audio sample.

NVIDIA NIM, NVIDIA Riva, TTS, Text-to-Speech