Try NVIDIA NIM APIs

8 results for

NVIDIA

Downloadable

nemotron-asr-streaming

Real-time speech recognition for English

Model

Automatic Speech Recognition

3mo

Google

Free Endpoint

gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

Model

language generation

34M

11mo

NVIDIA

Downloadable

parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

Model

Automatic Speech Recognition

19K

NVIDIA

Downloadable

canary-1b-asr

Multilingual model supporting speech-to-text recognition and translation.

Model

Automatic Speech Recognition

52K

Google

Free Endpoint

gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

Model

language generation

11mo

Microsoft

Deprecation in 7d

Free Endpoint

phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

Model

Speech Recognition

173K

OpenAI

Downloadable

whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

Model

ASR

161K

NVIDIA

Downloadable

conformer-ctc-asr

Automatic speech recognition model that transcribes Spanish speech into lowercase text with record-setting accuracy and performance

Model

ASR