Try NVIDIA NIM APIs

Stage 3 of the Clinical ASR Flywheel. Score a NeMo manifest and produce the five-section KER leaderboard (by-ipa_source diagnostic). NOT for ASR auth (use /riva-asr).

Skill

Developer

902

1mo

Stage 2 of the Clinical ASR Flywheel. Use when curating clinical terms, tagging IPA, and synthesizing a NeMo manifest. NOT for scoring (use /digital-health-clinical-asr-eval).

Skill

Developer

901

1mo

Stage 1 of the Clinical ASR Flywheel. Use when bootstrapping a cycle: NVCF+MW disclosure, NVIDIA_API_KEY check, deps install, TTS+ASR smoke test.

Skill

Developer

905

1mo

Stage 4 of the Clinical ASR Flywheel. Use when priority KER is above 0.3 to run stock NeMo SFT on Parakeet TDT v2 and offline cycle N+1 re-eval. NOT for generic word boosting (use /finetune-asr).

Skill

Developer

903

1mo

NVIDIA

Downloadable

parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

Model

Automatic Speech Recognition

19K

General

Developer Example

Nemotron Voice Agent

Build Real-Time Voice Agents with NVIDIA Nemotron NIM.

Blueprint

Voice Agent

4mo

NVIDIA

Downloadable

parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

Model

ASR

9mo

NVIDIA

Downloadable

parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

Model

ASR

123

9mo

NVIDIA

Downloadable

parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

Model

ASR

13K

9mo

NVIDIA

Downloadable

parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Mandarin Taiwanese English transcriptions.

Model

ASR

8mo

Use this skill to ask the VSS agent's video_understanding tool a fresh visual question about a recorded clip. Not for prior tool output, search hits, or metadata-answerable questions.

Skill

Developer

906

18d

NVIDIA

Downloadable

parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps

Model

ASR

146K

11mo

OpenAI

Downloadable

whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

Model

ASR

195K

Healthcare & Life Sciences

Launchable

Developer Example

Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

Blueprint

NVIDIA AI

4mo

Routes NVIDIA Nemotron Speech (Riva) NIM tasks — deploys, runs, and tests ASR, TTS, and NMT NIMs on build.nvidia.com or self-hosted.

Skill

Developer

899

26d

Clone the latest NVIDIA Holoscan Sensor Bridge repo, ask which supported devkit is being used, configure the host per platform, build the correct demo container, run it, and verify HSB connectivity by pinging 192.168.0.2. Use for Holoscan Sensor Bridge se

Skill

Developer

789

18d

llama-guard-4-12b

Multimodal model that classifies the safety of input prompts as well as output responses.

Model

LLM Multimodal Safety

222K

NVIDIA

Downloadable

llama-nemotron-embed-vl-1b-v2

Multimodal question-answer retrieval representing user queries as text and documents as images.

Model

nemo retriever

4mo

Validate and use MoE expert-parallel communication overlap in Megatron-Bridge, including overlap_moe_expert_parallel_comm, delay_wgrad_compute, and flex dispatcher backends such as DeepEP and HybridEP.

Skill

Developer

900

1mo