
Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

Accurate and optimized English transcriptions with punctuation and word timestamps

Sensor-captured radio enables real-time awareness, AI-driven analytics for actionable, searchable insights.

Expressive and engaging text-to-speech, generated from a short audio sample.

Translation model in 12 languages with few-shots example prompts capability.

Enable smooth global interactions in 36 languages.

Expressive and engaging text-to-speech, generated from a short audio sample.

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

Robust Speech Recognition via Large-Scale Weak Supervision.

Multi-lingual model supporting speech-to-text recognition and translation.