NVIDIA
Explore
Models
Blueprints
GPUs
Docs
Terms of Use
Privacy Policy
Your Privacy Choices
Contact

Copyright © 2025 NVIDIA Corporation

Ambient Healthcare Agents

Healthcare, the world’s largest customer service sector, faces urgent pressure to digitize patient interactions. Ambient voice AI is key, however, the next leap will come from Generative AI reasoning models—these will enable voice agents to provide intelligent, context-aware responses, automate documentation, and deliver highly personalized care, fundamentally transforming workflows and scaling efficient, accurate healthcare delivery. This developer example provides developers with the ingredients to build and scale such agents. This developer example has two use cases - Ambient Provider Voice Agent and Ambient Patient Voice Agent. The Ambient Provider Voice Agent does more than transcribe patient-provider conversations. It understands context, infers intent, and generates nuanced, structured clinical documentation in the SOAP note format (Subjective, Objective, Assessment, Plan) —autonomously, reducing manual input and supporting better clinical decisions. The Ambient Patient Voice Agent manages high-volume patient touchpoints (e.g., clinic intake, surveys, appointment scheduling, information queries) without clinician involvement. Its ability to reason dynamically allows for more personalized, empathetic patient interactions and real-time problem-solving within complex healthcare contexts.

Architecture Diagram

Key Features

Ambient Provider Agent

  • Advanced Transcription

    • Riva transcription with speaker diarization and medical terminology lexicon boosting
    • Parakeet ASR for real-time diarized transcription
    • Support for both live conversations and retrospective analysis
  • Fast Medical Reasoning

    • Llama Nemotron reasoning capabilities deliver highest accuracy and lowest latency
    • Automated analysis of transcripts for clinical documentation
    • Autonomous SOAP note generation

Ambient Patient Agent

  • Comprehensive Speech Pipeline

    • Riva speech-to-text and text-to-speech capabilities
      • Parakeet 1.1b ASR Model for low-latency accurate transcription
      • Magpie Multilingual TTS Model for natural voice audio responses
    • NVIDIA ace-controller for speech pipeline orchestration
  • Intelligent Guardrails

    • NeMo Guardrails for safe and topically appropriate interactions
    • Context-aware response generation
    • Highly customizable configuration with options for guardrail-specific NIMs

Software Used in the Blueprint

NVIDIA NIMs and Toolkits

  • llama-3.3-nemotron-super-49b-instruct
  • llama-3_3-70b-instruct
  • llama-3_1-nemoguard-8b-content-safety
  • llama-3_1-nemoguard-8b-topic-control
  • parakeet-ctc-1_1b-asr
  • magpie-tts-multilingual

Other Software

  • LangChain

References

Minimum System Requirements

Note: Users may have to wait 5–10 minutes for the instance to start, depending on cloud availability.

Operating System & System Software

  • Ubuntu 22.04
  • Docker Version 28.1+ with Docker Compose plugin

Storage

  • Ambient Patient Agent: 302 GB (for self-hosted configuration)
  • Ambient Provider Agent: 325 GB (for self-hosted configuration)

Hardware Requirements

The ambient healthcare agents developer example supports the following hardware and system configurations:

Ambient Provider Agent

Self-Hosted Configuration:

ServiceUse CaseRecommended GPU
Riva ASR NIMAudio Transcription and Diarization1x various options including
L40, A100, and more (see modelcard)
Reasoning ModelMedical Note (SOAP) Generation2x H100 80 GB
or 4x A100 80 GB

NVIDIA API Catalog Configuration:
No GPU requirement when using public NVIDIA endpoints for NIM microservices (build.nvidia.com)

Ambient Patient Agent

Self-Hosted Configuration:

ServiceUse CaseRecommended GPU
Riva ASR NIMAudio Transcription1x various options including L40,
A100, and more (see modelcard)
Riva TTS NIMSpeech Synthesis1x various options including L40,
A100, and more (see modelcard)
NemoGuard Content Safety Model
(Optional for enabling NeMo Guardrails)
Content Safety1x options including A100, H100,
L40S, A6000 (see modelcard)
NemoGuard Topic Control Model
(Optional for enabling NeMo Guardrails)
Topic Control1x options including A100, H100,
L40S, A6000 (see modelcard)
Instruct ModelAgent Reasoning
and Tool Calling
2x H100 80 GB
or 4x A100 80GB (see modelcard)

NVIDIA API Catalog Configuration:
No GPU requirement when using public NVIDIA endpoints for NIM microservices (build.nvidia.com)

Security Considerations

Please be aware of the following security considerations when using this repository:

  • This repository and its contents is shared as a reference and is provided "as is". The security in the production environment is the responsibility of the end users deploying it. When deploying in a production environment, please have security experts review any potential risks and threats; define the trust boundaries, implement logging and monitoring capabilities, secure the communication channels, integrate AuthN & AuthZ with appropriate access controls, keep the deployment up to date, ensure the containers/source code are secure and free of known vulnerabilities.
  • A frontend that handles AuthN & AuthZ should be in place as missing AuthN & AuthZ could provide ungated access to customer models if directly exposed to e.g. the internet, resulting in either cost to the customer, resource exhaustion, or denial of service.
  • The repository doesn't require any privileged access to the system.
  • The end users are responsible for ensuring the availability of their deployment.
  • The end users are responsible for building the container images and keeping them up to date.
  • The end users are responsible for ensuring that OSS packages used by the developer example are current.
  • The logs from the agent backend and UI frontend containers are printed to standard out. They can include input prompts and output completions for development purposes. The end users are advised to handle logging securely and avoid information leakage for production use cases.
  • The agent backend and UI frontend containers may interact with local files for development purposes. The end users are advised to customize all file saving and uploading logic securely and avoid information leakage for production use cases.
  • Credential Management: Never hard-code sensitive credentials (API keys, passwords, etc.) in code or configuration files. Use environment variables or a secrets manager.
  • NGC API Key: The developer examples require and process an NVIDIA NGC API key. Treat your key as sensitive; do not expose it publicly or commit it to source control.
  • Network Exposure: By default, several services run locally and may expose network ports. Restrict access with firewalls or Docker network settings as appropriate for your environment.
  • User Data Protection: Uploaded audio files and generated medical notes may contain personally identifiable information (PII) or protected health information (PHI). Ensure secure storage and proper data handling in accordance with applicable regulations (e.g., HIPAA, GDPR).
  • Dependencies: Review all dependencies and container images for known vulnerabilities. Keep your dependencies up to date.
  • Container Security: Do not run containers with unnecessary privileges. Use the least privilege principle.
  • Vulnerability Reporting: If you discover any security issues or vulnerabilities in this repository, please follow the reporting instructions in the SECURITY.md file.

For more information, see the SECURITY.md file in the developer example repository.

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.

License and Governing Terms

The API trial service is governed by the NVIDIA API Trial Terms of Service. The developer example software is governed by the Apache 2.0 License. Use of the NIM containers is governed by the NVIDIA Software License Agreement and Product Specific Terms for NVIDIA AI Products. Use of the ASR Parakeet 1.1b CTC en-US, ASR Parakeet CTC Riva 1.1b, Magpie TTS Multilingual, and Llama-3.3-70b-Instruct models is governed by the NVIDIA Community Model License Agreement. Use of the Llama-3.1-Nemoguard-8b-Topic-Control, Llama-3.1-Nemoguard-8b-Content-Safety and Llama-3.3-Nemotron-Super-49B-v1 models is governed by the NVIDIA Open Model License Agreement. Use of the Ace-Controller software is governed by the BSD 2-Clause License. ADDITIONAL INFORMATION: For Llama-3.1-Nemoguard-8b-Topic-Control and Llama-3.1-Nemoguard-8b-Content-Safety, Llama 3.1 Community License Agreement. For Llama-3.3-Nemotron-Super-49b-v1 and Llama-3.3-70b-Instruct, Llama 3.3 Community License Agreement. Built with Llama.

nvidia

Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

llama-3_3-nemotron-super-49b-v1•llama-3_3-70b-instruct•llama-3_1-nemoguard-8b-content-safety•llama-3_1-nemoguard-8b-topic-control•parakeet-ctc-1_1b-asr•magpie-tts-multilingual
agent blueprintLaunchableblueprintNVIDIA AIllmnemonim
View GitHubDeploy on Cloud