NVIDIA
Explore Models Blueprints GPUs Docs
Terms of Use

|

Privacy Policy

|

Manage My Privacy

|

Contact

Copyright © 2025 NVIDIA Corporation

Biomedical AI-Q Research Agent

The ability to connect real-world data to reasoning models unlocks new opportunities to improve the efficiency and accuracy of various clinical development processes, including R&D, literature review, protocol generation, clinical trial screening, and pharmacovigilance. In addition, the ability to create tools for LLMs to use gives agents the ability to interact with the domain-specific services that researchers use, such as domain specific models or APIs. Reasoning LLMs enable state-of-the-art planning, validation, and insight generation capabilities during the research process, while the agentic framework AgentIQ accelerates the production-ready agent workflows, and their unique data sources can be easily composed together to create specialized, production-ready agentic workflows. The underlying NIMs run securely and privately on any compute, thus protecting the privacy of a clinical development organization’s valuable real-world data. The combination of these capabilities in Biomedical AI-Q Research Agent increases the number and quality of insights generated from specialized and often proprietary biomedical data sources, improves the traceability and interpretability of experiments and research, decreases human hours spent generating and validating reports for regulatory and downstream knowledge building, and improves the ability of leaders and researchers to make informed decisions based on the knowledge generated.

Architecture Diagram

Architecture Diagram

Key Features

Adapt Reasoning Agents to Scientific Workflows

  • Connect specialized workflows and tools, such as biomedical foundation models, to advanced reasoning models
  • Protect proprietary data and results through deploying on secured compute

Ease of development

  • Flexibly choose, and connect agents and tools best suited for each task
  • Evaluate, audit and debug agentic workflow to identify opportunities for optimization

Advanced semantic query

  • Multimodal PDF data extraction and retrieval with NVIDIA NeMo Retriever
  • 15x faster ingestion of enterprise data
  • 3x lower retrieval latency
  • Multilingual and cross-lingual
  • Reranking to further improve accuracy
  • GPU-accelerated index creation and search

Fast reasoning

  • Llama Nemotron reasoning capabilities delivering the highest accuracy and lowest latency for analyzing datasets, identifying patterns, and proposing solutions

Software Used in the Blueprint

NVIDIA NIMs and Toolkits

  • llama-3.3-nemotron-super-49b-instruct
  • llama-3.2-nv-embedqa-1b-v2
  • llama-3.2-nv-rerankqa-1b-v2
  • nemoretriever-graphic-elements-v1
  • nemoretriever-table-structure-v1
  • nemoretriever-page-elements-v2
  • nemoretriever-parse
  • paddleocr
  • llama-3_3-70b-instruct
  • MolMIM
  • DiffDock
  • Agent Intelligence open-source toolkit
  • NVIDIA RAG Blueprint
  • NeMo Retriever Extraction

Other Software

  • Tavily
  • LangChain
  • Milvus database (accelerated with NVIDIA cuVS)
  • PubChemPy
  • RCSB-API

References

  • NVIDIA AI-Q Research Assistant Blueprint
  • NVIDIA BioNeMo Virtual Screening Blueprint
  • NVIDIA RAG Blurprint

Minimum System Requirements

Users may have to wait 5–10 minutes for the instance to start, depending on cloud availability.

Disk Space

  • 435 GB minimum

OS Requirements

  • Ubuntu 22.04 OS

Deploy Options

  • Docker Compose

Drivers

NVIDIA Container ToolKit
GPU Driver - 530.30.02 or later
CUDA version - 12.6 or later

Hardware Requirements

The biomedical research agent blueprint supports the following hardware and system configurations:

For running all services locally

UseService(s)Recommended GPU*
Nemo Retriever Microservices for multi-modal document ingestgraphic-elements, table-structure, paddle-ocr, nv-ingest, embedqa1 x H100 80GB*
1 x A100 80GB
Reasoning Model for Report Generation and RAG Q&A Retrievalllama-3.3-nemotron-super-49b-v1 with a FP8 profile1 x H100 80 GB*
2 x A100 80GB
Instruct Model for Report Generationllama-3.3-70b-instruct2 x H100 80GB*
4 x A100 80GB
Generative Model for Small Molecule Drug Developmentnvcr.io/nim/nvidia/molmim:1.0.0Single Ampere/L40 GPU with at least 3 GB memory
(doc)
Generative Model for Molecular Dockingnvcr.io/nim/mit/diffdock:2.1.01 x H100 80GB
1 x A100 40GB
1 x A6000 48GB
1 x A10 24GB
1 x L40S 48GB
(doc)
TotalEntire Biomedical AIQ Research Agent5 x H100 80GB*
8 x A100 80GB

*This recommendation is based on the configuration used to test the blueprint. For alternative configurations, view the RAG blueprint documentation.

For running with hosted NVIDIA NIM Microservices

This blueprint can be run entirely with hosted NVIDIA NIM Microservices, see https://build.nvidia.com/ for details.

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and addresses unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns here.

License

Use of the models in this blueprint is governed by the NVIDIA AI Foundation Models Community License.

Terms of Use

GOVERNING TERMS: The NVIDIA Biomedical AI-Q Research Agent Developer Blueprint and Biomedical AI-Q Research Agent Brev launchable are governed by the Apache 2.0 License. The remaining software and materials are governed by the NVIDIA Software License Agreement and Product-Specific Terms for NVIDIA AI Products; except as follows: (a) the models, other than the Llama-3.3-Nemotron-Super-49B-v1 model, are governed by the NVIDIA Community Model License; (b) the Llama-3.3-Nemotron-Super-49B-v1 model is governed by the NVIDIA Open Model License Agreement; (c) the NeMo Retriever extraction is governed by the Apache 2.0 license, and (d) data from the RCSB Protein Data Bank is governed by CC0 1.0 Universal.

ADDITIONAL INFORMATION: For NVIDIA Retrieval QA Llama 3.2 1B Reranking v2 model, NeMo Retriever Graphic Elements v1 model, and NVIDIA Retrieval QA Llama 3.2 1B Embedding v2: Llama 3.2 Community License Agreement. For Llama-3.3-70b-Instruct model, Llama 3.3 Community License Agreement. Built with Llama.

nvidia

Biomedical AI-Q Research Agent Blueprint

Build advanced AI agents within the biomedical domain using the AI-Q Blueprint and the BioNeMo Virtual Screening Blueprint

llama-3_3-nemotron-super-49b-v1•llama-3_2-nv-embedqa-1b-v2•nemoretriever-graphic-elements-v1•nemoretriever-table-structure-v1•nemoretriever-page-elements-v2•nemoretriever-parse•paddleocr•llama-3_3-70b-instruct•diffdock•molmim-generate
agent blueprintblueprintretrieval-augmented generationlaunchablellm
View GitHubDeploy on Cloud