# Biomedical AI-Q Research Agent

The ability to connect real-world data to reasoning models unlocks new opportunities to improve the efficiency and accuracy of clinical development processes, including R&D, literature review, protocol generation, clinical trial screening, and pharmacovigilance. In addition, the ability to create tools for LLMs lets agents interact with the domain-specific services that researchers use, such as domain-specific models or APIs.

Reasoning LLMs provide state-of-the-art planning, validation, and insight generation during the research process, while the AgentIQ agentic framework makes it easy to compose these models, tools, and an organization's unique data sources into specialized, production-ready agentic workflows. The underlying NIMs run securely and privately on any compute, thus protecting the privacy of a clinical development organization’s valuable real-world data.

The combination of these capabilities in the Biomedical AI-Q Research Agent:

- Increases the number and quality of insights generated from specialized and often proprietary biomedical data sources
- Improves the traceability and interpretability of experiments and research
- Decreases the human hours spent generating and validating reports for regulatory and downstream knowledge building
- Improves the ability of leaders and researchers to make informed decisions based on the knowledge generated

## Architecture Diagram

![Architecture Diagram](https://assets.ngc.nvidia.com/products/api-catalog/biomedical-aiq-research-agent/diagram.jpg)

## Key Features

**Adapt Reasoning Agents to Scientific Workflows**

- Connect specialized workflows and tools, such as biomedical foundation models, to advanced reasoning models
- Protect proprietary data and results by deploying on secured compute

**Ease of development**

- Flexibly choose and connect the agents and tools best suited for each task
- Evaluate, audit, and debug agentic workflows to identify opportunities for optimization

**Advanced semantic query**

- Multimodal PDF data extraction and retrieval with NVIDIA NeMo Retriever
- 15x faster ingestion of enterprise data
- 3x lower retrieval latency
- Multilingual and cross-lingual
- Reranking to further improve accuracy
- GPU-accelerated index creation and search

**Fast reasoning**

- Llama Nemotron reasoning capabilities delivering the highest accuracy and lowest latency for analyzing datasets, identifying patterns, and proposing solutions

## Software Used in the Blueprint

### NVIDIA NIMs and Toolkits

- [llama-3.3-nemotron-super-49b-v1](https://build.nvidia.com/nvidia/llama-3_3-nemotron-super-49b-v1)
- [llama-3.2-nv-embedqa-1b-v2](https://build.nvidia.com/nvidia/llama-3_2-nv-embedqa-1b-v2)
- [llama-3.2-nv-rerankqa-1b-v2](https://build.nvidia.com/nvidia/llama-3_2-nv-rerankqa-1b-v2)
- [nemoretriever-graphic-elements-v1](https://build.nvidia.com/nvidia/nemoretriever-graphic-elements-v1)
- [nemoretriever-table-structure-v1](https://build.nvidia.com/nvidia/nemoretriever-table-structure-v1)
- [nemoretriever-page-elements-v2](https://build.nvidia.com/nvidia/nemoretriever-page-elements-v2)
- [nemoretriever-parse](https://build.nvidia.com/nvidia/nemoretriever-parse)
- [paddleocr](https://build.nvidia.com/baidu/paddleocr)
- [llama-3.3-70b-instruct](https://build.nvidia.com/meta/llama-3_3-70b-instruct)
- [MolMIM](https://build.nvidia.com/nvidia/molmim-generate)
- [DiffDock](https://build.nvidia.com/mit/diffdock)
- [Agent Intelligence open-source toolkit](https://github.com/NVIDIA/AIQToolkit)
- [NVIDIA RAG Blueprint](https://github.com/NVIDIA-AI-Blueprints/rag)
- [NeMo Retriever Extraction](https://github.com/NVIDIA/nv-ingest)

### Other Software

- Tavily
- LangChain
- Milvus database (accelerated with NVIDIA cuVS)
- PubChemPy
- RCSB-API

## References

- [NVIDIA AI-Q Research Assistant Blueprint](https://build.nvidia.com/nvidia/aiq)
- [NVIDIA BioNeMo Virtual Screening Blueprint](https://build.nvidia.com/nvidia/generative-virtual-screening-for-drug-discovery)
- [NVIDIA RAG Blueprint](https://build.nvidia.com/nvidia/build-an-enterprise-rag-pipeline)

## Minimum System Requirements

Users may have to wait 5–10 minutes for the instance to start, depending on cloud availability.

### Disk Space

- 435 GB minimum

### OS Requirements

- Ubuntu 22.04

### Deploy Options

- Docker Compose

### Drivers

- NVIDIA Container Toolkit
- GPU driver: 530.30.02 or later
- CUDA version: 12.6 or later

### Hardware Requirements

The biomedical research agent blueprint supports the following hardware and system configurations:

#### For running all services locally

| Use | Service(s) | Recommended GPU\* |
| :---- | :---- | :---- |
| NeMo Retriever Microservices for multi-modal document ingest | graphic-elements, table-structure, paddle-ocr, nv-ingest, embedqa | 1 x H100 80GB\*<br>1 x A100 80GB |
| Reasoning Model for Report Generation and RAG Q&A Retrieval | llama-3.3-nemotron-super-49b-v1 with an FP8 profile | 1 x H100 80GB\*<br>2 x A100 80GB |
| Instruct Model for Report Generation | llama-3.3-70b-instruct | 2 x H100 80GB\*<br>4 x A100 80GB |
| Generative Model for Small Molecule Drug Development | nvcr.io/nim/nvidia/molmim:1.0.0 | Single Ampere/L40 GPU with at least 3 GB memory<br>([doc](https://docs.nvidia.com/nim/bionemo/molmim/latest/prerequisites.html)) |
| Generative Model for Molecular Docking | nvcr.io/nim/mit/diffdock:2.1.0 | 1 x H100 80GB<br>1 x A100 40GB<br>1 x A6000 48GB<br>1 x A10 24GB<br>1 x L40S 48GB<br>([doc](https://docs.nvidia.com/nim/bionemo/diffdock/latest/getting-started.html#hardware)) |
| **Total** | Entire Biomedical AI-Q Research Agent | 5 x H100 80GB\*<br>8 x A100 80GB |

\*This recommendation is based on the configuration used to test the blueprint. For alternative configurations, view the [RAG blueprint documentation](https://github.com/NVIDIA-AI-Blueprints/rag?tab=readme-ov-file#minimum-system-requirements).

#### For running with hosted NVIDIA NIM Microservices

This blueprint can be run entirely with hosted NVIDIA NIM Microservices; see [https://build.nvidia.com/](https://build.nvidia.com/) for details.

## Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

## License

Use of the models in this blueprint is governed by the [NVIDIA AI Foundation Models Community License](https://docs.nvidia.com/ai-foundation-models-community-license.pdf).

## Terms of Use

GOVERNING TERMS: The NVIDIA Biomedical AI-Q Research Agent Developer Blueprint and Biomedical AI-Q Research Agent Brev launchable are governed by the Apache 2.0 License.
The remaining software and materials are governed by the [NVIDIA Software License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-software-license-agreement/) and [Product-Specific Terms for NVIDIA AI Products](https://www.nvidia.com/en-us/agreements/enterprise-software/product-specific-terms-for-ai-products/), except as follows: (a) the models, other than the Llama-3.3-Nemotron-Super-49B-v1 model, are governed by the [NVIDIA Community Model License](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-community-models-license/); (b) the Llama-3.3-Nemotron-Super-49B-v1 model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/); (c) the NeMo Retriever extraction is governed by the Apache 2.0 license; and (d) data from the RCSB Protein Data Bank is governed by [CC0 1.0 Universal](https://www.rcsb.org/pages/policies#usagePolicies).

ADDITIONAL INFORMATION: For the NVIDIA Retrieval QA Llama 3.2 1B Reranking v2 model, the NeMo Retriever Graphic Elements v1 model, and the NVIDIA Retrieval QA Llama 3.2 1B Embedding v2 model: [Llama 3.2 Community License Agreement](https://www.llama.com/llama3_2/license/). For the Llama-3.3-70b-Instruct model: [Llama 3.3 Community License Agreement](https://www.llama.com/llama3_3/license/). Built with Llama.
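
## Example: Preflight Check for the Minimum System Requirements

The disk and driver thresholds listed under Minimum System Requirements can be checked before deployment. The following is a minimal sketch, not part of the blueprint itself. It assumes that the 435 GB figure refers to free space on the target filesystem, that 1 GB means 10^9 bytes, and that `nvidia-smi` is on the `PATH`. The helper names (`parse_version`, `meets_minimum`, `free_disk_gb`, `installed_driver_version`, `preflight`) are hypothetical, and the CUDA version and GPU model are not checked.

```python
import shutil
import subprocess

# Thresholds copied from the Minimum System Requirements section.
MIN_DISK_GB = 435
MIN_DRIVER = "530.30.02"


def parse_version(text):
    """Convert a dotted version such as '530.30.02' to (530, 30, 2)."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(found, required):
    """Return True if `found` is the same as or newer than `required`."""
    try:
        return parse_version(found) >= parse_version(required)
    except ValueError:
        # Unparseable version strings are treated as not meeting the minimum.
        return False


def free_disk_gb(path="/"):
    """Free space at `path` in GB (assumption: 1 GB = 10**9 bytes)."""
    return shutil.disk_usage(path).free / 10**9


def installed_driver_version():
    """Ask nvidia-smi for the driver version; None if it cannot be read."""
    try:
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = result.stdout.strip().splitlines()
    return lines[0].strip() if lines else None


def preflight(path="/"):
    """Hypothetical helper: summarize the disk and driver checks."""
    driver = installed_driver_version()
    free_gb = free_disk_gb(path)
    return {
        "free_disk_gb": round(free_gb, 1),
        "disk_ok": free_gb >= MIN_DISK_GB,
        "driver_version": driver,
        "driver_ok": driver is not None and meets_minimum(driver, MIN_DRIVER),
    }


if __name__ == "__main__":
    for name, value in preflight().items():
        print(f"{name}: {value}")
```

Running the script on the target machine prints one line per check. A `driver_version` of `None` means `nvidia-smi` could not be run, which usually indicates that the GPU driver is missing or not on the `PATH`.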