
Build an AI Agent for Enterprise Research
Build a custom enterprise research assistant powered by state-of-the-art models that process and synthesize multimodal data, enabling reasoning, planning, and refinement to generate comprehensive reports.
Build Your Personal Research Assistant
Use this blueprint to create a custom AI researcher that runs anywhere, draws on your own data sources, and condenses hours of research into minutes. The AI-Q NVIDIA Blueprint lets developers connect AI agents to enterprise data and use reasoning and tools to distill in-depth source materials with efficiency and precision. With AI-Q, agents summarize diverse data sets, generating tokens 5x faster and ingesting large-scale data 15x faster, with better semantic accuracy. The blueprint uses the open-source NVIDIA NeMo Agent Toolkit to evaluate and profile the agent workflow, making it easier to optimize workflows and keep agents, tools, and data sources interoperable.
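As a sketch of what "connecting an agent to your data" looks like at the lowest level: NVIDIA-hosted NIM endpoints expose an OpenAI-compatible chat completions API, so a single research step reduces to a plain JSON payload. The base URL, model string, and `build_research_request` helper below are illustrative assumptions for this page, not code from the blueprint itself.

```python
import json

# Assumed for illustration: NVIDIA-hosted NIM endpoints speak the
# OpenAI-compatible chat completions protocol at a base URL like this.
NIM_BASE_URL = "https://integrate.api.nvidia.com/v1"
MODEL = "nvidia/llama-3.3-nemotron-super-49b-v1.5"  # reasoning model used by this blueprint

def build_research_request(question: str, sources: list[str]) -> dict:
    """Assemble an OpenAI-compatible chat payload asking the reasoning
    model to synthesize an answer grounded in the supplied source snippets."""
    context = "\n\n".join(f"[source {i + 1}] {s}" for i, s in enumerate(sources))
    return {
        "model": MODEL,
        "messages": [
            {"role": "system",
             "content": "You are an enterprise research assistant. Cite sources."},
            {"role": "user",
             "content": f"Question: {question}\n\nSources:\n{context}"},
        ],
        "temperature": 0.2,
    }

# Usage: POST this payload to f"{NIM_BASE_URL}/chat/completions" with your API key.
payload = build_research_request(
    "What drove Q3 revenue?",
    ["Q3 report excerpt...", "Earnings call notes..."],
)
body = json.dumps(payload)
```

In the actual blueprint, the NeMo Agent Toolkit orchestrates many such calls in a plan-research-refine loop rather than a single request.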
Architecture Diagram
Key Features
- Ease development and optimization with the NVIDIA NeMo Agent Toolkit
  - Unify, evaluate, audit, and debug agentic workflows across different frameworks to identify opportunities for optimization
  - Flexibly choose and connect the agents and tools best suited to each task
- Advanced semantic query with NVIDIA NeMo Retriever
  - Multimodal PDF data extraction and retrieval
  - 15x faster ingestion of enterprise data
  - 3x lower retrieval latency
  - Multilingual and cross-lingual retrieval
  - Reranking to further improve accuracy
  - GPU-accelerated index creation and search
- Fast reasoning with Llama Nemotron
  - Reasoning capabilities that deliver high accuracy at low latency for analyzing data sources, identifying patterns, and proposing solutions
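The retrieval features above follow a common two-stage pattern: a fast embedding search narrows the corpus to a few candidates, then a reranker rescores just those survivors for accuracy. The toy vectors and mock reranker scores below are stand-ins for the NeMo Retriever embedding and reranking NIMs, purely to illustrate the flow.

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy vectors standing in for embedding-NIM output (e.g. llama-3.2-nv-embedqa-1b-v2).
docs = {
    "gpu roadmap": [0.9, 0.1, 0.0],
    "cafeteria menu": [0.0, 0.2, 0.9],
    "gpu benchmarks": [0.8, 0.3, 0.1],
}
query_vec = [1.0, 0.0, 0.0]

# Stage 1: dense retrieval — rank the whole index by embedding similarity,
# keep the top-k candidates.
candidates = sorted(docs, key=lambda d: cosine(docs[d], query_vec), reverse=True)[:2]

# Stage 2: reranking — a cross-encoder NIM (e.g. llama-3.2-nv-rerankqa-1b-v2)
# would rescore each (query, passage) pair; a mock score table stands in here.
rerank_scores = {"gpu roadmap": 0.95, "gpu benchmarks": 0.99}
reranked = sorted(candidates, key=lambda d: rerank_scores[d], reverse=True)
```

The reranker can reorder stage-1 results (as it does here), which is why it improves final accuracy even though the first pass already filtered by similarity.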
Minimum System Requirements
OS Requirements
- Ubuntu 22.04 OS
Deployment Options
- Docker (Docker Compose)
- Kubernetes (Helm)
Hardware Requirements
Docker Compose
- Generally 2x H100 (80 GB) for RAG + 2x H100 (80 GB) for the AI-Q Research Assistant
- Generally 3x A100 (80 GB) for RAG + 4x A100 (80 GB) for the AI-Q Research Assistant
- Generally 3x B200 for RAG + 2x B200 for the AI-Q Research Assistant
- Generally 2x RTX 6000 Pro for RAG + 2x RTX 6000 Pro for the AI-Q Research Assistant

Helm
- Generally 8x H100 (80 GB) for RAG + 2x H100 (80 GB) for the AI-Q Research Assistant
- Generally 9x A100 (80 GB) for RAG + 4x A100 (80 GB) for the AI-Q Research Assistant
- Generally 9x B200 for RAG + 2x B200 for the AI-Q Research Assistant
- Generally 8x RTX 6000 Pro for RAG + 2x RTX 6000 Pro for the AI-Q Research Assistant

For more details and requirements, including MIG options, please review the Get Started docs.
The blueprint can be modified to use additional NIM microservices hosted by NVIDIA.
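For the Docker Compose sizings above, GPUs are typically pinned to individual NIM containers via Compose device reservations. The service and image names in this fragment are illustrative, not taken from the blueprint's actual compose files; the reservation syntax itself is standard Docker Compose.

```yaml
# Sketch: dedicating two GPUs to one NIM service. Service/image names are
# hypothetical; only the deploy.resources.reservations structure is standard.
services:
  nemotron-super:
    image: nvcr.io/nim/nvidia/llama-3.3-nemotron-super-49b-v1.5:latest
    environment:
      - NGC_API_KEY=${NGC_API_KEY}
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              device_ids: ["0", "1"]   # two of the GPUs from the sizing above
              capabilities: [gpu]
```

Repeating this pattern per service (reasoning model, embedder, reranker, extraction NIMs) is what adds up to the GPU counts listed for each configuration.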
Software used in this blueprint
NVIDIA Technology
- llama-3.3-nemotron-super-49b-v1_5
- llama-3.3-70b-instruct
- llama-3.2-nv-embedqa-1b-v2
- llama-3.2-nv-rerankqa-1b-v2
- nemoretriever-graphic-elements-v1
- nemoretriever-table-structure-v1
- nemoretriever-page-elements-v2
- paddleocr
- NVIDIA NeMo Agent Toolkit
- NVIDIA RAG Blueprint
- NeMo Retriever Extraction
3rd Party Software
Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI concerns here.
Terms of Use
GOVERNING TERMS: This service is governed by the NVIDIA API Trial Terms of Service
