Copyright © 2026 NVIDIA Corporation

NVIDIA AI-Q Blueprint

The NVIDIA AI-Q Blueprint (pronounced IQ) enables developers to build fully customizable AI agents that they own, inspect, and control. Built on LangChain DeepAgents and accelerated by the NVIDIA NeMo Agent Toolkit, AI-Q is an open reference example for building AI agents. It delivers quick answers, citations, and in-depth, report-style research in one system, with benchmarks and evaluation harnesses so you can measure quality and improve it over time.

NVIDIA AI-Q is a top-ranked AI agent for deep research on both the DeepResearch Bench and DeepResearch Bench II leaderboards.

Architecture

Key Features

AI-Q is powered by a LangGraph-based state machine. While the system functions as a fully orchestrated workflow, each agent can also be executed as a standalone component. Here are the key components:

  • Orchestration node: Classifies intent (meta vs. research), produces meta responses when needed, and sets depth (shallow vs. deep) in one step
  • Shallow research agent: Bounded tool-augmented research optimized for speed
  • Deep research agent: Multi-phase research with planning, iteration, and citation management
  • Workflow configuration: YAML configs define agents, tools, LLMs, and routing behavior so you can tune workflows without code changes
  • Modular workflows: All agents (orchestration node, shallow researcher, deep researcher, clarifier) are composable; each can run standalone or as part of the full pipeline
  • Evaluation harnesses: Built-in benchmarks (e.g., FreshQA, DeepResearch) and evaluation scripts to measure quality and iterate on prompts and agent architecture
  • Frontend options: Run via CLI, web UI, or async jobs
  • Deployment options: Deployment assets for Docker Compose as well as Helm
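
The routing behavior described in the list above can be sketched in plain Python. This is a minimal illustration, not AI-Q's actual code: the names `Route`, `CONFIG`, `orchestrate`, `shallow_agent`, `deep_agent`, and `run_pipeline` are hypothetical, the `CONFIG` dict merely stands in for a YAML workflow config, and a keyword heuristic replaces the LLM-based classification a real orchestration node would presumably perform.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical stand-in for a YAML workflow config: routing hints live in
# data, so behavior can be tuned without touching the functions below.
CONFIG = {
    "meta_markers": ("who are you", "what can you do", "hello"),
    "deep_markers": ("report", "in-depth", "compare", "comprehensive"),
}


@dataclass(frozen=True)
class Route:
    intent: str                  # "meta" or "research"
    depth: Optional[str] = None  # "shallow" or "deep"; None for meta queries


def orchestrate(query: str, config: dict = CONFIG) -> Route:
    """Classify intent and depth in one step (keyword heuristic, not an LLM)."""
    text = query.lower()
    if any(marker in text for marker in config["meta_markers"]):
        return Route(intent="meta")
    is_deep = any(marker in text for marker in config["deep_markers"])
    return Route(intent="research", depth="deep" if is_deep else "shallow")


def shallow_agent(query: str) -> str:
    """Placeholder for bounded, tool-augmented research."""
    return f"[shallow] quick answer for: {query}"


def deep_agent(query: str) -> str:
    """Placeholder for multi-phase research with planning and citations."""
    return f"[deep] planned, cited report for: {query}"


# Each agent is an ordinary callable, so it can run standalone or be
# dispatched by the orchestrator as part of the full pipeline.
AGENTS: Dict[str, Callable[[str], str]] = {
    "shallow": shallow_agent,
    "deep": deep_agent,
}


def run_pipeline(query: str) -> str:
    route = orchestrate(query)
    if route.intent == "meta":
        return "[meta] I am a research assistant."
    return AGENTS[route.depth](query)
```

Calling `run_pipeline("Write an in-depth report on GPU pricing")` dispatches to the deep agent, while `shallow_agent("What is NIM?")` shows the same component running on its own. In a LangGraph-based system the dispatch step would more likely be expressed as conditional edges between graph nodes; the dictionary lookup here only mirrors that idea.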

Prerequisites

Required:

  • Python 3.11+
  • UV package manager
  • API key for your chosen provider(s):
    • NVIDIA API key (for NVIDIA NIM models)
    • OpenAI API key (for OpenAI models)
    • Anthropic API key (for Claude models)
    • Google API key (for Gemini models)

Optional:

  • Node.js 22+ and npm (for web UI mode)
  • Tavily API key (for web search functionality)
  • Serper API key (for academic paper search functionality)
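
A short preflight check can confirm these prerequisites before a first run. The sketch below is illustrative only: the environment variable names (`NVIDIA_API_KEY`, `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, `GOOGLE_API_KEY`, `TAVILY_API_KEY`, `SERPER_API_KEY`) are assumptions, not names confirmed by this page, and the helper functions are hypothetical. Check the project's own documentation for the variable names it actually reads.

```python
import shutil
import sys
from typing import Callable, List, Mapping, Optional, Sequence

# Assumed environment variable names; verify against the project's docs.
PROVIDER_KEYS = (
    "NVIDIA_API_KEY",
    "OPENAI_API_KEY",
    "ANTHROPIC_API_KEY",
    "GOOGLE_API_KEY",
)
OPTIONAL_KEYS = ("TAVILY_API_KEY", "SERPER_API_KEY")


def check_prerequisites(
    env: Mapping[str, str],
    version: Sequence[int] = sys.version_info,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return human-readable problems; an empty list means the basics are met."""
    problems = []
    if tuple(version[:2]) < (3, 11):
        problems.append("Python 3.11+ is required")
    if which("uv") is None:
        problems.append("uv package manager not found on PATH")
    # At least one provider key is needed; which one depends on the chosen models.
    if not any(env.get(key) for key in PROVIDER_KEYS):
        problems.append("no provider API key is set")
    return problems


def missing_optional(env: Mapping[str, str]) -> List[str]:
    """List optional search keys that are unset (web and paper search)."""
    return [key for key in OPTIONAL_KEYS if not env.get(key)]
```

A typical call is `check_prerequisites(os.environ)` after `import os`. The `version` and `which` parameters exist only so the check can be exercised without changing the real interpreter or PATH.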

System Requirements

Local / Hybrid Development

  • Developer machine to run AI-Q instance (no local GPU required)
  • LlamaIndex (optional local RAG)
  • Provider / Services / RAG APIs

Fully Self-Hosted / On-Prem

  • Server for AI-Q instances
  • NVIDIA Nemotron 3 Super 120B (agent)
  • GPT OSS 120B (agent)
  • NVIDIA Nemotron 3 Nano 30B (agent)
  • NVIDIA Nemotron Mini 4B Instruct (optional document summary)
  • NVIDIA RAG Blueprint (optional RAG)
    • llama 3.3 nemotron super 49b v1.5
    • llama 3.2 nv embedqa 1b v2
    • llama 3.2 nv rerankqa 1b v2
    • NeMo Retriever Page Elements
    • NeMo Retriever Table Structure
    • NeMo Retriever Graphic Elements
    • NeMo Retriever OCR
  • LlamaIndex (optional RAG)
    • llama nemotron embed vl 1b v2
    • nemotron nano 12b v2 vl

Hosted Service

  • Server for AI-Q instances
  • Provider APIs
  • NVIDIA RAG Blueprint (optional)
  • LlamaIndex (optional)

Hardware Requirements

Hardware requirements vary with the configured options. Refer to the documentation below for details.

  • NVIDIA Nemotron 3 Super support
  • NVIDIA Nemotron 3 Nano support
  • NVIDIA Nemotron Mini 4B Instruct card
  • NVIDIA Nemotron Embed support matrix
  • NVIDIA Nemotron Nano VL support matrix
  • NVIDIA RAG Blueprint support matrix

Software Components

NVIDIA Technology

  • NVIDIA NeMo Agent Toolkit
  • NVIDIA NIM
  • NVIDIA RAG Blueprint

Third-Party Software

  • LangChain for workflows
  • Tavily for web search
  • Serper for paper search (Google Scholar)
  • LlamaIndex for RAG

License

This project is licensed under the Apache License 2.0. See License for details.

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility, and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their supporting model team to ensure the models meet requirements for the relevant industry and use case and address unforeseen product misuse. For more detailed information on ethical considerations for the models, please see the Model Card++ Explainability, Bias, Safety & Security, and Privacy Subcards. Please report security vulnerabilities or NVIDIA AI concerns here.

Terms of Use

This service is governed by the NVIDIA API Trial Terms of Service.

NVIDIA AI-Q Blueprint for intelligent agents

AI agents that connect, retrieve, and reason on enterprise data—making information accessible, actionable, and intelligent.

Models: gpt-oss-120b, nemotron-3-super-120b-a12b, nemotron-3-nano-30b-a3b, nemotron-mini-4b-instruct