Try NVIDIA NIM APIs

Distill and deploy domain-specific AI models from unstructured financial data to generate market signals efficiently—scaling your workflow with the NVIDIA Data Flywheel Blueprint for high-performance, cost-efficient experimentation.

blueprint developer example nim nvidia ai Launchable Nemotron algorithmic trading llm financial services data flywheel

nvidia Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

agent blueprint blueprint nim Launchable nemo llm NVIDIA AI

nvidia nemotron-parse

Cutting-edge vision-language model exceling in retrieving text and metadata from images.

text and table extraction document parsing supported language - english

nvidia nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

language generation chat Image-to-Text vision assistant visual question answering

nvidia llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

content moderation llm safety multilingual guard model multilingual content safety nemoguard

cyborg Cyborg Enterprise RAG

Securely extract, embed, and index multimodal data with encryption in-use for fast, accurate semantic search.

NIM Launchable Blueprint Retrieval-Augmented Generation NeMo Retriever

nvidia llama-3_2-nemoretriever-300m-embed-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Retrieval Augmented Generation Text-to-Embedding NeMo Retriever

nvidia Retail Shopping Assistant

Elevate Shopping Experiences Online and In Stores.

blueprint nemo retriever nim Launchable Retrieval-Augmented Generation NVIDIA AI

nvidia nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

thinking budget chat reasoning

nvidia nemoretriever-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

Optical Character Recognition Table Extraction nemo retriever data ingestion extraction

nvidia Streaming Data to RAG

Sensor-captured radio enables real-time awareness, AI-driven analytics for actionable, searchable insights.

blueprint NIM Riva Launchable RAG NVIDIA AI NeMo Retriever

nvidia llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

chat math advanced reasoning instruction following function calling

nvidia llama-3_2-nemoretriever-300m-embed-v1

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Retrieval Augmented Generation Text-to-Embedding NeMo Retriever

nvidia nemoretriever-ocr

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

Optical Character Recognition Table Extraction nemo retriever data ingestion extraction

nvidia llama-3.2-nemoretriever-500m-rerank-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nemo retriever Retrieval Augmented Generation reranking

nvidia llama-3.2-nemoretriever-1b-vlm-embed-v1

Multimodal question-answer retrieval representing user queries as text and documents as images.

nemo retriever embedding Retrieval Augmented Generation Text-to-Embedding

mistralai mistral-nemotron

Built for agentic workflows, this model excels in coding, instruction following, and function calling

language generation chat instruction following function calling

nvidia Refine AI Agents through Continuous Model Distillation with Data Flywheels

Build a data flywheel, with NVIDIA NeMo microservices, that continuously optimizes AI agents for latency and cost — while maintaining accuracy targets.

NIM Launchable Data Flywheel Blueprint Enterprise NeMo microservices NVIDIA AI

iguazio AI Orchestration for Data Flywheel

Orchestrate AI agents for data flywheel with MLRun and NVIDIA NeMo microservices.

Orchestration Launchable AI Agents Data Flywheel Blueprint Partner NVIDIA AI

nvidia Safety for Agentic AI

Improve safety, security, and privacy of AI systems at build, deploy and run stages.

security Launchable Blueprint safety privacy Nemo Guardrails open models NVIDIA AI

nvidia llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text/img and creates informative responses

doc intelligence chat multiple image understanding OCR

nvidia llama-3.1-nemotron-nano-4b-v1.1

State-of-the-art open model for reasoning, code, math, and tool calling - suitable for edge agents

edge tool calling chat reasoning math

nvidia llama-3.1-nemotron-ultra-253b-v1

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

chat math advanced reasoning instruction following function calling

nvidia Build an AI Agent for Enterprise Research

Build a custom enterprise research assistant powered by state-of-the-art models that process and synthesize multimodal data, enabling reasoning, planning, and refinement to generate comprehensive reports.

NIM Launchable Llama Nemotron Reasoning Blueprint Enterprise Retrieval-Augmented Generation NVIDIA AI NeMo Retriever

nvidia llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

chat math advanced reasoning instruction following function calling

nvidia llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

chat math advanced reasoning instruction following function calling

nvidia nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

nemo retriever Embedding Retrieval Augmented Generation

nvidia nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion run-on-rtx

nvidia nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion run-on-rtx

nvidia nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion run-on-rtx

nvidia nemoretriever-parse

Cutting-edge vision-language model exceling in retrieving text and metadata from images.

optical character recognition nemo retriever data ingestion table extraction supported language - english

nvidia Build an Enterprise RAG Pipeline Blueprint

Power fast, accurate semantic search across multimodal enterprise data with NVIDIA’s RAG Blueprint—built on NeMo Retriever and Nemotron models—to connect your agents to trusted, authoritative sources of knowledge.

NIM Launchable Nemotron Blueprint Enterprise Retrieval-Augmented Generation NVIDIA AI NeMo Retriever

nvidia llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

Dialogue Safety LLM safety Guard Model Content safety

nvidia nemoguard-jailbreak-detect

Industry leading jailbreak classification model for protection from adversarial attempts

LLM Security Jailbreak Detection Prompt Injection NVIDIA NIM

nvidia llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

LLM safety content moderation Guard model Content safety

nvidia cosmos-nemotron-34b

Multi-modal vision-language model that understands text/img/video and creates informative responses

VLM Vision language model image caption image to text

nvidia llama-3.2-nv-embedqa-1b-v2

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

nemo retriever run-on-rtx embedding Retrieval Augmented Generation Text-to-Embedding

nvidia llama-3.2-nv-rerankqa-1b-v2

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

nemo retriever run-on-rtx Retrieval Augmented Generation reranking

university-at-buffalo cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retriever Chart Element Detection Image-To-Text

nvidia nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Data ingestion Chart Detection nemo retriever Table Detection run-on-rtx extraction

baidu paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

Optical Character Recognition Table Extraction Optical Character Detection nemo retriever data ingestion run-on-rtx extraction

nvidia nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for Hindi Language.

Indic chat Text-to-Text Language Generation

nvidia llama-3.1-nemotron-70b-reward

Leaderboard topping reward model supporting RLHF for better alignment with human preferences.

Text-to-text Reward Model RLHF

nvidia nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

chat Text-to-Text Language Generation

nvidia mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbot, virtual assistants, and content generation.

language generation text-to-text chat small language model

nvidia nv-rerankqa-mistral-4b-v3

Multilingual text reranking model.

nemo retriever Reranking Retrieval Augmented Generation

nvidia nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

Embedding run-on-rtx Retrieval Augmented Generation Nemo retriever Text-to-Embedding

nvidia nv-embedqa-mistral-7b-v2

Multilingual text question-answering retrieval, transforming textual information into dense vector representations.

nemo retriever Embedding Retrieval Augmented Generation

snowflake arctic-embed-l

Optimized community model for text embedding.

nemo retriever embedding Retrieval Augmented Generation Text-to-Embedding