NVIDIA

Copyright © 2026 NVIDIA Corporation

33 results for

    Mistral AI

    mistral-nemotron

    Built for agentic workflows, this model excels in coding, instruction following, and function calling
    Model
    language generation
    745K
    9mo
    NVIDIA

    nemotron-parse

    Cutting-edge vision-language model excelling in retrieving text and metadata from images.
    Model
    text and table extraction
    416K
    4mo
    NVIDIA

    Nemotron Voice Agent

    A voice agent that uses the Nemotron model to generate responses to voice commands.
    Blueprint
    Voice Agent
    1w
    NVIDIA

    cosmos-nemotron-34b

    Multi-modal vision-language model that understands text, images, and video and creates informative responses
    Model
    VLM
    6
    1y
    NVIDIA

    nemotron-graphic-elements-v1

    Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
    Model
    Object Detection
    3.65K
    6d
    NVIDIA

    nemotron-mini-4b-instruct

    Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling
    Model
    chat
    539K
    1y
    NVIDIA

    nemotron-page-elements-v3

    Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
    Model
    Object Detection
    3.93K
    6d
    NVIDIA

    nemotron-table-structure-v1

    Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
    Model
    Object Detection
    2.94K
    6d
    DGX Spark
    30 MIN

    Nemotron-3-Nano with llama.cpp

    Run Nemotron-3-Nano-30B model using llama.cpp on DGX Spark
    Playbook
    Nemotron
    2mo
    NVIDIA

    nemotron-content-safety-reasoning-4b

    A context‑aware safety model that applies reasoning to enforce domain‑specific policies.
    Model
    NeMo Guardrails
    537K
    1mo
    NVIDIA

    nemotron-nano-12b-v2-vl

    Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.
    Model
    language generation
    1.45M
    4mo
    NVIDIA

    llama-3.1-nemotron-70b-reward

    Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.
    Model
    Text-to-text
    431K
    1y
    NVIDIA

    llama-nemotron-embed-1b-v2

    Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
    Model
    Retrieval Augmented Generation
    282K
    6d
    NVIDIA

    llama-nemotron-rerank-1b-v2

    GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
    Model
    nemo retriever
    1.09K
    4d
    NVIDIA

    nemotron-3-nano-30b-a3b

    Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more
    Model
    MoE
    12.92M
    2mo
    NVIDIA

    nvidia-nemotron-nano-9b-v2

    High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.
    Model
    thinking budget
    781K
    6mo
    NVIDIA

    llama-3.1-nemotron-nano-4b-v1.1

    State-of-the-art open model for reasoning, code, math, and tool calling - suitable for edge agents
    Model
    edge
    101K
    8mo
    NVIDIA

    llama-3.1-nemotron-nano-8b-v1

    Leading reasoning and agentic AI accuracy model for PC and edge.
    Model
    chat
    631K
    8mo
    NVIDIA

    llama-3.1-nemotron-nano-vl-8b-v1

    Multi-modal vision-language model that understands text and images and creates informative responses
    Model
    doc intelligence
    8.34M
    8mo
    NVIDIA

    llama-3.1-nemotron-safety-guard-8b-v3

    Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs
    Model
    content moderation
    556K
    4mo
    NVIDIA

    llama-3.1-nemotron-ultra-253b-v1

    Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.
    Model
    chat
    7.99M
    8mo
    NVIDIA

    llama-3.3-nemotron-super-49b-v1

    High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
    Model
    chat
    1.14M
    7mo
    NVIDIA

    llama-3.3-nemotron-super-49b-v1.5

    High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.
    Model
    chat
    4.89M
    7mo
    NVIDIA

    llama-nemotron-embed-vl-1b-v2

    Multimodal question-answer retrieval representing user queries as text and documents as images.
    Model
    nemo retriever
    769K
    4w