Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA | Launch from Hugging Face (Beta)

Filters

  • Free Endpoint (80)
  • Partner Endpoint (64)
  • Download Available (116)
  • Code Generation (22)
  • Retrieval Augmented Generation (15)
  • Drug Discovery (14)
  • Image-to-Text (12)
  • Object Detection (9)
  • Deep Infra (47)
  • Together AI (38)
  • Bitdeer AI (19)
  • GMI Cloud (18)
  • CoreWeave (11)
  • NVIDIA (95)
  • Mistral AI (14)
  • Meta (13)
  • Microsoft (12)
  • Qwen (10)
  • Enterprise (1)
  • NVIDIA BioNemo (1)

196 models

    Z.ai
    Downloadable

    glm-5.1

    GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.
    Agentic AI
    1d
    Z.ai
    Free Endpoint

    glm-4.7

    GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.
    Tool Calling
    687K
    2d
    NVIDIA
    Downloadable

    NVIDIA AI for Media Relighting

    Re-illuminate people in video to match target lighting from a 360 HDRI environment map.
    HDRI
    84
    2d
    NVIDIA
    Free Endpoint

    nemotron-3-content-safety

    Multilingual, multimodal model for detecting unsafe and toxic content.
    llm safety
    2.91K
    2d
    NVIDIA
    Free Endpoint

    synthetic-video-detector

    NVIDIA Synthetic Video Detector is an AI-powered microservice for detecting AI-generated (synthetic) videos.
    broadcast
    95
    2d
    NVIDIA
    Downloadable, Free Endpoint

    Active Speaker Detection

    Detect and track speaker identities across video frames.
    localization
    49
    2d
    NVIDIA

    LipSync

    Generative lip dubbing that syncs lips in a video to input audio.
    lipsync
    2d
    NVIDIA
    Downloadable

    ising-calibration-1-35b-a3b

    Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.
    Quantum
    32.67K
    4d
    Minimaxai
    Free Endpoint

    minimax-m2.7

    MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
    coding
    2.05M
    1w
    Google
    Downloadable

    gemma-4-31b-it

    Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.
    coding
    2.36M
    2w
    NVIDIA
    Downloadable

    llama-nemotron-rerank-vl-1b-v2

    GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
    nemo retriever
    2.8K
    2w
    NVIDIA
    Enterprise

    Build A Generative Protein Binder Design Pipeline

    This blueprint shows how generative AI and accelerated NIM microservices can design protein binders smarter and faster.
    NVIDIA BioNemo
    2.97K
    3w
    Mistral AI
    Downloadable

    mistral-small-4-119b-2603

    Hybrid MoE model unifying instruct, reasoning, and coding, with multimodal input and a 256k context.
    code generation
    8.09M
    1mo
    NVIDIA
    Free Endpoint

    nemotron-voicechat

    Nemotron 3 Voicechat
    English
    6.02K
    1mo
    NVIDIA
    Downloadable

    nemotron-asr-streaming

    Real-time speech recognition for English
    Automatic Speech Recognition
    23.91K
    1mo
    Black-forest-labs
    Downloadable

    flux.2-klein-4b

    FLUX.2-klein-4B is a distilled image generation and editing model, producing outputs at lightning speed.
    Text-to-Image
    114K
    1mo
    NVIDIA
    Downloadable

    nemotron-ocr-v1

    Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.
    Table Extraction
    1.7M
    1mo
    NVIDIA
    Downloadable

    nemotron-3-super-120b-a12b

    Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
    MoE
    50.05M
    1mo
    NVIDIA
    Downloadable

    llama-nemotron-rerank-1b-v2

    GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
    nemo retriever
    177K
    1mo
    Qwen
    Downloadable

    qwen3.5-122b-a10b

    122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.
    tool calling
    8.92M
    1mo
    NVIDIA
    Downloadable

    nemotron-table-structure-v1

    Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
    Object Detection
    19.33K
    1mo
    NVIDIA
    Downloadable

    nemotron-page-elements-v3

    Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
    Object Detection
    43.98K
    1mo
    NVIDIA
    Downloadable

    nemotron-graphic-elements-v1

    Model for object detection, fine-tuned to detect charts, tables, and titles in documents.
    Object Detection
    18.6K
    1mo
    NVIDIA
    Downloadable

    llama-nemotron-embed-1b-v2

    Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.
    Text-to-Embedding
    1.86M
    1mo