NVIDIA

Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

  • 157 models
    Moonshotai
    Downloadable

    kimi-k2.6

    1T multimodal MoE for long-horizon coding, agentic tool use, and image/video understanding.
    Multimodal
    734K
    1w
    Qwen
    Downloadable

    qwen-image

    Qwen-Image is a text-to-image foundation model with advanced multilingual text rendering.
    Text-to-Image
    1w
    Qwen
    Downloadable

    qwen-image-edit

    Qwen-Image-Edit is an image editing model with multilingual text editing and strong subject consistency.
    Text-to-Image
    1w
    Mistral AI
    Downloadable

    mistral-medium-3.5-128b

    A high-performing model for text generation, coding, and agentic use cases.
    coding
    666K
    1w
    NVIDIA
    Downloadable

    nemotron-3-nano-omni-30b-a3b-reasoning

    Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.
    Image-to-Text
    3.01M
    1w
    DeepSeek AI
    Downloadable

    deepseek-v4-flash

    DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.
    coding
    4.07M
    2w
    DeepSeek AI
    Downloadable

    deepseek-v4-pro

    DeepSeek V4 scales to 1M-token context windows with an efficient MoE architecture for coding tasks.
    MoE
    3.39M
    2w
    Z.ai
    Downloadable

    glm-5.1

    GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.
    Agentic AI
    10.26M
    2w
    Z.ai
    Deprecation in 7d, Free Endpoint

    glm-4.7

    GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.
    Tool Calling
    9.77M
    3w
    NVIDIA
    Downloadable

    NVIDIA AI for Media Relighting

    Re-illuminate people in video to match target lighting from a 360 HDRI environment map.
    HDRI
    442
    3w
    NVIDIA
    Free Endpoint

    nemotron-3-content-safety

    Multilingual, multimodal model for detecting unsafe and toxic content.
    llm safety
    49.98K
    3w
    NVIDIA
    Downloadable, Free Endpoint

    synthetic-video-detector

    NVIDIA Synthetic Video Detector is an AI-powered microservice for detecting AI-generated (synthetic) videos.
    broadcast
    46.98K
    3w
    NVIDIA
    Downloadable, Free Endpoint

    Active Speaker Detection

    Detect and track speaker identities across video frames.
    localization
    1.19K
    3w
    NVIDIA
    Downloadable

    LipSync

    Generative lip dubbing that syncs lips in a video to input audio.
    lipsync
    3w
    NVIDIA
    Downloadable

    ising-calibration-1-35b-a3b

    Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.
    Quantum
    193K
    3w
    Minimaxai
    Free Endpoint

    minimax-m2.7

    MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
    coding
    8.36M
    3w
    Google
    Downloadable

    gemma-4-31b-it

    Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.
    coding
    5.36M
    1mo
    NVIDIA
    Downloadable

    llama-nemotron-rerank-vl-1b-v2

    GPU-accelerated model optimized to score the probability that a given passage contains the information needed to answer a question.
    nemo retriever
    70.5K
    1mo
    Mistral AI
    Downloadable

    mistral-small-4-119b-2603

    Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context
    code generation
    9.95M
    1mo
    NVIDIA
    Free Endpoint

    nemotron-voicechat

    Nemotron 3 Voicechat
    English
    2.61K
    1mo
    NVIDIA
    Downloadable

    nemotron-asr-streaming

    Real-time speech recognition for English
    Automatic Speech Recognition
    21.37K
    1mo
    Black-forest-labs
    Downloadable

    flux.2-klein-4b

    FLUX.2-klein-4B is a distilled image generation and editing model, producing outputs at lightning speed.
    Text-to-Image
    142K
    1mo
    NVIDIA
    Downloadable

    nemotron-ocr-v1

    Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.
    Table Extraction
    930K
    1mo
    NVIDIA
    Downloadable

    nemotron-3-super-120b-a12b

    Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
    MoE
    46.68M
    1mo