NVIDIA
Explore
Models
Blueprints
GPUs
Docs
⌘KCtrl+K
Terms of Use
Privacy Policy
Your Privacy Choices
Contact

Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIALaunch from Hugging FaceBeta

Filters

  • Free Endpoint
    7
  • Partner Endpoint
    16
  • Download Available
    13
  • Image-to-Text
    4
  • Code Generation
    2
  • Drug Discovery
    0
  • Retrieval Augmented Generation
    0
  • Object Detection
    0
  • Fireworks AI
    13
  • Deep Infra
    11
  • Together AI
    9
  • GMI Cloud
    9
  • Bitdeer AI
    8
  • Qwen
    5
  • Mistral AI
    4
  • NVIDIA
    2
  • Meta
    2
  • OpenAI
    2
  • Enterprise
    0
  • NVIDIA BioNemo
    0
  • 19 models
    Mistral AI
    Downloadable

    mistral-small-4-119b-2603

    Hybrid MoE model unifying instruct, reasoning, and coding with multimodal input and 256k context
    chat
    67.74K
    1w
    NVIDIA
    Downloadable

    nemotron-3-super-120b-a12b

    Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more
    chat
    4.86M
    1w
    Qwen
    Free Endpoint

    qwen3.5-122b-a10b

    122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.
    chat
    2.06M
    2w
    Qwen
    Downloadable

    qwen3.5-397b-a17b

    Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
    chat
    9.97M
    1mo
    Z.ai
    Downloadable

    glm-5

    GLM-5 744B MoE enables efficient reasoning for complex systems and long-horizon agentic tasks.
    MoE
    11.31M
    1mo
    Stepfun-ai
    Free Endpoint

    step-3.5-flash

    200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
    chat
    8.02M
    1mo
    Moonshotai
    Downloadable

    kimi-k2.5

    1T multimodal MoE for high‑capacity video and image understanding with efficient inference.
    Multimodal
    21.24M
    1mo
    NVIDIA
    Downloadable

    nemotron-3-nano-30b-a3b

    Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more
    chat
    12.42M
    3mo
    Mistral AI
    Free Endpoint

    mistral-large-3-675b-instruct-2512

    A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.
    chat
    6.62M
    3mo
    Qwen
    Downloadable

    qwen3-next-80b-a3b-instruct

    Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.
    chat
    13.11M
    6mo
    Qwen
    Downloadable

    qwen3-next-80b-a3b-thinking

    80B parameter AI model with hybrid reasoning, MoE architecture, support for 119 languages.
    chat
    4.5M
    6mo
    Qwen
    Free Endpoint

    qwen3-coder-480b-a35b-instruct

    Excels in agentic coding and browser use and supports 256K context, delivering top results.
    agentic coding
    3.67M
    6mo
    OpenAI
    Downloadable

    gpt-oss-20b

    Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math
    reasoning
    8.4M
    7mo
    OpenAI
    Downloadable

    gpt-oss-120b

    Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.
    reasoning
    37.86M
    7mo
    Meta
    Free Endpoint

    llama-4-maverick-17b-128e-instruct

    A general purpose multimodal, multilingual 128 MoE model with 17B parameters.
    chat
    3.66M
    8mo
    Meta
    DownloadableFree Endpoint

    llama-4-scout-17b-16e-instruct

    A multimodal, multilingual 16 MoE model with 17B parameters.
    language generation
    64.89K
    8mo
    AI21 Labs
    Free Endpoint

    jamba-1.5-mini-instruct

    Cutting-edge MOE based LLM designed to excel in a wide array of generative AI tasks.
    chat
    573K
    10mo
    Mistral AI
    Downloadable

    mixtral-8x22b-instruct-v0.1

    An MOE LLM that follows instructions, completes requests, and generates creative text.
    chat
    5.02M
    8mo
    Mistral AI
    Downloadable

    mixtral-8x7b-instruct-v0.1

    An MOE LLM that follows instructions, completes requests, and generates creative text.
    chat
    732K
    8mo
    Items per page
    of 1 pages