Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

  • coding
  • 10 models
    Stepfun-ai
    Downloadable, Free Endpoint

    step-3.7-flash

    A sparse MoE multimodal reasoning model suited to enterprise, agentic, and coding tasks.
    B200
    1.65M
    6d
    Mistral AI
    Downloadable, Free Endpoint

    mistral-medium-3.5-128b

    A high-performing model for text generation, coding, and agentic use cases.
    coding
    3.14M
    1mo
    DeepSeek AI
    Downloadable, Free Endpoint

    deepseek-v4-flash

    DeepSeek V4 Flash is a 284B MoE model with 1M-token context optimized for fast coding and agents.
    B200
    13.22M
    1mo
    DeepSeek AI
    Downloadable

    deepseek-v4-pro

    DeepSeek V4 scales to 1M-token context windows with efficient MoE architecture for coding tasks.
    B200
    8.11M
    1mo
    Z.ai
    Downloadable, Free Endpoint

    glm-5.1

    GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.
    B200
    25.15M
    1mo
    Minimaxai
    Downloadable, Free Endpoint

    minimax-m2.7

    MiniMax M2.7 is a 230B-parameter text-to-text AI model excelling in coding, reasoning, and office tasks.
    B200
    13.49M
    1mo
    Google
    Downloadable, Free Endpoint

    gemma-4-31b-it

    Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.
    B200
    5.64M
    2mo
    Stepfun-ai
    Free Endpoint

    step-3.5-flash

    200B open-source reasoning engine with sparse MoE powering frontier agentic AI.
    Agentic
    11.53M
    4mo
    Qwen
    Free Endpoint

    qwen3-coder-480b-a35b-instruct

    Excels at agentic coding and browser use, and supports a 256K context.
    agentic coding
    5.03M
    9mo
    Sarvamai
    Downloadable, Free Endpoint

    sarvam-m

    Multilingual, hybrid-reasoning model optimized for Indian-language tasks, programming, and mathematical reasoning.
    coding
    280K
    10mo