NVIDIA

Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA | Launch from Hugging Face (Beta)

Filters (1)

  • Free Endpoint (2)
  • Partner Endpoint (1)
  • Download Available (1)
  • Code Generation (0)
  • Retrieval Augmented Generation (0)
  • Drug Discovery (0)
  • Image-to-Text (0)
  • Object Detection (0)
  • Deep Infra (1)
  • Together AI (1)
  • GMI Cloud (1)
  • Bitdeer AI (1)
  • CoreWeave (0)
  • Microsoft (1)
  • Qwen (1)
  • ByteDance (1)
  • NVIDIA (0)
  • Meta (0)
  • Enterprise (0)
  • NVIDIA BioNemo (0)

text-generation
3 models

    Qwen
    Downloadable

    qwen3-next-80b-a3b-instruct

    Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.
    text-generation
    22.35M
    6mo
    ByteDance
    Free Endpoint

    seed-oss-36b-instruct

    ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.
    thinking budget
    1.33M
    7mo
    Microsoft
    Deprecation in 1d | Free Endpoint

    phi-4-mini-flash-reasoning

    Lightweight reasoning model for applications in latency-bound, memory- and compute-constrained environments.
    edge
    158K
    8mo