Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

6 models (filter: chat)
    Qwen
    qwen3-next-80b-a3b-instruct
    Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long-context AI.
    chat
    5mo
    ByteDance
    seed-oss-36b-instruct
    ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.
    thinking budget
    5mo
    Qwen
    qwen2.5-coder-7b-instruct
    Powerful mid-size code model with a 32K context length that excels at coding in multiple languages.
    code completion
    9mo
    NVIDIA
    mistral-nemo-minitron-8b-base
    State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.
    language generation
    1y
    THUDM
    chatglm3-6b
    Supports Chinese and English for tasks including chatbots, content generation, coding, and translation.
    Text Translation
    7mo
    Microsoft
    phi-3-small-128k-instruct
    Long-context, cutting-edge, lightweight open language model excelling in high-quality reasoning.
    chat
    9mo