Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA | Launch from Hugging Face (Beta)

Filters

  • Free Endpoint: 21
  • Partner Endpoint: 14
  • Download Available: 15
  • Image-to-Text: 4
  • Code Generation: 4
  • Synthetic Data Generation: 1
  • Digital Twin: 1
  • Drug Discovery: 0
  • Deep Infra: 10
  • Together AI: 5
  • GMI Cloud: 5
  • Bitdeer AI: 4
  • CoreWeave: 4
  • NVIDIA: 11
  • Mistral AI: 3
  • Meta: 2
  • Qwen: 2
  • OpenAI: 2
  • B200: 4
  • L40S: 3
  • H200: 3
  • H100 80GB HBM3: 2
  • A100 SXM4 80GB: 2
  • 24 models
    NVIDIA
    Free Endpoint

    nemotron-3.5-content-safety

    Multilingual, multimodal model for detecting unsafe and toxic content.
    llm safety
    4.84K
    3d
    NVIDIA
    Downloadable, Free Endpoint

    nemotron-3-nano-omni-30b-a3b-reasoning

    Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.
    Image-to-Text
    8.65M
    1mo
    Z.ai
    Downloadable, Free Endpoint

    glm-5.1

    GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.
    B200
    25.39M
    1mo
    NVIDIA
    Free Endpoint

    nemotron-3-content-safety

    Multilingual, multimodal model for detecting unsafe and toxic content.
    llm safety
    220K
    1mo
    NVIDIA
    Downloadable, Free Endpoint

    ising-calibration-1-35b-a3b

    Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.
    Quantum
    330K
    1mo
    Qwen
    Downloadable, Free Endpoint

    qwen3.5-122b-a10b

    122B MoE LLM (10B active) for coding, reasoning, multimodal chat. Agent-ready.
    B200
    9.26M
    3mo
    Qwen
    Downloadable, Free Endpoint

    qwen3.5-397b-a17b

    Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
    MoE
    11.21M
    3mo
    Mistral AI
    Free Endpoint

    mistral-large-3-675b-instruct-2512

    A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.
    language generation
    3.19M
    6mo
    Mistral AI
    Downloadable, Free Endpoint

    ministral-14b-instruct-2512

    A general-purpose VLM ideal for chat and instruction-based use cases.
    language generation
    3.34M
    6mo
    NVIDIA
    Free Endpoint

    llama-3.1-nemotron-safety-guard-8b-v3

    Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs
    content moderation
    351K
    7mo
    ByteDance
    Free Endpoint

    seed-oss-36b-instruct

    ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.
    thinking budget
    1.14M
    9mo
    NVIDIA
    Downloadable, Free Endpoint

    nvidia-nemotron-nano-9b-v2

    High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.
    thinking budget
    983K
    9mo
    OpenAI
    Downloadable, Free Endpoint

    gpt-oss-20b

    Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math
    reasoning
    19.51M
    10mo
    OpenAI
    Downloadable, Free Endpoint

    gpt-oss-120b

    Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within a single 80GB GPU.
    reasoning
    48.19M
    10mo
    Meta
    Free Endpoint

    llama-guard-4-12b

    Multimodal model that classifies the safety of input prompts as well as output responses.
    LLM Multimodal Safety
    186K
    11mo
    Microsoft
    Downloadable, Free Endpoint

    phi-4-mini-instruct

    Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments.
    Chat
    450K
    1y
    NVIDIA
    Downloadable

    llama-3.1-nemoguard-8b-topic-control

    Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.
    nemo guardrails
    135K
    1y
    NVIDIA
    Downloadable

    nemoguard-jailbreak-detect

    Industry-leading jailbreak classification model for protection from adversarial attempts.
    nemo guardrails
    11.93K
    11mo
    NVIDIA
    Downloadable

    llama-3.1-nemoguard-8b-content-safety

    Leading content safety model for enhancing the safety and moderation capabilities of LLMs
    nemo guardrails
    125K
    1y
    NVIDIA
    Free Endpoint

    usdcode

    State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.
    Digital Twin
    11mo
    Meta
    Downloadable, Free Endpoint

    llama-3.3-70b-instruct

    Advanced LLM for reasoning, math, general knowledge, and function calling
    B200
    15.16M
    11mo
    NVIDIA
    Free Endpoint

    nemotron-mini-4b-instruct

    SLM optimized for on-device inference and fine-tuned for roleplay, RAG, and function calling.
    Chat
    1.62M
    1y
    Google
    Free Endpoint

    paligemma

    Vision language model adept at comprehending text and visual inputs to produce informative responses
    image
    10.77K
    1y
    Mistral AI
    Downloadable, Free Endpoint

    mixtral-8x7b-instruct-v0.1

    An MoE LLM that follows instructions, completes requests, and generates creative text.
    B200
    767K
    10mo