NVIDIA

Copyright © 2026 NVIDIA Corporation

18 results for

Filters (1)

  • Free Endpoint
    18
  • Partner Endpoint
    11
  • Download Available
    15
  • Image-to-Text
    4
  • Code Generation
    3
  • Retrieval Augmented Generation
    0
  • Text-to-Embedding
    0
  • Deepinfra
    9
  • Together AI
    7
  • GMI Cloud
    5
  • Bitdeer
    4
  • CoreWeave
    4
  • NVIDIA
    6
  • Mistral AI
    3
  • OpenAI
    2
  • Qwen
    2
  • Google
    1
  • B200
    4
  • H200
    3
  • L40S
    3
  • A100 PG509 200
    2
  • A100 SXM4 80GB
    2
  • Chat
  • Qwen
    Downloadable, Free Endpoint

    qwen3.5-397b-a17b

    Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.
    Model
    MoE
    13.15M
    4mo
    NVIDIA
    Downloadable, Free Endpoint

    nemotron-3-nano-omni-30b-a3b-reasoning

    Nemotron 3 Nano Omni is an omni-modal reasoning model that understands images, video, speech, and text.
    Model
    Image-to-Text
    7.54M
    1mo
    Z.ai
    Downloadable, Free Endpoint

    glm-5.1

    GLM-5.1 is a flagship LLM for agentic workflows, coding, and long-horizon reasoning tasks.
    Model
    Agentic AI
    27.59M
    2mo
    NVIDIA
    Downloadable, Free Endpoint

    nemotron-nano-12b-v2-vl

    Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.
    Model
    language generation
    2.47M
    7mo
    NVIDIA
    Downloadable, Free Endpoint

    llama-3.1-nemotron-nano-vl-8b-v1

    Multimodal vision-language model that understands text and images and generates informative responses.
    Model
    doc intelligence
    10.15M
    11mo
    Mistral AI
    Downloadable, Free Endpoint

    ministral-14b-instruct-2512

    A general-purpose VLM ideal for chat and instruction-based use cases.
    Model
    language generation
    3.62M
    6mo
    Google
    Downloadable, Free Endpoint

    diffusiongemma-26b-a4b-it

    Diffusion-based 26B-parameter LLM enabling parallel token generation for real-time text applications.
    Model
    diffusion-llm
    524K
    8d
    NVIDIA
    Downloadable, Free Endpoint

    ising-calibration-1-35b-a3b

    Open VLM for quantum computer calibration chart understanding across a range of qubit modalities.
    Model
    Quantum
    332K
    2mo
    Mistral AI
    Free Endpoint

    mistral-large-3-675b-instruct-2512

    A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.
    Model
    language generation
    3.22M
    6mo
    OpenAI
    Downloadable, Free Endpoint

    gpt-oss-120b

    Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within an 80GB GPU.
    Model
    reasoning
    57.12M
    10mo
    OpenAI
    Downloadable, Free Endpoint

    gpt-oss-20b

    Smaller Mixture of Experts (MoE) text-only LLM for efficient reasoning and math.
    Model
    reasoning
    18.45M
    10mo
    Meta
    Downloadable, Free Endpoint

    llama-3.3-70b-instruct

    Advanced LLM for reasoning, math, general knowledge, and function calling
    Model
    Instruction following
    18.79M
    1y
    Mistral AI
    Downloadable, Free Endpoint

    mixtral-8x7b-instruct-v0.1

    An MoE LLM that follows instructions, completes requests, and generates creative text.
    Model
    Advanced Reasoning
    996K
    11mo
    NVIDIA
    Free Endpoint

    nemotron-mini-4b-instruct

    SLM optimized for on-device inference and fine-tuned for roleplay, RAG, and function calling.
    Model
    Chat
    1.53M
    1y
    NVIDIA
    Downloadable, Free Endpoint

    nvidia-nemotron-nano-9b-v2

    High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.
    Model
    thinking budget
    988K
    10mo
    Microsoft
    Downloadable, Free Endpoint

    phi-4-mini-instruct

    Lightweight multilingual LLM for AI applications in latency-bound, memory- and compute-constrained environments.
    Model
    Chat
    445K
    1y
    Qwen
    Downloadable, Free Endpoint

    qwen3.5-122b-a10b

    122B MoE LLM (10B active) for coding, reasoning, and multimodal chat. Agent-ready.
    Model
    tool calling
    10.33M
    3mo
    ByteDance
    Free Endpoint

    seed-oss-36b-instruct

    ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.
    Model
    thinking budget
    1.18M
    9mo