
Copyright © 2026 NVIDIA Corporation

11 results for

  • Moonshotai
    Free Endpoint

    kimi-k2-instruct-0905

    Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.
    Model
    long-context
    9.06M
    5mo
    Moonshotai
    Free Endpoint

    kimi-k2-thinking

    Open reasoning model with 256K context window, native INT4 quantization and enhanced tool use.
    Model
    Conversational
    2.88M
    3mo
    Qwen
    Free Endpoint

    qwen3-coder-480b-a35b-instruct

    Excels in agentic coding and browser use and supports 256K context, delivering top results.
    Model
    agentic coding
    3.61M
    6mo
    NVIDIA
    Downloadable

    nemotron-3-nano-30b-a3b

    Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more.
    Model
    chat
    11.55M
    3mo
    NVIDIA
    Downloadable

    nemotron-3-super-120b-a12b

    Open, efficient hybrid Mamba-Transformer MoE with 1M context, excelling in agentic reasoning, coding, planning, tool calling, and more.
    Model
    chat
    1.68M
    5d
    DeepSeek AI
    Free Endpoint

    deepseek-v3.2

    State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.
    Model
    chat
    15.35M
    3mo
    NVIDIA
    Downloadable

    llama-3.2-nv-rerankqa-1b-v2

    Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
    Model
    nemo retriever
    146K
    7mo
    Microsoft
    Free Endpoint

    phi-3-small-128k-instruct

    Cutting-edge, lightweight, long-context open language model excelling in high-quality reasoning.
    Model
    chat
    643K
    9mo
    Qwen
    Downloadable

    qwen3-next-80b-a3b-instruct

    Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.
    Model
    chat
    11.85M
    5mo
    ByteDance
    Free Endpoint

    seed-oss-36b-instruct

    ByteDance open-source LLM with long-context support, reasoning, and agentic capabilities.
    Model
    chat
    3.75M
    6mo
    NVIDIA
    Downloadable

    llama-3.2-nv-embedqa-1b-v2

    Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
    Model
    nemo retriever
    6.2M
    7mo