NVIDIA

Copyright © 2026 NVIDIA Corporation

10 results for

    DeepSeek AI

    deepseek-v3.2

    State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.
    Model
    long context
    15.64M
    2mo
    Moonshotai

    kimi-k2-instruct-0905

    Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities.
    Model
    long-context
    10.04M
    5mo
    Moonshotai

    kimi-k2-thinking

    Open reasoning model with 256K context window, native INT4 quantization and enhanced tool use.
    Model
    Conversational
    3.22M
    3mo
    Qwen

    qwen3-coder-480b-a35b-instruct

    Excels in agentic coding and browser use and supports 256K context, delivering top results.
    Model
    agentic coding
    3.83M
    6mo
    NVIDIA

    nemotron-3-nano-30b-a3b

    Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more.
    Model
    MoE
    12.32M
    2mo
    NVIDIA

    llama-3.2-nv-rerankqa-1b-v2

    Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
    Model
    nemo retriever
    151K
    7mo
    Microsoft

    phi-3-small-128k-instruct

    Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.
    Model
    chat
    590K
    9mo
    Qwen

    qwen3-next-80b-a3b-instruct

    Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.
    Model
    chat
    11.15M
    5mo
    ByteDance

    seed-oss-36b-instruct

    ByteDance open-source LLM with long-context support, reasoning, and agentic intelligence.
    Model
    thinking budget
    3.46M
    6mo
    NVIDIA

    llama-3.2-nv-embedqa-1b-v2

    Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
    Model
    nemo retriever
    6.63M
    7mo