---
title: "qwen3.5-397b-a17b"
publisher: "qwen"
type: "endpoint"
updated: "2026-02-16T17:41:54.519Z"
description: "Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities."
canonical: "https://build.nvidia.com/qwen/qwen3.5-397b-a17b"
---

# Qwen3.5-397B-A17B

## Description
Qwen3.5-397B-A17B is a multimodal foundation model featuring a Hybrid Mixture-of-Experts architecture with early fusion vision-language training, designed for state-of-the-art performance across chat, retrieval-augmented generation, vision-language understanding, video understanding, and agentic workflows. Key highlights include:

- **Unified Vision-Language Foundation:** Early fusion training on multimodal tokens achieves cross-generational parity with prior text-only models and outperforms dedicated vision-language models across reasoning, coding, agents, and visual understanding benchmarks.
- **Efficient Hybrid Architecture:** Gated Delta Networks combined with sparse Mixture-of-Experts deliver high-throughput inference with minimal latency and cost overhead.
- **Scalable RL Generalization:** Reinforcement learning scaled across million-agent environments with progressively complex task distributions for robust real-world adaptability.
- **Global Linguistic Coverage:** Expanded support to 201 languages and dialects, enabling inclusive, worldwide deployment with nuanced cultural and regional understanding.
- **Thinking Mode:** Operates in thinking mode by default, generating internal reasoning content (`<think>...</think>`) before producing final responses, with an option to disable for direct responses.

*This model is ready for commercial/non-commercial use.*

## Third-Party Community Consideration:
This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party's requirements for this application and use case; see link to Non-NVIDIA [Qwen3.5-397B-A17B Model Card](https://huggingface.co/Qwen/Qwen3.5-397B-A17B)

## License and Terms of Use:
**GOVERNING TERMS:** This trial service is governed by the [NVIDIA API Trial Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf). Use of this model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/). Additional Information: [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0.txt).

## Deployment Geography:
Global

## Use Case:
**Use Case:** Designed for developers and enterprises building multimodal AI applications including conversational chat, retrieval-augmented generation (RAG), vision-language understanding and reasoning, tool/function calling, and agentic workflows.

## Release Date:
**Build.NVIDIA.com:** 02/16/2026 via [link](https://build.nvidia.com/qwen/qwen3.5-397b-a17b)  
**Huggingface:** 02/16/2026 via [link](https://huggingface.co/Qwen/Qwen3.5-397B-A17B)

## Model Architecture:
**Architecture Type:** Transformer (Causal Language Model with Vision Encoder)  
**Network Architecture:** Qwen3.5 (Hybrid Mixture-of-Experts with Gated DeltaNet)  
**Total Parameters:** 397B  
**Active Parameters:** 17B  
**Vocabulary Size:** 248,320  
**Number of Layers:** 60  
**Hidden Dimension:** 4,096  
**Hidden Layout:** 15 × (3 × (Gated DeltaNet → MoE) → 1 × (Gated Attention → MoE))  
**Gated DeltaNet:** 64 linear attention heads for V, 16 for QK; head dimension 128  
**Gated Attention:** 32 heads for Q, 2 for KV; head dimension 256; RoPE dimension 64  
**Mixture-of-Experts:** 512 total experts; 10 routed + 1 shared activated per token; expert intermediate dimension 1,024  
**Multi-Token Prediction (MTP):** Trained with multi-steps

### Input:
**Input Types:** Text, Image, Video  
**Input Formats:** String, Red, Green, Blue (RGB), Video (MP4/WebM)  
**Input Parameters:** One-Dimensional (1D), Two-Dimensional (2D), Three-Dimensional (3D)  
**Other Input Properties:** Supports text, image, and video inputs via a unified multimodal architecture with early fusion training on multimodal tokens. Combines a Vision Transformer (ViT) encoder with a Hybrid MoE language model featuring Gated DeltaNet layers with global attention and fine-grained MoE routing (10 routed + 1 shared expert out of 512 total).  
**Input Context Length (ISL):** 262,144 tokens natively; extensible up to 1,010,000 tokens via YaRN RoPE scaling

### Output:
**Output Types:** Text  
**Output Format:** String  
**Output Parameters:** One-Dimensional (1D)  
**Other Output Properties:** Generates text responses based on multimodal inputs. Operates in thinking mode by default (internal reasoning via `<think>...</think>` tags before final response; can be disabled). Natively supports tool/function calling, agentic workflows (Qwen-Agent, MCP servers), and multi-turn conversations.  
**Output Context Length (OSL):** Recommended 32,768 tokens for most queries; up to 81,920 tokens for complex reasoning tasks

__Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.__

## Software Integration:
**Runtime Engines:**
- **SGLang:** Primary optimized inference engine (supports MTP, tool calling, reasoning parser)
- **vLLM:** Full support (supports MTP, tool calling, text-only mode, reasoning parser)
- **Hugging Face Transformers:** Lightweight serving support (continuous batching)

**Supported Hardware:**
- **NVIDIA Ampere:** A100
- **NVIDIA Blackwell:** B100, B200, GB200
- **NVIDIA Hopper:** H100, H200

**Preferred/Supported Operating Systems:** Linux

__The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.__

## Model Version(s)
Qwen3.5-397B-A17B v1.0

## Training, Testing, and Evaluation Datasets:

### Training Dataset
**Data Modality:** Image, Text, Video  
**Training Data Collection:** Undisclosed, Automated  
**Training Labeling:** Automated  
**Training Properties:** Trillions of multimodal tokens across image, text, and video domains, covering 201 languages and dialects. Specific dataset names and data licenses are not disclosed.

### Testing Dataset
**Testing Data Collection:** Undisclosed  
**Testing Labeling:** Undisclosed  
**Testing Properties:** Undisclosed

### Evaluation Dataset
**Evaluation Data Collection:** Public benchmark datasets as listed below  
**Evaluation Labeling:** Ground-truth labels from established benchmark suites  
**Evaluation Properties:** Evaluated in thinking mode with recommended sampling parameters (temperature=0.6, top_p=0.95, top_k=20)

<details>
<summary><b>Evaluation Benchmark Scores (Language)</b></summary>

<div style="font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;max-width:900px;margin:0 auto;padding:16px 0">
<table style="width:100%;border-collapse:collapse;font-size:13px">
<thead><tr>
<th style="padding:10px 12px;text-align:left;font-weight:600;border-bottom:2px solid #7c3aed;color:#7c3aed"></th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">GPT5.2</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">Claude 4.5 Opus</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">Gemini-3 Pro</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">Qwen3-Max-Thinking</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">K2.5-1T-A32B</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">Qwen3.5-397B-A17B</th>
</tr></thead>
<tbody>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Knowledge</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMLU-Pro</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.8</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMLU-Redux</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">95.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">95.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">95.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.9</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">SuperGPQA</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">74.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">69.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.4</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">C-Eval</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.0</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Instruction Following</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">IFEval</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">IFBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">75.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">58.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.5</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MultiChallenge</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">57.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">54.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">64.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">63.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">62.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.6</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Long Context</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">AA-LCR</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">72.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">74.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">LongBench v2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">54.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">64.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">60.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">61.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">63.2</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">STEM</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">GPQA</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.4</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">HLE</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">35.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">30.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">37.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">30.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">30.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">28.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">HLE-Verified&#185;</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">43.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">38.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">48</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">37.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">37.6</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Reasoning</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">LiveCodeBench v6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">HMMT Feb 25</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">99.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">97.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">98.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">95.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.8</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">HMMT Nov 25</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">100</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">IMOAnswerBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.9</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">AIME26</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">96.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.3</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">General Agent</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">BFCL-V4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">63.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">72.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">72.9</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">TAU2-Bench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">VITA-Bench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">38.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">56.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">51.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">40.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">41.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">49.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">DeepPlanning</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">44.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">33.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">23.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">28.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">14.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">34.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Tool Decathlon</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">43.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">43.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">36.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">18.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">27.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">38.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MCP-Mark</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">57.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">42.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">53.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">33.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">29.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">46.1</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Search Agent&#179;</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">HLE w/ tool</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">45.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">43.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">45.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">49.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">50.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">48.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">BrowseComp</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">59.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">53.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--/74.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">69.0/78.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">BrowseComp-zh</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">62.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">66.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">60.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">WideSearch</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">57.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">72.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">74.0</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Seal-0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">45.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">47.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">45.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">46.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">57.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">46.9</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Multilingualism</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMMLU</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.5</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMLU-ProX</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">82.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">NOVA-63</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">54.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">56.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">56.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">54.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">56.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">59.1</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">INCLUDE</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">82.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Global PIQA</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.8</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">PolyMATH</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">62.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">64.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">43.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">73.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">WMT24++</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.9</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MAXIFE</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">72.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.2</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Coding Agent</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">SWE-bench Verified</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">75.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.4</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">SWE-bench Multilingual</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">72.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">66.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">73.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">69.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">SecCodeBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">62.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">57.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">61.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Terminal Bench 2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">54.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">59.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">54.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">22.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">50.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">52.5</td>
</tr>
</tbody>
</table>

<p style="margin-top:12px;font-size:11px;opacity:0.7">
* HLE-Verified: a verified and revised version of Humanity's Last Exam (HLE), accompanied by a transparent, component-wise verification protocol and a fine-grained error taxonomy. We open-source the dataset at https://huggingface.co/datasets/skylenage/HLE-Verified.<br>
* TAU2-Bench: we follow the official setup except for the airline domain, where all models are evaluated by applying the fixes proposed in the Claude Opus 4.5 system card.<br>
* MCPMark: GitHub MCP server uses v0.30.3 from api.githubcopilot.com; Playwright tool responses are truncated at 32k tokens.<br>
* Search Agent: most search agents built on our model adopt a simple context-folding strategy(256k): once the cumulative Tool Response length reaches a preset threshold, earlier Tool Responses are pruned from the history to keep the context within limits.<br>
* BrowseComp: we tested two strategies, simple context-folding achieved a score of 69.0, while using the same discard-all strategy as DeepSeek-V3.2 and Kimi K2.5 achieved 78.6.<br>
* WideSearch: we use a 256k context window without any context management.<br>
* MMLU-ProX: we report the averaged accuracy on 29 languages.<br>
* WMT24++: a harder subset of WMT24 after difficulty labeling and rebalancing; we report the averaged scores on 55 languages using XCOMET-XXL.<br>
* MAXIFE: we report the accuracy on English + multilingual original prompts (totally 23 settings).<br>
* Empty cells (--) indicate scores not yet available or not applicable.<br>
</p>

</div>

</details>

<details>
<summary><b>Evaluation Benchmark Scores (Vision-Language)</b></summary>

<div style="font-family:-apple-system,BlinkMacSystemFont,'Segoe UI',Roboto,sans-serif;max-width:900px;margin:0 auto;padding:16px 0">
<table style="width:100%;border-collapse:collapse;font-size:13px">
<thead><tr>
<th style="padding:10px 12px;text-align:left;font-weight:600;border-bottom:2px solid #7c3aed;color:#7c3aed"></th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">GPT5.2</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">Claude 4.5 Opus</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">Gemini-3 Pro</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">Qwen3-VL-235B-A22B</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">K2.5-1T-A32B</th>
<th style="padding:10px 12px;text-align:center;font-weight:500;border-bottom:2px solid #7c3aed;color:#7c3aed;font-size: 14px;">Qwen3.5-397B-A17B</th>
</tr></thead>
<tbody>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">STEM and Puzzle</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMMU</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.0</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMMU-Pro</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">69.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.0</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MathVision</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">74.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">74.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Mathvista(mini)</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">We-Math</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">74.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.9</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">DynaMath</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">82.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">ZEROBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">10</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">12</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">ZEROBench_sub</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">33.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">28.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">39.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">28.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">33.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">41.0</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">BabyVision</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">34.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">14.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">49.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">22.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">36.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">52.3/43.3</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">General VQA</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">RealWorldQA</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.9</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMStar</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">73.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.8</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">HallusionBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">64.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">66.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">69.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">71.4</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMBench<sub><small>EN-DEV-v1.1</small></sub></td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">SimpleVQA</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">55.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">73.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">61.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">71.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.1</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Text Recognition and Document Understanding</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">OmniDocBench1.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.8</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">CharXiv(RQ)</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">82.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">66.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.8</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMLongBench-Doc</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">61.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">60.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">56.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">58.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">61.5</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">CC-OCR</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">82.0</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">AI2D_TEST</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">89.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.9</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">OCRBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.1</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Spatial Intelligence</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">ERQA</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">59.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">46.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">52.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.5</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">CountBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">90.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">97.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">93.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">94.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">97.2</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">RefCOCO(avg)</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">91.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">92.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">ODInW13</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">46.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">43.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">47.0</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">EmbSpatialBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">75.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">61.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.5</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">RefSpatialBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">69.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">73.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">LingoQA</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">72.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">66.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">68.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">V*</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">75.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">95.8/91.1</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Hypersim</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">11.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">12.5</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">SUNRGBD</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">34.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">38.3</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">Nuscene</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">13.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">16.0</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Video Understanding</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">VideoMME<sub><small>(w sub.)</small></sub></td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">88.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.5</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">VideoMME<sub><small>(w/o sub.)</small></sub></td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">VideoMMMU</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">87.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">84.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MLVU (M-Avg)</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">83.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">85.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">86.7</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MVBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">78.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">67.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">74.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">75.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">73.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">LVBench</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">73.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">57.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">63.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">75.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">75.5</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MMVU</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.8</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">77.5</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">71.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">80.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">75.4</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Visual Agent</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">ScreenSpot Pro</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">45.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">72.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">62.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.6</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">OSWorld-Verified</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">38.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">66.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">38.1</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">63.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">62.2</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">AndroidWorld</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">63.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">--</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">66.8</td>
</tr>
<tr><td colspan="7" style="padding:8px 12px;font-weight:600;color:#7c3aed;border-bottom:1px solid rgba(124, 58, 237, 0.2);background:rgba(124, 58, 237, 0.1)">Medical VQA</td></tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">SLAKE</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.4</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">54.7</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">81.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">79.9</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">PMC-VQA</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">58.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">59.9</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">62.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">41.2</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">63.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">64.2</td>
</tr>
<tr>
<td style="padding:7px 12px;padding-left:20px;border-bottom:1px solid rgba(128, 128, 128, 0.15);">MedXpertQA-MM</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">73.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">63.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">76.0</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">47.6</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">65.3</td>
<td style="padding:7px 12px;text-align:center;border-bottom:1px solid rgba(128, 128, 128, 0.15)">70.0</td>
</tr>
</tbody>
</table>

<p style="margin-top:12px;font-size:11px;opacity:0.7">
* MathVision: our model's score is evaluated using a fixed prompt, e.g., "Please reason step by step, and put your final answer within \boxed{}." For other models, we report the higher score between runs with and without the \boxed{} formatting.<br>
* BabyVision: our model's score is reported with CI (Code Interpreter) enabled; without CI, the result is 43.3.<br>
* V*: our model's score is reported with CI (Code Interpreter) enabled; without CI, the result is 91.1.<br>
* Empty cells (--) indicate scores not yet available or not applicable.<br>
</p>

</div>

</details>

## Best Practices

To achieve optimal performance, Qwen recommends the following settings:

1. **Sampling Parameters**:
- Qwen suggests using `Temperature=0.6`, `TopP=0.95`, `TopK=20`, and `MinP=0` for thinking mode and using `Temperature=0.7`, `TopP=0.8`, `TopK=20`, and `MinP=0` for non-thinking mode.
- For supported frameworks, you can adjust the `presence_penalty` parameter between 0 and 2 to reduce endless repetitions. However, using a higher value may occasionally result in language mixing and a slight decrease in model performance.

2. **Adequate Output Length**: Qwen recommends using an output length of 32,768 tokens for most queries. For benchmarking on highly complex problems, such as those found in math and programming competitions, Qwen suggests setting the max output length to 81,920 tokens. This provides the model with sufficient space to generate detailed and comprehensive responses, thereby enhancing its overall performance.

3. **Standardize Output Format**: Qwen recommends using prompts to standardize model outputs when benchmarking.
- **Math Problems**: Include "Please reason step by step, and put your final answer within \boxed{}." in the prompt.
- **Multiple-Choice Questions**: Add the following JSON structure to the prompt to standardize responses: "Please show your choice in the `answer` field with only the choice letter, e.g., `"answer": "C"`."

4. **No Thinking Content in History**: In multi-turn conversations, the historical model output should only include the final output part and does not need to include the thinking content. It is implemented in the provided chat template in Jinja2. However, for frameworks that do not directly use the Jinja2 chat template, it is up to the developers to ensure that the best practice is followed.

5. **Long Video Understanding**: To optimize inference efficiency for plain text and images, the `size` parameter in the released `video_preprocessor_config.json` is conservatively configured. It is recommended to set the `longest_edge` parameter in the video_preprocessor_config file to 469,762,048 (corresponding to 224k video tokens) to enable higher frame-rate sampling for hour-scale videos and thereby achieve superior performance. For example,
```json
{"longest_edge": 469762048, "shortest_edge": 4096}
```

Alternatively, override the default values via engine startup parameters. For implementation details, refer to: [vLLM](https://github.com/vllm-project/vllm/pull/34330) / [SGLang](https://github.com/sgl-project/sglang/pull/18467).

## Inference
**Acceleration Engine:** SGLang  
**Test Hardware:** NVIDIA B200 GPU  
**Recommended Serving Configuration:** TP=8, context length 262,144, reasoning parser qwen3; supports Multi-Token Prediction (MTP) for improved throughput

## Ethical Considerations
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please make sure you have proper rights and permissions for all input image and video content; if image or video includes people, personal health information, or intellectual property, the image or video generated will not blur or maintain proportions of image subjects included.

Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

## Prototype

```python
import requests

invoke_url = "https://integrate.api.nvidia.com/v1/chat/completions"

headers = {
"Authorization": "Bearer ",
"Accept": "application/json",
}

payload = {
"messages": [
{
"role": "user",
"content": ""
}
]
}

# re-use connections
session = requests.Session()

response = session.post(invoke_url, headers=headers, json=payload)

response.raise_for_status()
response_body = response.json()
print(response_body)
```

```python
from langchain_nvidia_ai_endpoints import ChatNVIDIA

client = ChatNVIDIA(
model="",
api_key="$NVIDIA_API_KEY",
temperature=,
top_p=,
max_completion_tokens=,
)

lc_messages = [{"role":"user","content":""}]

response = client.invoke(lc_messages)
if response.additional_kwargs and "reasoning_content" in response.additional_kwargs:
print(response.additional_kwargs["reasoning_content"])
print(response.content)
```

```javascript
import fetch from "node-fetch";

const invokeUrl = "https://integrate.api.nvidia.com/v1/chat/completions"

const headers = {
"Authorization": "Bearer ",
"Accept": "application/json",
}

const payload = {
"messages": [
{
"role": "user",
"content": ""
}
]
}

let response = await fetch(invokeUrl, {
method: "post",
body: JSON.stringify(payload),
headers: { "Content-Type": "application/json", ...headers }
});

let response_body = await response.json()

console.log(JSON.stringify(response_body))
```

```bash
invoke_url='https://integrate.api.nvidia.com/v1/chat/completions'

authorization_header='Authorization: Bearer '
accept_header='Accept: application/json'
content_type_header='Content-Type: application/json'

data=$'{
"messages": [
{
"role": "user",
"content": ""
}
]
}'

response=$(curl --silent -i -w "\n%{http_code}" --request POST \
--url "$invoke_url" \
--header "$authorization_header" \
--header "$accept_header" \
--header "$content_type_header" \
--data "$data"
)

echo "$response"
```