Excels in agentic coding and browser use, supports a 256K-token context, and delivers top results.

Qwen3-Coder-480B-A35B-Instruct is a state-of-the-art large language model designed specifically for code generation and agentic coding tasks. It is a mixture-of-experts (MoE) model with 480B total parameters and 35B activated parameters, featuring native support for a 262,144-token context length that is extendable up to 1M tokens with YaRN.
The model achieves leading results among open models on Agentic Coding, Agentic Browser-Use, and other foundational coding tasks, comparable to Claude Sonnet. It supports function calling and tool choice, making it well suited to complex coding workflows and agentic applications.
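As a quick illustration of the function calling and tool choice support mentioned above, the following is a minimal sketch that calls the model through an OpenAI-compatible endpoint with a single tool defined. The base URL, model identifier, and tool schema are assumptions for illustration, not values taken from this card.

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",  # assumed endpoint
    api_key="YOUR_API_KEY",                          # placeholder
)

tools = [{
    "type": "function",
    "function": {
        "name": "run_tests",  # hypothetical tool, not part of this card
        "description": "Run the project's unit test suite and report the results.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Test file or directory"}},
            "required": ["path"],
        },
    },
}]

response = client.chat.completions.create(
    model="qwen/qwen3-coder-480b-a35b-instruct",  # assumed model identifier
    messages=[{"role": "user", "content": "Run the tests under tests/ and summarize any failures."}],
    tools=tools,
    tool_choice="auto",
)

# The reply is either normal text or one or more tool calls to execute.
message = response.choices[0].message
print(message.tool_calls or message.content)
```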
This model is ready for commercial use.
GOVERNING TERMS: This trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Community Model License. Additional Information: Apache 2.0.
Deployment Geography: Global
Release Date: 08/22/2025
Build.NVIDIA.com: Available via link
This model is not owned or developed by NVIDIA. This model has been developed by Qwen (Alibaba Cloud). This model has been developed and built to a third party's requirements for this application and use case; see link to Qwen3-Coder-480B-A35B-Instruct.
Architecture Type: mixture-of-experts (MoE) with Sparse Activation
Network Architecture: Qwen3MoeForCausalLM (Transformer-based decoder-only)
Parameter Count: 480B total parameters with 35B activated parameters
Expert Configuration: 160 experts with 8 activated per forward pass
Attention Mechanism: Grouped Query Attention (GQA) with 96 query heads and 8 KV heads
Number of Layers: 62
Hidden Size: 6144
Head Dimension: 128
Intermediate Size: 8192
MoE Intermediate Size: 2560
Context Length: 262,144 tokens (native), extendable to 1M with YaRN (see the sketch after this list)
Vocabulary Size: 151,936
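As noted in the Context Length entry above, the context window can be extended beyond the native 262,144 tokens with YaRN. Below is a minimal sketch of one way to do this by writing a rope_scaling block into the checkpoint's config.json; the local path, scaling factor, and original_max_position_embeddings values are illustrative assumptions, so consult the upstream Qwen3-Coder documentation for the recommended settings.

```python
import json
from pathlib import Path

# Path to a locally downloaded checkpoint (hypothetical location).
cfg_path = Path("Qwen3-Coder-480B-A35B-Instruct/config.json")
cfg = json.loads(cfg_path.read_text())

# Add a YaRN rope-scaling block; factor and original_max_position_embeddings
# below are illustrative assumptions (262,144 x 4 ≈ 1M tokens), not values
# taken from this card.
cfg["rope_scaling"] = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144,
}
cfg["max_position_embeddings"] = 1048576  # assumed 1M-token target

cfg_path.write_text(json.dumps(cfg, indent=2))
```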
Input Type(s): Text, Code, Function calls
Input Format(s): Natural language prompts, code snippets, structured function calls
Input Parameters: One-Dimensional (1D)
Output Type(s): Text, Code, Function responses
Output Format(s): Natural language responses, code generation, structured function outputs
Output Parameters: One-Dimensional (1D)
Other Properties Related to Output: Non-thinking mode only (the model does not generate <think></think> blocks)
Runtime Engine: vLLM, Transformers (4.51.0+)
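For local experimentation with the Transformers runtime listed above, a minimal generation sketch is shown below; the Hugging Face checkpoint id is assumed to be the upstream Qwen repository, and the prompt is only an example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-Coder-480B-A35B-Instruct"  # assumed upstream checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

# Build a chat prompt with the model's chat template.
messages = [{"role": "user", "content": "Write a quicksort implementation in Python."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Generate and decode only the newly produced tokens.
generated = model.generate(**inputs, max_new_tokens=512)
new_tokens = generated[0][inputs.input_ids.shape[1]:]
print(tokenizer.decode(new_tokens, skip_special_tokens=True))
```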
Supported Hardware Platform(s): NVIDIA Hopper
Supported Operating System(s): Linux
Data Type: FP8
Data Modality: Text
Model Version: v1.0
The model achieves leading results among open models on Agentic Coding, Agentic Browser-Use, and other foundational coding tasks.
Acceleration Engine: vLLM
Test Hardware: NVIDIA Hopper
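A minimal offline-inference sketch with vLLM on a Hopper node follows; the FP8 checkpoint id, tensor-parallel degree, context length, and sampling values are assumptions and should be adjusted to the actual deployment.

```python
from vllm import LLM, SamplingParams

# Engine construction; model id and parallelism degree are assumptions.
llm = LLM(
    model="Qwen/Qwen3-Coder-480B-A35B-Instruct-FP8",  # assumed FP8 checkpoint id
    tensor_parallel_size=8,                            # assumed single multi-GPU Hopper node
    max_model_len=32768,                               # shortened to fit available KV-cache memory
)

params = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=1024)
messages = [{"role": "user", "content": "Write a Python function that merges two sorted lists."}]

# llm.chat applies the model's chat template before generation.
outputs = llm.chat(messages, params)
print(outputs[0].outputs[0].text)
```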
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.
Please report security vulnerabilities or NVIDIA AI Concerns here.