qwen / qwen3-coder-480b-a35b-instruct

Excels at agentic coding and browser use, supports a 256K-token context, and delivers top results among open models.

Tags: agentic coding, browser use, long context, MoE

Qwen3-Coder-480B-A35B-Instruct

Model Overview

Description:

Qwen3-Coder-480B-A35B-Instruct is a state-of-the-art large language model designed for code generation and agentic coding tasks. It is a mixture-of-experts (MoE) model with 480B total parameters and 35B activated parameters, with a native context length of 262,144 tokens that can be extended to 1M tokens using YaRN.

Among open models, it achieves leading results on Agentic Coding, Agentic Browser-Use, and other foundational coding tasks, comparable to Claude Sonnet. It supports function calling and tool choice, making it well suited to complex coding workflows and agentic applications.
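
A minimal sketch of calling the hosted model through an OpenAI-compatible client. The base URL https://integrate.api.nvidia.com/v1 and the NVIDIA_API_KEY environment variable are assumptions based on the usual build.nvidia.com conventions, not values stated on this page:

```python
import os
from openai import OpenAI

# Assumed build.nvidia.com endpoint and API-key environment variable.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

resp = client.chat.completions.create(
    model="qwen/qwen3-coder-480b-a35b-instruct",  # model ID as listed in this catalog
    messages=[{"role": "user", "content": "Write a Python function that merges two sorted lists."}],
    max_tokens=512,
)
print(resp.choices[0].message.content)
```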

This model is ready for commercial use.

License/Terms of Use

GOVERNING TERMS: This trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Community Model License. Additional Information: Apache 2.0.

Deployment Geography

Global

Use Cases

  • Code Generation: Generate high-quality code from natural language descriptions
  • Agentic Coding: Execute complex coding workflows with function calling
  • Repository Understanding: Process large codebases with long-context capabilities
  • Tool Integration: Interface with development tools and APIs
  • Code Review and Analysis: Analyze and improve existing code
  • Documentation Generation: Create code documentation and comments
  • Browser Automation: Agentic browser-use scenarios
  • Function Calling: Structured tool execution and API integration (see the sketch after this list)
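
A hedged sketch of the function-calling use case, reusing the assumed endpoint from the overview; the get_file_contents tool and its schema are hypothetical and exist only to illustrate the OpenAI-style tools format:

```python
import os
from openai import OpenAI

client = OpenAI(base_url="https://integrate.api.nvidia.com/v1",  # assumed endpoint
                api_key=os.environ["NVIDIA_API_KEY"])

# Hypothetical tool, shown only to illustrate the schema.
tools = [{
    "type": "function",
    "function": {
        "name": "get_file_contents",
        "description": "Read a file from the repository.",
        "parameters": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
}]

resp = client.chat.completions.create(
    model="qwen/qwen3-coder-480b-a35b-instruct",
    messages=[{"role": "user", "content": "Summarize what src/main.py does."}],
    tools=tools,
    tool_choice="auto",  # let the model decide whether to call the tool
)
print(resp.choices[0].message.tool_calls)
```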

Release Information

Release Date: 08/22/2025
Build.NVIDIA.com: Available

Third-Party Community Consideration

This model was not developed by NVIDIA; it was developed by Qwen (Alibaba Cloud) and built to a third party's requirements for this application and use case. See the Qwen3-Coder-480B-A35B-Instruct model page for more information.

References

  • Qwen3-Coder: A Large Language Model for Code Generation
  • Qwen3-Coder GitHub Repository
  • Qwen Documentation
  • Hugging Face Model Page
  • Qwen3 Technical Report (arXiv:2505.09388)

Model Architecture

Architecture Type: Mixture-of-Experts (MoE) with sparse activation
Network Architecture: Qwen3MoeForCausalLM (Transformer-based decoder-only)
Parameter Count: 480B total parameters with 35B activated parameters
Expert Configuration: 160 experts, 8 activated per token
Attention Mechanism: Grouped Query Attention (GQA) with 96 query heads and 8 KV heads
Number of Layers: 62
Hidden Size: 6144
Head Dimension: 128
Intermediate Size: 8192
MoE Intermediate Size: 2560
Context Length: 262,144 tokens (native), extendable to 1M with YaRN
Vocabulary Size: 151,936
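
The YaRN extension noted in the Context Length row can be expressed as a RoPE-scaling override at load time. A sketch assuming Transformers and a scaling factor of 4.0 (4 × 262,144 ≈ 1,048,576 tokens); consult the Qwen documentation for the recommended values:

```python
from transformers import AutoConfig, AutoModelForCausalLM

name = "Qwen/Qwen3-Coder-480B-A35B-Instruct"  # public Hugging Face repository

config = AutoConfig.from_pretrained(name)
# Assumed YaRN settings: factor 4.0 over the native 262,144-token window.
config.rope_scaling = {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144,
}

model = AutoModelForCausalLM.from_pretrained(
    name, config=config, torch_dtype="auto", device_map="auto",
)
```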

Input

Input Type(s): Text, Code, Function calls
Input Format(s): Natural language prompts, code snippets, structured function calls
Input Parameters:

  • Max input length: 262,144 tokens (native), up to 1M with YaRN
  • Support for function calling format
  • Tool choice enabled
  • Trust remote code execution (the trust_remote_code loading flag)
  • Custom tool call parser (qwen3_coder); see the serving sketch after this list
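
The serving-side options above map onto vLLM launch flags. A sketch using flag names from vLLM's CLI; verify them against your installed vLLM version:

```python
import subprocess

# Equivalent to running `vllm serve ...` from a shell.
subprocess.run([
    "vllm", "serve", "Qwen/Qwen3-Coder-480B-A35B-Instruct",
    "--enable-auto-tool-choice",          # tool choice enabled
    "--tool-call-parser", "qwen3_coder",  # custom tool call parser named above
    "--trust-remote-code",                # trust remote code execution
    "--max-model-len", "262144",          # native context length
])
```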

Output

Output Type(s): Text, Code, Function responses
Output Format(s): Natural language responses, code generation, structured function outputs
Output Parameters: One-Dimensional (1D)

  • Max output length: Configurable based on remaining context
  • Function call responses in structured format (see the dispatch sketch below)

Other Properties Related to Output:

  • Non-thinking mode (no <think></think> blocks)
  • Auto tool choice responses
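
In the structured format above, function arguments arrive as JSON-encoded strings inside the assistant message. A small dispatch sketch, assuming an OpenAI-style message object and a caller-supplied registry of Python callables:

```python
import json

def dispatch_tool_calls(message, registry):
    """Run each tool call in an assistant message and return the
    role="tool" messages to append to the conversation."""
    results = []
    for call in message.tool_calls or []:
        fn = registry[call.function.name]
        args = json.loads(call.function.arguments)  # arguments are a JSON string
        results.append({
            "role": "tool",
            "tool_call_id": call.id,
            "content": json.dumps(fn(**args)),
        })
    return results
```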

Software Integration

Runtime Engine: vLLM, Transformers (4.51.0+)
Supported Hardware Platform(s): NVIDIA Hopper
Supported Operating System(s): Linux
Data Type: FP8
Data Modality: Text
Model Version: v1.0
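
For the Transformers runtime listed above (4.51.0+), a minimal generation sketch; the repository name is the model's public Hugging Face ID, and chat formatting uses the tokenizer's built-in template:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "Qwen/Qwen3-Coder-480B-A35B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "Write a quicksort in Python."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

out = model.generate(**inputs, max_new_tokens=512)
# Decode only the newly generated tokens.
print(tokenizer.decode(out[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```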

Training, Testing, and Evaluation Datasets

Training Dataset

  • Data Collection Method by dataset: The model was trained on a diverse dataset including code repositories, documentation, and natural language text related to programming
  • Labeling Method by dataset: Supervised fine-tuning with instruction-following data
  • Properties: Multi-language code support, instruction-following capabilities, function calling training

Testing Dataset

  • Data Collection Method by dataset: Standard benchmarks for code generation and agentic tasks
  • Labeling Method by dataset: Automated evaluation metrics
  • Properties: HumanEval, MBPP, Agentic coding benchmarks

Evaluation Dataset

  • Data Collection Method by dataset: Public benchmarks and custom evaluation sets
  • Labeling Method by dataset: Automated metrics and human evaluation
  • Properties: Code generation quality, function calling accuracy, agentic task performance

Benchmark Results

The model achieves leading results among open models on:

  • Agentic Coding tasks
  • Agentic Browser-Use scenarios
  • Foundational coding benchmarks

On these tasks it reports results comparable to Claude Sonnet.

Inference

Acceleration Engine: vLLM
Test Hardware: NVIDIA Hopper
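
A sketch of offline batch inference with the vLLM engine named above; tensor_parallel_size=8 is an assumption for a multi-GPU Hopper node, not a value taken from this card:

```python
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen3-Coder-480B-A35B-Instruct",
    tensor_parallel_size=8,   # assumption: shard across 8 GPUs
    max_model_len=262144,     # native context length
)
params = SamplingParams(temperature=0.7, top_p=0.8, max_tokens=512)
outputs = llm.generate(["Write a Python function to parse a CSV header."], params)
print(outputs[0].outputs[0].text)
```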

Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report security vulnerabilities or NVIDIA AI Concerns here.