Advanced reasoing MOE mode excelling at reasoning, multilingual tasks, and instruction following
Qwen3-235B-A22B is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction-following, agent capabilities, and multilingual support. It uniquely supports seamless switching between thinking mode (for complex logical reasoning, math, and coding) and non-thinking mode (for efficient, general-purpose dialogue) within a single model, ensuring optimal performance across various scenarios. The model shows significant enhancement in its reasoning capabilities, surpassing previous versions on mathematics, code generation, and commonsense logical reasoning. It also demonstrates superior human preference alignment, excelling in creative writing, role-playing, multi-turn dialogues, and instruction following, to deliver a more natural, engaging, and immersive conversational experience. Furthermore, it has expertise in agent capabilities, enabling precise integration with external tools in both thinking and unthinking modes and achieving leading performance among open-source models in complex agent-based tasks. Qwen3-235B-A22B supports over 100 languages and dialects with strong capabilities for multilingual instruction following and translation.
This model is ready for commercial/non-commercial use.
This model is not owned or developed by NVIDIA. This model has been developed by Qwen (Alibaba Cloud). This model has been developed and built to a third-party's requirements for this application and use case; see link to Non-NVIDIA Qwen3-235B-A22B Model Card.
GOVERNING TERMS: This trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Community Model License. Additional Information: Apache 2.0.
Global
This model is expected to be used for a wide range of tasks including:
Architecture Type: Causal Language Model, Mixture-of-Experts (MoE)
Network Architecture: Qwen3
Input Type(s): Text
Input Format(s): String
Input Parameters: One-Dimensional (1D) for text.
Other Properties Related to Input:
enable_thinking=True
) or "non-thinking mode" (enable_thinking=False
). The tokenizer by default has enable_thinking=True
.Output Type(s): Text
Output Format: String
Output Parameters: One-Dimensional (1D) for text.
Other Properties Related to Output:
enable_thinking=True
.Runtime Engine(s):
Supported Hardware Microarchitecture Compatibility:
[Preferred/Supported] Operating System(s):
Qwen3-235B-A22B v1.0
Data Collection Method by dataset: Undisclosed
Labeling Method by dataset: Undisclosed
Properties: Undisclosed
Data Collection Method by dataset: Undisclosed
Labeling Method by dataset: Undisclosed
Properties: Undisclosed
Data Collection Method by dataset: Undisclosed
Labeling Method by dataset: Undisclosed
Properties: Undisclosed
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to Qwen3 blog, GitHub, and Documentation.
Engine: SGLang
Test Hardware:
enable_thinking=True
) for complex tasks, which enhances response quality through internal reasoning steps. It can be switched to a "non-thinking mode" (enable_thinking=False
) for general dialogue.enable_thinking=True
): Temperature=0.6, TopP=0.95, TopK=20, MinP=0. Greedy decoding is not recommended.enable_thinking=False
): Temperature=0.7, TopP=0.8, TopK=20, MinP=0.presence_penalty
can be adjusted (0 to 2) to reduce repetition.NVIDIA believes Trustworthy Al is a shared responsibility and we have established policies and practices to enable development for a wide array of Al applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns here.