
Qwen-Image-Edit is an image editing model with multilingual text editing and strong subject consistency.
Qwen-Image-Edit is the image editing version of Qwen-Image. Built on the 20B Qwen-Image model, it extends Qwen-Image's unique text rendering capabilities to image editing tasks, enabling precise text editing. It also feeds the input image simultaneously into Qwen2.5-VL (for visual semantic control) and the VAE encoder (for visual appearance control), supporting both semantic and appearance editing.
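The dual-path conditioning described above can be sketched as follows. This is a conceptual illustration only: the function names, tensor shapes, and the toy denoising update are placeholders, not the actual implementation.

```python
import numpy as np

def vlm_encode(image, prompt):
    # Stand-in for Qwen2.5-VL: produces semantic tokens describing
    # *what* is in the image and what the edit instruction asks for.
    return np.zeros((1, 256, 3584))   # (batch, tokens, hidden) -- illustrative shape

def vae_encode(image):
    # Stand-in for the VAE encoder: produces appearance latents
    # capturing *how* the image looks (texture, color, layout).
    return np.zeros((1, 16, 64, 64))  # (batch, channels, h, w) -- illustrative shape

def mmdit_denoise(semantic_cond, appearance_latents, noisy_latents):
    # Stand-in for one MMDiT step: the real model attends to both
    # condition streams; here we just nudge latents toward the
    # appearance reference to show the data flow.
    return noisy_latents - 0.1 * (noisy_latents - appearance_latents)

image, prompt = object(), "replace the sign text with 'OPEN'"
semantic = vlm_encode(image, prompt)     # semantic control path (Qwen2.5-VL)
appearance = vae_encode(image)           # appearance control path (VAE encoder)
latents = np.random.randn(1, 16, 64, 64)
for _ in range(20):                      # simplified denoising loop
    latents = mmdit_denoise(semantic, appearance, latents)
```

Both condition streams are consumed at every denoising step, which is why the model can preserve appearance while following semantic edit instructions.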
Qwen-Image-Edit was developed by the Qwen Team.
This model is ready for commercial/non-commercial use.
These models are not owned or developed by NVIDIA. They have been developed and built to a third-party's requirements for this application and use case; see the links below.
GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service; and use of this model is governed by the NVIDIA Open Model License. Additional Information: Apache 2.0 license.
Global
Qwen-Image adopts an architecture with three core modules:
| Component | Parameter Count |
|---|---|
| Qwen2.5-VL (VLM) | 7B |
| VAE (Enc/Dec) | 54M / 73M |
| MMDiT | 20B |
| Total | ~27.1B |
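As a quick sanity check, the per-component counts in the table sum to the stated total (values are the approximate figures from the table, not exact parameter counts):

```python
# Approximate component parameter counts from the table above
components = {
    "Qwen2.5-VL (VLM)": 7.0e9,
    "VAE encoder": 54e6,
    "VAE decoder": 73e6,
    "MMDiT": 20.0e9,
}

total = sum(components.values())
print(f"Total: ~{total / 1e9:.1f}B")  # ~27.1B
```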
[Text, Image]
[Image]
Raster image formats (e.g., png, jpg, jpeg) via VAE decoding.
Two-Dimensional (2D), with configurable resolution (supports aspect ratios: 1:1, 16:9, 9:16, 4:3, 3:4, 3:2, 2:3).
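When preparing inputs, it can help to snap an arbitrary image to the closest supported aspect ratio. A minimal helper sketch (the function name and selection logic are illustrative, not part of the model's API):

```python
# Supported aspect ratios listed in this model card
SUPPORTED_RATIOS = ["1:1", "16:9", "9:16", "4:3", "3:4", "3:2", "2:3"]

def nearest_supported_ratio(width: int, height: int) -> str:
    """Return the supported aspect ratio closest to width/height."""
    target = width / height

    def ratio_value(ratio: str) -> float:
        w, h = map(int, ratio.split(":"))
        return w / h

    return min(SUPPORTED_RATIOS, key=lambda r: abs(ratio_value(r) - target))

print(nearest_supported_ratio(1920, 1080))  # 16:9
print(nearest_supported_ratio(1000, 1000))  # 1:1
```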
Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g., GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
Runtime Engines:
Supported Hardware Microarchitecture Compatibility:
Supported Operating Systems:
The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.
This model can generate synthetic images and may produce content that is inaccurate, offensive, or otherwise inappropriate. Users should implement robust safety guardrails, including content filtering, abuse monitoring, and access controls, to reduce the risk of harmful outputs. Users are responsible for ensuring that their use of the model complies with all applicable laws and regulations, and for regularly reviewing and updating their guardrails as risks evolve.
For more information about the implementation of Cosmos pre and post guardrails to improve model safety, please see the Cosmos-1.0 Guardrail Model.
Engine: SGLang Diffusion
Test Hardware: H100
NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal developer team to ensure these software components meet requirements for the relevant industry and use case and address unforeseen product misuse.
Please make sure you have proper rights and permissions for all input image and video content. If an input image or video includes people, personal health information, or intellectual property, the generated output will not blur those subjects or maintain their proportions.
Users are responsible for model inputs and outputs. Users are responsible for ensuring safe integration of this model, including implementing guardrails as well as other safety mechanisms, prior to deployment.
Please report security vulnerabilities or NVIDIA AI Concerns here.
Deploying and integrating the NIM is straightforward thanks to our industry-standard APIs. Visit the Visual Generative AI NIM page for release documentation, deployment guides, and more.
Get access to knowledge base articles and support cases or submit a ticket.