NVIDIA
Copyright © 2025 NVIDIA Corporation

microsoft

florence-2

PREVIEW

Vision foundation model capable of performing diverse computer vision and vision language tasks.

language generation, multimodal, vision assistant, visual question answering, computer vision, cv, image, image classification, image-to-text, object detection, text-to-image, vlm
Accelerated by DGX Cloud

GOVERNING TERMS: Your use of this API is governed by the NVIDIA API Trial Service Terms of Use, and your use of this model is governed by the MIT License.