microsoft/florence-2

PREVIEW

A vision foundation model capable of performing diverse computer vision and vision-language tasks.
