
Copyright © 2026 NVIDIA Corporation

microsoft

kosmos-2

Deprecated | API Endpoint

Groundbreaking multimodal model designed to understand and reason about visual elements in images.

Image Understanding, Multimodal, Visual Question Answering, computer vision, cv, image, Image-to-Text, video, vlm
This NIM has been deprecated