NVIDIA
Copyright © 2025 NVIDIA Corporation

microsoft

phi-4-multimodal-instruct

PREVIEW

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

chart and table understanding, language generation, speech recognition, visual QA, image-to-text
Accelerated by DGX Cloud

GOVERNING TERMS: The trial service is governed by the NVIDIA API Trial Terms of Service; the use of this model is governed by the NVIDIA Community Model License. Additional Information: MIT License.

Using free API for development