Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Label filter: Multimodal (5 models)

Thinkingmachines

Downloadable, Free Endpoint

inkling

Inkling is a multimodal (text + image) reasoning model from Thinking Machines — a Mamba-hybrid, 256-expert Mixture-of-Experts architecture with tool use and switchable reasoning.

text-to-text

reasoning image-to-text multimodal

Last updated on July 16, 2026

Moonshotai

Downloadable, Free Endpoint

kimi-k2.6

1T multimodal MoE for long-horizon coding, agentic tool use, and image/video understanding.

Multimodal

Mixture-of-Experts Reasoning Image-to-Text

16M API calls in the last 30 days

Last updated on May 1, 2026

Mistral AI

Downloadable

mistral-large-3-675b-instruct-2512

A state-of-the-art general-purpose MoE VLM ideal for chat, agentic, and instruction-based use cases.

language generation

multimodal agentic Image-to-Text

2M API calls in the last 30 days

Last updated on December 2, 2025

Mistral AI

Downloadable, Free Endpoint

ministral-14b-instruct-2512

A general-purpose VLM ideal for chat and instruction-based use cases.

language generation

SLM multimodal Image-to-Text

4M API calls in the last 30 days

Last updated on December 2, 2025

Meta

llama-guard-4-12b

Multimodal model that classifies the safety of input prompts as well as output responses.

LLM Multimodal Safety

Content Safety Guardrail Content Moderator

357K API calls in the last 30 days

Last updated on July 1, 2025