Explore

Models

Skills

Blueprints

GPUs

Docs

Your Privacy Choices

Contact

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA Launch from Hugging FaceBeta

Filters (1)

Free Endpoint

Partner Endpoint

Download Available

Use Case

Image-to-Text

Drug Discovery

Retrieval Augmented Generation

Speech-to-Text

Code Generation

Inference Providers

Deepinfra

OpenRouter

Together AI

GMI Cloud

Bitdeer

Publisher

Mistral AI

Google

Qwen

Stepfun ai

NVIDIA

NIM Container GPUs

B200

H200

L40S

H100 80GB HBM3

A100 SXM4 80GB

Labels (1)

Agentic

5 models

Sort By

Mistral AI

DownloadableFree Endpoint

mistral-medium-3.5-128b

A high performing model for text generation, coding and agentic use cases

coding

2mo

Items per page

of 1 pages

Google

DownloadableFree Endpoint

gemma-4-31b-it

Dense 31B model delivering frontier reasoning for coding, agentic workflows, and fine-tuning.

reasoning

3mo

Qwen

DownloadableFree Endpoint

qwen3.5-397b-a17b

Next-gen Qwen 3.5 VLM (400B MoE) brings advanced vision, chat, RAG, and agentic capabilities.

MoE

13M

4mo

Stepfun-ai

Free Endpoint

step-3.5-flash

200B open-source reasoning engine with sparse MoE powering frontier agentic AI.

Agentic

10M

5mo

Mistral AI

Free Endpoint

mistral-large-3-675b-instruct-2512

A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.

language generation

7mo