Copyright © 2026 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

Optimized by NVIDIA | Launch from Hugging Face (Beta)

Filters (1)

Free Endpoint: 1
Partner Endpoint: 1
Download Available: 1

Use Case
Image-to-Text: 1
Code Generation: 1
Retrieval Augmented Generation: 0
Drug Discovery: 0
Speech-to-Text: 0

Inference Providers
CoreWeave: 1
Deep Infra: 0
Together AI: 0
Bitdeer AI: 0
GMI Cloud: 0

Publisher
Meta: 1
Microsoft: 1
NVIDIA: 0
Mistral AI: 0
Google: 0

GPU Types
A100 SXM4 80GB: 0
B200: 0
GB200: 0
GH200 144G HBM3e: 0
H100 80GB HBM3: 0

Labels (1)
language generation

2 models

Free Endpoint

llama-4-maverick-17b-128e-instruct

A general-purpose multimodal, multilingual mixture-of-experts (MoE) model with 128 experts and 17B active parameters.

language generation

9mo

12.54M

Downloadable

phi-4-mini-instruct

Lightweight multilingual LLM for AI applications in latency-bound, memory- and compute-constrained environments.

532K

11mo