NVIDIA
Copyright © 2025 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

meta / llama-guard-4-12b

Multimodal model that classifies the safety of input prompts as well as output responses.

llm multimodal safety, content safety, guardrail, content moderator, meta

google / gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

language generation, speech recognition, visual qa, chat, google

google / gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

language generation, speech recognition, visual qa, chat, google

nvidia / cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

synthetic data generation, autonomous vehicles, physical ai, robotics, video-to-world, nvidia

nvidia / llama-3.2-nemoretriever-1b-vlm-embed-v1

Multimodal question-answer retrieval representing user queries as text and documents as images.

nemo retriever, embedding, retrieval augmented generation, text-to-embedding, nvidia

deepseek-ai / deepseek-r1-0528

Updated version of DeepSeek-R1 with enhanced reasoning, coding, and math, and reduced hallucination.

coding, chat, math, advanced reasoning, deepseek-ai

speakleash / bielik-11b-v2.3-instruct

State-of-the-art model for Polish language processing tasks such as text generation, Q&A, and chatbots.

polish, sovereign ai, chat, chatbots, summarization, speakleash

qwen / qwen3-235b-a22b

Advanced MoE model excelling at reasoning, multilingual tasks, and instruction following

complex math, advanced reasoning, instruction following, qwen

black-forest-labs / FLUX.1-schnell

FLUX.1-schnell is a distilled image generation model, producing high quality images at fast speeds

image generation, text-to-image, run-on-rtx, black-forest-labs

utter-project / eurollm-9b-instruct

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

sovereign ai, chat, text-to-text, multilingual, european, regional language generation, utter-project

gotocompany / gemma-2-9b-cpt-sahabatai-instruct

SOTA LLM pre-trained for instruction following and proficiency in the Indonesian language and its dialects.

sovereign ai, chat, indonesian, text-to-text, regional language generation, gotocompany

mistralai / mistral-small-3.1-24b-instruct-2503

Efficient multimodal model excelling at multilingual tasks, image understanding, and fast responses

language generation, multimodal, image understanding, mistralai

nvidia / cosmos-predict1-7b

Generalist model to generate future world state as videos from text and image prompts to create synthetic training data for robots and autonomous vehicles.

synthetic data generation, autonomous vehicles, physical ai, robotics, text-to-world, image-to-world, nvidia

nvidia / cosmos-predict1-5b

Generates future frames of a physics-aware world state from just an image or short video prompt for physical AI development.

synthetic data generation, physical ai, policy evaluation, robotics, video-to-world, nvidia

nvidia / sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

autonomous vehicles, bev, av stack, automotive, nvidia

nvidia / llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

math, advanced reasoning, instruction following, function calling, nvidia

deepseek-ai / deepseek-r1-distill-llama-8b

Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.

distillation, coding, chat, reasoning, run-on-rtx, math, deepseek-ai

deepseek-ai / deepseek-r1-distill-qwen-32b

Distilled version of Qwen 2.5 32B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding, distillation, chat, reasoning, math, deepseek-ai

deepseek-ai / deepseek-r1-distill-qwen-14b

Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding, distillation, chat, reasoning, math, deepseek-ai

deepseek-ai / deepseek-r1-distill-qwen-7b

Distilled version of Qwen 2.5 7B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding, distillation, chat, reasoning, math, deepseek-ai

microsoft / phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound and memory/compute-constrained environments

code generation, chat, text-to-text, language generation, microsoft

deepseek-ai / deepseek-r1

State-of-the-art, high-efficiency LLM excelling in reasoning, math, and coding.

chat, math, advanced reasoning, deepseek-ai

nvidia / llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

dialogue safety, llm safety, guard model, content safety, nvidia

nvidia / llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

llm safety, content moderation, guard model, content safety, nvidia

university-at-buffalo / cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retriever, chart element detection, image-to-text, university-at-buffalo

baidu / paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

optical character recognition, table extraction, optical character detection, nemo retriever, data ingestion, run-on-rtx, extraction, baidu

nvidia / corrdiff

Generative downscaling model for generating high-resolution, regional-scale weather fields.

ai weather prediction, weather simulation, earth-2, nvidia

nvidia / fourcastnet

FourCastNet predicts the global atmospheric dynamics of various weather and climate variables.

weather simulation, ai weather prediction, climate science, earth-2, nvidia

hive / deepfake-image-detection

Advanced AI model that detects faces and identifies deepfake images.

computer vision, ai safety, deep fake detection, content moderation, hive

institute-of-science-tokyo / llama-3.1-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

sovereign ai, large language model, chat, regional language generation, institute-of-science-tokyo

institute-of-science-tokyo / llama-3.1-swallow-8b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

sovereign ai, large language model, chat, regional language generation, institute-of-science-tokyo

nvidia / llama-3.1-nemotron-51b-instruct

Unique language model that delivers unmatched accuracy-efficiency performance.

chat, language generation, text-to-text, nvidia

hive / ai-generated-image-detection

Robust image classification model for detecting and managing AI-generated content.

image classification, computer vision, ai safety, content moderation, hive

yentinglin / llama-3-taiwan-70b-instruct

Sovereign AI model finetuned on Traditional Mandarin and English data using the Llama-3 architecture.

regional language generation, chat, code generation, large language models, yentinglin

tokyotech-llm / llama-3-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

large language model, chat, regional language generation, tokyotech-llm

ai21labs / jamba-1.5-mini-instruct

Cutting-edge MoE-based LLM designed to excel in a wide array of generative AI tasks.

chat, language generation, text-to-text, ai21labs

ai21labs / jamba-1.5-large-instruct

Cutting-edge MoE-based LLM designed to excel in a wide array of generative AI tasks.

chat, language generation, text-to-text, ai21labs

microsoft / phi-3.5-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound and memory/compute-constrained environments

code generation, chat, text-to-text, language generation, large language models, microsoft

nvidia / nv-grounding-dino

Grounding DINO is an open-vocabulary, zero-shot object detection model.

object detection, computer vision, deepstream, nvidia nim, nvidia

briaai / BRIA-2.3

An enterprise-grade text-to-image model, trained on a compliant dataset, that produces high-quality images.

image generation, text-to-image, briaai

google / gemma-2-2b-it

Advanced small language generative AI model for edge applications

chat, code generation, text-to-text, language generation, google

nvidia / usdsearch

AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.

openusd, synthetic data generation, digital twin, usd, text-to-3d, nvidia nim, nvidia

01-ai / yi-large

Powerful model trained on English and Chinese for diverse tasks including chatbot and creative writing.

chat, code generation, text-to-text, multilingual, 01-ai

nvidia / retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

object detection, image, cv, vlm, computer vision, tao toolkit, video, nvidia nim, nvidia

google / paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image, cv, vision assistant, vlm, visual question answering, computer vision, language generation, image-to-text, video, google

aisingapore / sea-lion-7b-instruct

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia

chat, text-to-text, regional language generation, large language models, aisingapore

mistralai / mixtral-8x22b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

advanced reasoning, chat, code generation, text-to-text, large language models, mistralai

stabilityai / stable-video-diffusion

Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences.

image generation, text-to-image, stabilityai

mistralai / mixtral-8x7b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

advanced reasoning, chat, code generation, text-to-text, large language models, mistralai