Try NVIDIA NIM APIs

Search Results

Searching for: nim

Sorting by Last Updated

moonshotai kimi-k2.5

1T multimodal MoE for high‑capacity video and image understanding with efficient inference.

Multimodal Reasoning chat Mixture-of-Experts Image-to-Text

z-ai glm4.7

GLM-4.7 is a multilingual agentic coding partner with stronger reasoning, tool use, and UI skills.

Tool Calling Coding Reasoning chat Multilingual

nvidia nemotron-content-safety-reasoning-4b

A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

NeMo Guardrails Nemotron reasoning Safety and Moderation

nvidia Multi-Agent Intelligent Warehouse

An AI-powered, multi-agent system designed to optimize warehouse operations through intelligent automation, real-time monitoring, and natural language interaction.

blueprint nemo retriever nim Launchable Retrieval-Augmented Generation NVIDIA AI

nvidia Retail Catalog Enrichment

A GenAI system that enhances and localizes product catalogs with rich text content and imagery.

blueprint nim Launchable NVIDIA AI

nvidia cosmos-reason2-8b

Vision language model that excels in understanding the physical world using structured reasoning on videos or images.

video understanding Synthetic Data Generation autonomous vehicles industrial Physical AI vision language model reasoning robotics smart cities

nvidia nemoretriever-page-elements-v3

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion

deepseek-ai deepseek-v3.2

State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.

long context text-to-text chat reasoning

nvidia nemotron-3-nano-30b-a3b

Open, efficient MoE model with 1M context, excelling in coding, reasoning, instruction following, tool calling, and more

MoE Reasoning chat Long Context Instruction Following

nvidia riva-translate-4b-instruct-v1_1

Translation model for 12 languages with few-shot example prompt capability.

nvidia nim Text Translation neural machine translation

mistralai devstral-2-123b-instruct-2512

State-of-the-art open code model with deep reasoning, 256k context, and unmatched efficiency.

coding chat reasoning text-to-code agentic

moonshotai kimi-k2-thinking

Open reasoning model with 256K context window, native INT4 quantization and enhanced tool use

Conversational Reasoning chat Long Context Function Calling

mistralai mistral-large-3-675b-instruct-2512

A state-of-the-art general purpose MoE VLM ideal for chat, agentic and instruction based use cases.

language generation chat Image-to-Text multimodal agentic

mistralai ministral-14b-instruct-2512

A general purpose VLM ideal for chat and instruction based use cases

language generation SLM chat Image-to-Text multimodal

nvidia AI Model Distillation for Financial Data

Distill and deploy domain-specific AI models from unstructured financial data to generate market signals efficiently—scaling your workflow with the NVIDIA Data Flywheel Blueprint for high-performance, cost-efficient experimentation.

blueprint developer example nim nvidia ai Launchable Nemotron algorithmic trading llm financial services data flywheel

nvidia streampetr

StreamPETR offers efficient 3D object detection for autonomous driving by propagating sparse object queries temporally.

autonomous vehicles bev AV Stack automotive

minimaxai minimax-m2

Open Mixture of Experts LLM (230B, 10B active) for reasoning, coding, and tool-use/agent workflows

Conversational Reasoning chat Long Context Function Calling

nvidia Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

agent blueprint blueprint nim Launchable nemo llm NVIDIA AI

nvidia nemotron-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

text and table extraction document parsing supported language - english

nvidia nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

language generation chat Image-to-Text vision assistant visual question answering

nvidia llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

content moderation llm safety multilingual guard model multilingual content safety nemoguard

cyborg Cyborg Enterprise RAG

Securely extract, embed, and index multimodal data with encryption in-use for fast, accurate semantic search.

NIM Launchable Blueprint Retrieval-Augmented Generation NeMo Retriever

openfold openfold3

OpenFold3 is a third-generation biomolecular foundation model that predicts the three-dimensional structures of molecular complexes (proteins, DNA, RNA, ligands)

Biology Drug Discovery Protein Folding

nvidia parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Mandarin Taiwanese English transcriptions.

ASR Streaming Taiwanese Speech-to-Text NVIDIA NIM

deepseek-ai deepseek-v3.1-terminus

DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.

tool calling chat advanced reasoning agentic

nvidia llama-3_2-nemoretriever-300m-embed-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Retrieval Augmented Generation Text-to-Embedding NeMo Retriever

stockmark stockmark-2-100b-instruct

Japanese-specialized large-language-model for enterprises to read and understand complex business documents.

sovereign ai japanese stockmark chat large language model

qwen qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

chat text-generation agentic

moonshotai kimi-k2-instruct-0905

Follow-on version of Kimi-K2-Instruct with longer context window and enhanced reasoning capabilities

long-context coding chat advanced reasoning agentic

speakleash bielik-11b-v2.6-instruct

State-of-the-art model for Polish language processing tasks such as text generation, Q&A, and chatbots.

Polish Sovereign AI chat Chatbots Summarization

qwen qwen3-next-80b-a3b-thinking

80B parameter AI model with hybrid reasoning, MoE architecture, support for 119 languages.

Reasoning chat Text-to-Text

nvidia parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

ASR Streaming Speech-to-Text Mandarin NVIDIA NIM

nvidia parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

ASR Streaming Speech-to-Text Spanish NVIDIA NIM

nvidia parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

ASR Streaming Speech-to-Text Vietnamese NVIDIA NIM

bytedance seed-oss-36b-instruct

ByteDance open-source LLM with long-context, reasoning, and agentic intelligence.

thinking budget chat reasoning text-generation

microsoft TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

text-to-3d Run-on-RTX image-to-3d

qwen qwen3-coder-480b-a35b-instruct

Excels in agentic coding and browser use and supports 256K context, delivering top results.

agentic coding moe long context chat browser use

nvidia Retail Shopping Assistant

Elevate Shopping Experiences Online and In Stores.

blueprint nemo retriever nim Launchable Retrieval-Augmented Generation NVIDIA AI

deepseek-ai deepseek-v3.1

DeepSeek V3.1 Instruct is a hybrid AI model with fast reasoning, 128K context, and strong tool use.

Reasoning chat Text-to-Text

nvidia nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

thinking budget chat reasoning

stabilityai stable-diffusion-3.5-large

Stable Diffusion 3.5 is a popular text-to-image generation model

Image Generation Text-to-Image

black-forest-labs FLUX.1-Kontext-dev

FLUX.1 Kontext is a multimodal model that enables in-context image generation and editing.

Image Generation Text-to-Image Run-on-RTX

nvidia cosmos-reason1-7b

Reasoning vision language model (VLM) for physical AI and robotics.

video understanding Synthetic Data Generation autonomous vehicles industrial Physical AI vision language model reasoning robotics smart cities

nvidia nemoretriever-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

Optical Character Recognition Table Extraction nemo retriever data ingestion extraction

openai gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

text-to-text chat reasoning math

openai gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.

text-to-text chat reasoning math

nvidia parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps

ASR English NVIDIA NIM NVIDIA Riva speech-to-text

nvidia Streaming Data to RAG

Sensor-captured radio enables real-time awareness, AI-driven analytics for actionable, searchable insights.

blueprint NIM Riva Launchable RAG NVIDIA AI NeMo Retriever

nvidia llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

chat math advanced reasoning instruction following function calling

opengpt-x teuken-7b-instruct-commercial-v0.4

Multilingual 7B LLM, instruction-tuned on all 24 EU languages for stable, culturally aligned output.

sovereign ai text-to-text chat european Multilingual

sarvamai sarvam-m

Multilingual, hybrid-reasoning model optimized for Indian language tasks, programming, and mathematical reasoning.

coding indic languages hybrid chat reasoning math multilingual

nvidia llama-3_2-nemoretriever-300m-embed-v1

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Retrieval Augmented Generation Text-to-Embedding NeMo Retriever

nvidia nemoretriever-ocr

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

Optical Character Recognition Table Extraction nemo retriever data ingestion extraction

nvidia nv-embed-v1

Generates high-quality numerical embeddings from text inputs.

Non-Commercial Use Only Retrieval Augmented Generation Text-to-Embedding

ipd proteinmpnn

ProteinMPNN is a deep learning model for predicting amino acid sequences for protein backbones.

biology nim BioNemo Drug Discovery Protein Generation

nvidia llama-3.2-nv-rerankqa-1b-v2

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

nemo retriever Retrieval Augmented Generation reranking

nvidia nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

Embedding run-on-rtx Retrieval Augmented Generation Nemo retriever Text-to-Embedding

nvidia Build A Generative Virtual Screening Pipeline

This blueprint shows how generative AI and accelerated NIM microservices can design optimized small molecules smarter and faster.

Chemistry NIM NVIDIA BioNemo Blueprint Enterprise BioNemo Docking Drug Discovery

qwen qwen3-235b-a22b

Advanced reasoning MoE model excelling at reasoning, multilingual tasks, and instruction following

chat complex math advanced reasoning instruction following

nvidia genmol

Fragment-Based Molecular Generation by Discrete Diffusion.

Chemistry nim BioNemo Molecule Generation Drug Discovery

nvidia llama-3.2-nv-embedqa-1b-v2

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

nemo retriever embedding Retrieval Augmented Generation Text-to-Embedding

nvidia Build A Generative Protein Binder Design Pipeline

This blueprint shows how generative AI and accelerated NIM microservices can design protein binders smarter and faster.

NVIDIA BioNemo Blueprint Enterprise BioNemo Biology Drug Discovery Protein Generation

nvidia Evo 2 Protein Design

This workflow shows how generative AI can generate DNA sequences that can be translated into proteins for bioengineering.

blueprint NIM biology BioNemo Drug Discovery Protein Generation

nvidia llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

chat math advanced reasoning instruction following function calling

nvidia sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

autonomous vehicles bev av stack automotive

nvidia bevformer

Advanced transformer for multi-frame bird's-eye-view 3D perception in autonomous driving.

autonomous vehicles bev automotive perception

microsoft phi-4-mini-flash-reasoning

Lightweight reasoning model for applications in latency bound, memory/compute constrained environments

edge chat reasoning text-generation math

nvidia molmim

MolMIM performs controlled generation, finding molecules with the right properties.

Chemistry nim BioNemo Molecule Generation Drug Discovery

moonshotai kimi-k2-instruct

State-of-the-art open mixture-of-experts model with strong reasoning, coding, and agentic capabilities

coding chat advanced reasoning agentic

mistralai mixtral-8x22b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning chat Code Generation Text-to-Text Large Language Models

mistralai mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning chat Code Generation Text-to-Text Large Language Models

meta llama-4-maverick-17b-128e-instruct

A general purpose multimodal, multilingual 128 MoE model with 17B parameters.

language generation chat Image-to-Text vision assistant visual question answering

nvidia Build an AI Agent for Enterprise Research

Build a custom enterprise research assistant powered by state-of-the-art models that process and synthesize multimodal data, enabling reasoning, planning, and refinement to generate comprehensive reports.

NIM Launchable Llama Nemotron Reasoning Blueprint Enterprise Retrieval-Augmented Generation NVIDIA AI NeMo Retriever

meta llama-4-scout-17b-16e-instruct

A multimodal, multilingual 16 MoE model with 17B parameters.

language generation chat Image-to-Text vision assistant visual question answering

thudm chatglm3-6b

Supports Chinese and English languages to handle tasks including chatbot, content generation, coding, and translation.

Text Translation chat Code Generation Text-to-Text Regional Language Generation

google gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

language generation speech recognition Visual QA chat

google gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

language generation speech recognition Visual QA chat

mit diffdock

Predicts the 3D structure of how a molecule interacts with a protein.

Chemistry nim BioNemo Docking Drug Discovery

mistralai mistral-medium-3-instruct

Powerful, multimodal language model designed for enterprise applications, including software development, data analysis, and reasoning.

language generation chat Image-to-Text multimodal visual question answering

mistralai magistral-small-2506

High performance reasoning model optimized for efficiency and edge deployment

coding chat math advanced reasoning multilingual

nvidia llama-3.1-nemotron-ultra-253b-v1

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

chat math advanced reasoning instruction following function calling

ibm granite-3.3-8b-instruct

Small language model fine-tuned for improved reasoning, coding, and instruction-following

coding Reasoning chat Instruction Following

nvidia Multi-LLM NIM

Use the multi-LLM compatible NIM container to deploy a broad range of LLMs from Hugging Face.

Blueprint NVIDIA AI

nvidia magpie-tts-flow

Expressive and engaging text-to-speech, generated from a short audio sample.

TTS Text-to-Speech NVIDIA NIM NVIDIA Riva

nvidia Build an Enterprise RAG Pipeline Blueprint

Power fast, accurate semantic search across multimodal enterprise data with NVIDIA’s RAG Blueprint—built on NeMo Retriever and Nemotron models—to connect your agents to trusted, authoritative sources of knowledge.

NIM Launchable Nemotron Blueprint Enterprise Retrieval-Augmented Generation NVIDIA AI NeMo Retriever

nvidia nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Data ingestion Chart Detection nemo retriever Table Detection run-on-rtx extraction

baidu paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

Optical Character Recognition Table Extraction Optical Character Detection nemo retriever data ingestion run-on-rtx extraction

meta llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

chat Code Generation Text-to-Text Language Generation Run-on-RTX

deepseek-ai deepseek-r1-distill-llama-8b

Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.

Distillation coding chat reasoning run-on-rtx math

mit Boltz-2

Predict complex structures using Boltz-2.

nim Bionemo Biology Drug Discovery Protein Folding

nvidia usdcode

State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

OpenUSD Synthetic Data Generation Digital Twin chat Code Generation

qwen qwen2.5-coder-32b-instruct

Advanced LLM for code generation, reasoning, and fixing across popular programming languages.

code completion code generation chat text-to-code

nvidia nemoguard-jailbreak-detect

Industry leading jailbreak classification model for protection from adversarial attempts

nemo guardrails llm security NIM Prompt Injection Safety and Moderation LLM Safety nemotron

nvidia llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text and images and creates informative responses

doc intelligence chat multiple image understanding OCR

nvidia Refine AI Agents through Continuous Model Distillation with Data Flywheels

Build a data flywheel, with NVIDIA NeMo microservices, that continuously optimizes AI agents for latency and cost — while maintaining accuracy targets.

NIM Launchable Data Flywheel Blueprint Enterprise NeMo microservices NVIDIA AI

meta llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well as output responses.

LLM Multimodal Safety Content Safety Guardrail Content Moderator

nvidia llama-3.1-nemotron-nano-4b-v1.1

State-of-the-art open model for reasoning, code, math, and tool calling - suitable for edge agents

edge tool calling chat reasoning math

nvidia cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

Synthetic Data Generation Autonomous Vehicles Physical AI robotics video-to-world

mistralai mistral-small-24b-instruct

Latency-optimized language model excelling in code, math, general knowledge, and instruction-following.

code chat reasoning agent-centric multilingual

nvidia llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

chat math advanced reasoning instruction following function calling

nvidia parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

ASR Streaming English Speech-to-Text batch NVIDIA NIM

nvidia magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

TTS Text-to-Speech NVIDIA NIM NVIDIA Riva multilingual

nvidia riva-translate-1.6b

Enable smooth global interactions in 36 languages.

Text Translation Neural machine translation NVIDIA NIM

nvidia llama-3.2-nemoretriever-500m-rerank-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nemo retriever Retrieval Augmented Generation reranking

nvidia llama-3.2-nemoretriever-1b-vlm-embed-v1

Multimodal question-answer retrieval representing user queries as text and documents as images.

nemo retriever embedding Retrieval Augmented Generation Text-to-Embedding

qwen qwq-32b

Powerful reasoning model capable of thinking and reasoning; it can achieve significantly enhanced performance in downstream tasks, especially hard problems.

coding chat math advanced reasoning

gotocompany gemma-2-9b-cpt-sahabatai-instruct

SOTA LLM pre-trained for instruction following and proficiency in Indonesian language and its dialects.

Sovereign AI chat Indonesian Text-to-Text Regional Language Generation

nvidia AI Agent for Telecom Network Configuration Planning

Automate and optimize the configuration of radio access network (RAN) parameters using agentic AI and a large language model (LLM)-driven framework.

nim Launchable Blueprint simulation Telecommunications NVIDIA AI

utter-project eurollm-9b-instruct

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

Sovereign AI chat Text-to-Text Multilingual European Regional Language Generation

colabfold msa-search

Generates a multiple sequence alignment from a query sequence and a protein sequence database search.

nim Bionemo Biology Drug Discovery Protein Folding

nvidia audio2face-3d

Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.

Speech-to-Animation Digital Humans Audio-to-Face NVIDIA NIM

nvidia Background Noise Removal

Removes unwanted noises from audio improving speech intelligibility.

Nvidia Maxine Speech-to-speech Digital Human Speech Enhancement

nvidia nvclip

NV-CLIP is a multimodal embeddings model for image and text.

Computer vision multimodal embeddings text and image Run-on-rtx

nvidia parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

ASR Streaming English Batch Speech-to-Text Fast NVIDIA NIM Run-on-RTX

nvidia studiovoice

Enhance speech by correcting common audio degradations to create studio quality speech output.

Nvidia Maxine Speech-to-speech Digital Human Run-on-RTX Speech Enhancement

black-forest-labs FLUX.1-dev

FLUX.1 is a state-of-the-art suite of image generation models

Image Generation Text-to-Image Run-on-RTX

black-forest-labs FLUX.1-schnell

FLUX.1-schnell is a distilled image generation model, producing high quality images at fast speeds

Image Generation Text-to-Image Run-on-RTX

meta esmfold

Predicts the 3D structure of a protein from its amino acid sequence.

biology nim Bionemo protein folding Drug Discovery

meta llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

Reasoning chat Code Generation Text-to-Text Instruction following Math

meta llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

chat Code Generation Text-to-Text Language Generation

nvidia magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

TTS Text-to-Speech NVIDIA NIM NVIDIA Riva

mistralai mistral-nemotron

Built for agentic workflows, this model excels in coding, instruction following, and function calling

language generation chat instruction following function calling

mistralai mistral-7b-instruct-v0.3

This LLM follows instructions, completes requests, and generates creative text.

chat Text-to-Text Language Generation

nvidia nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

optical character recognition nemo retriever data ingestion table extraction supported language - english

meta llama-3.2-90b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Image-Text Retrieval Visual QA image captioning chat Image-to-Text Visual Grounding

meta llama-3.2-11b-vision-instruct

Cutting-edge vision-language model excelling in high-quality reasoning from images.

Image-Text Retrieval Visual QA chat Image-to-Text Image Captioning Visual Grounding

nvidia nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

nemo retriever Embedding Retrieval Augmented Generation

openfold openfold2

Predicts the 3D structure of a protein from its amino acid sequence, multiple sequence alignments, and templates.

nim Bionemo Biology Drug Discovery Protein Folding

arc evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

DNA Generation biology nim Bionemo Drug Discovery

ipd rfdiffusion

A generative model of protein backbones for protein binder design.

biology nim BioNemo Drug Discovery Protein Generation

google gemma-3-27b-it

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

Vision Assistant chat Visual Question Answering Language Generation Image-to-Text

microsoft phi-4-multimodal-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.

Speech Recognition Visual QA chat Language Generation Image-to-Text Chart and Table Understanding

microsoft phi-3-mini-128k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat Code Generation Text-to-Text Language Generation Large Language Models

microsoft phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency bound, memory/compute constrained environments

chat Code Generation Text-to-Text Language Generation

tiiuae falcon3-7b-instruct

Instruction tuned LLM achieving SoTA performance on reasoning, math and general knowledge capabilities

Coding chat Code Generation Language Generation Improved reasoning Math Scientific knowledge

nvidia llama3-chatqa-1.5-8b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text chat Non-Commercial Use Only

google gemma-2-9b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat Code Generation Text-to-Text Language Generation

mediatek breeze-7b-instruct

LLM for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

chat Text-to-Text Regional Language Generation

igenius colosseum_355b_instruct_16k

NVIDIA DGX Cloud trained multilingual LLM designed for mission critical use cases in regulated industries including financial services, government, heavy industry

Heavy industry Government chat Highly regulated use case support Financial services

deepseek-ai deepseek-r1-distill-qwen-14b

Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding distillation chat reasoning math

google gemma-2-27b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat Code Generation Text-to-Text Language Generation

igenius italia_10b_instruct_16k

Multilingual LLM with emphasis on European languages supporting regulated use cases including financial services, government, heavy industry

Heavy industry Government chat Highly regulated use case support Financial services

google gemma-3-1b-it

A lightweight, multilingual, advanced SLM text model for edge computing and resource-constrained applications

Translation chat Text-to-Text Language Generation

deepseek-ai deepseek-r1-distill-qwen-32b

Distilled version of Qwen 2.5 32B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding distillation chat reasoning math

google gemma-2-2b-it

Advanced small language generative AI model for edge applications

chat Code Generation Text-to-Text Language Generation

deepseek-ai deepseek-r1-distill-qwen-7b

Distilled version of Qwen 2.5 7B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding distillation chat math

baichuan-inc baichuan2-13b-chat

Supports Chinese and English chat, coding, math, instruction following, and solving quizzes

Chinese Language Generation Text Translation chat Text-to-Text

abacusai dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

chat Code Generation Text-to-Text

rakuten rakutenai-7b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat Text-to-Text Language Generation Large Language Models

microsoft phi-3-small-128k-instruct

Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.

chat Code Generation Text-to-Text Language Generation Large Language Models

rakuten rakutenai-7b-chat

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat Text-to-Text Language Generation Large Language Models

microsoft phi-3-mini-4k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat Code Generation Text-to-Text Language Generation Large Language Models

qwen qwen2-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and more.

Chinese Language Generation chat Text-to-Text Large Language Models

microsoft phi-3-small-8k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat Code Generation Text-to-Text Language Generation Large Language Models

qwen qwen2.5-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and more.

Chinese Language Generation chat Text-to-Text Large Language Models

microsoft phi-3-medium-4k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat Code Generation Text-to-Text Language Generation Large Language Models

qwen qwen2.5-coder-7b-instruct

Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.

code completion code generation chat text-to-code

microsoft phi-3-medium-128k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat Code Generation Text-to-Text Language Generation Large Language Models

marin marin-8b-instruct

State-of-the-art open model trained on open datasets, excelling in reasoning, math, and science.

Reasoning chat Science Open Model Math

meta llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat Code Generation Text-to-Text Language Generation

nvidia nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for Hindi Language.

Indic chat Text-to-Text Language Generation

mistralai mistral-7b-instruct-v0.2

This LLM follows instructions, completes requests, and generates creative text.

chat Text-to-Text Language Generation NVIDIA NIM

meta llama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat Code Generation Text-to-Text Language Generation Large Language Models

mistralai mamba-codestral-7b-v0.1

Model for writing and interacting with code across a wide range of programming languages and tasks.

code completion code generation chat

meta llama3-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

chat Large Language models Code Generation Text-to-Text Language Generation

yentinglin llama-3-taiwan-70b-instruct

Sovereign AI model fine-tuned on Traditional Mandarin and English data using the Llama-3 architecture.

regional language generation chat Code Generation Large Language Models

ai21labs jamba-1.5-mini-instruct

Cutting-edge MoE-based LLM designed to excel in a wide array of generative AI tasks.

chat Language Generation Text-to-text

nvidia cuopt

World-record accuracy and performance for complex route optimization.

Route Optimization

institute-of-science-tokyo llama-3.1-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

Sovereign AI Large Language Model chat Regional Language Generation

tokyotech-llm llama-3-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

Large Language Model chat Regional Language Generation

institute-of-science-tokyo llama-3.1-swallow-8b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

Sovereign AI Large Language Model chat Regional Language Generation

langchain Structured Report Generation

Generate detailed, structured reports on any topic using LangGraph and Llama3.3 70B NIM.

LangGraph Report Generation Launchable AI Agents Blueprint Partner NVIDIA AI

meta llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat Code Generation Text-to-Text Language Generation

mistralai mistral-small-3.1-24b-instruct-2503

Efficient multimodal model excelling at multilingual tasks, image understanding, and fast responses

language generation chat multimodal image understanding

google gemma-7b

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat Code Generation Text-to-Text Language Generation

pipecat Voice Agent Framework for Conversational AI

Automate voice AI agents with NVIDIA NIM microservices and Pipecat.

Pipecat Launchable AI Agents Blueprint Conversational AI Partner NVIDIA AI

crewai Code Documentation for Software Development

Document your GitHub repositories with AI agents using CrewAI and Llama3.3 70B NIM.

Code Documentation CrewAI Launchable AI Agents Blueprint Partner NVIDIA AI

nvidia parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

Automatic Speech Recognition Speech-to-Text NVIDIA NIM NVIDIA Riva

nvidia vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.

Interactive Annotation Image Segmentation Non-Commercial Use Only Medical Imaging

baai bge-m3

Embedding model for text retrieval tasks, excelling in dense, multi-vector, and sparse retrieval.

Embeddings Retrieval Augmented Generation Text-to-Embedding

nvidia megatron-1b-nmt

Enable smooth global interactions in 36 languages.

Text Translation Neural machine translation NVIDIA NIM

openai whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

ASR AST Speech-to-Text batch whisper OpenAI Multilingual NVIDIA NIM NVIDIA Riva

nvidia canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

Automatic Speech Recognition Automatic Speech Translation NVIDIA NIM NVIDIA Riva

upstage solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

Non-Commercial Use Only chat Text-to-Text Language Generation Large Language Models

nvidia nv-grounding-dino

Grounding DINO is an open-vocabulary, zero-shot object detection model.

Object Detection computer vision deepstream NVIDIA NIM

nvidia nv-dinov2

NV-DINOv2 is a visual foundation model that generates vector embeddings for the input image.

Image-to-Embedding computer vision deepstream NVIDIA NIM object Classification

nvidia llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

nemo guardrails LLM safety Safety and moderation dialogue safety nemotron

nvidia llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

nemo guardrails LLM safety Safety and moderation dialogue safety nemotron

nvidia corrdiff

Generative downscaling model for generating high resolution regional scale weather fields.

AI Weather prediction Weather Simulation Earth-2

nvidia fourcastnet

FourCastNet predicts global atmospheric dynamics of various weather / climate variables.

Weather Simulation AI Weather Prediction Climate science Earth-2

hive deepfake-image-detection

Advanced AI model that detects faces and identifies deepfake images.

computer vision AI safety deep fake detection Content moderation

hive ai-generated-image-detection

Robust image classification model for detecting and managing AI-generated content.

image classification computer vision AI safety Content moderation

deepmind alphafold2

Predicts the 3D structure of a protein from its amino acid sequence.

nim Bionemo Biology protein folding Drug Discovery

deepmind alphafold2-multimer

Predicts the 3D structure of a protein from its amino acid sequence.

nim Bionemo Biology protein folding Drug Discovery

nvidia eyecontact

Estimates the gaze angles of a person in a video and redirects the gaze to be frontal.

telepresence Nvidia Maxine Digital Human

nvidia maisi

MAISI is a pre-trained volumetric (3D) CT Latent Diffusion Generative Model.

Image Generation Medical Imaging NVIDIA NIM

nvidia cosmos-predict1-5b

Generates future frames of a physics-aware world state from just an image or short video prompt for physical AI development.

Synthetic Data Generation Physical AI policy evaluation robotics video-to-world

nvidia nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion

nvidia nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion

nvidia nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion run-on-rtx

meta llama-3.1-405b-instruct

Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks.

synthetic data generation chat Code Generation

meta esm2-650m

Generates embeddings of proteins from their amino acid sequences.

nim Protein Embedding BioNemo Biology Drug Discovery

nvidia cosmos-nemotron-34b

Multi-modal vision-language model that understands text, images, and video and creates informative responses

VLM Vision language model image caption image to text

microsoft phi-3.5-vision-instruct

Cutting-edge open multimodal model excelling in high-quality reasoning from images.

Vision Assistant Visual Question Answering Language Generation Image-to-Text

microsoft phi-3.5-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory/compute-constrained environments

chat Code Generation Text-to-Text Language Generation Large Language Models

ibm granite-guardian-3.0-8b

Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior

Guardrail Text-to-text

google shieldgemma-9b

Guardrail model to ensure that responses from LLMs are appropriate and safe

Guardrail Text-to-Text

aisingapore sea-lion-7b-instruct

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia

Chat Text-to-Text Regional Language Generation Large Language Models

bigcode starcoder2-7b

Advanced programming model for code completion, summarization, and generation

code completion code generation

nvidia rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

Ranking Retrieval Augmented Generation

university-at-buffalo cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retriever Chart Element Detection Image-To-Text

nvidia usdvalidate

Verify compatibility of OpenUSD assets with instant RTX render and rule-based validation.

Validation OpenUSD Synthetic Data Generation Digital Twin USD Visualization 3D

nvidia llama-3.1-nemotron-70b-reward

Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.

Text-to-text Reward Model RLHF

stabilityai stable-diffusion-3-medium

Advanced text-to-image model for generating high quality images

Image Generation Text-to-Image

nvidia usdsearch

AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.

OpenUSD Synthetic Data Generation Digital Twin USD Text-to-3D

nvidia visual-changenet

Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask

image Image Generation cv Image Segmentation vlm computer vision TAO Toolkit video NVIDIA NIM

nvidia retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

Object Detection image cv vlm computer vision TAO Toolkit video NVIDIA NIM

nvidia ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition, respectively.

Optical Character Recognition image Optical Character Detection cv vlm computer vision TAO Toolkit video

nvidia nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

chat Text-to-Text Language Generation

nvidia mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

language generation text-to-text chat small language model

google paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image cv Vision Assistant vlm Visual Question Answering computer vision Language Generation Image-to-Text video