Try NVIDIA NIM APIs

Search Results

Searching for: TAO Toolkit

Sorting by Last Updated

nvidia nemotron-content-safety-reasoning-4b

A context‑aware safety model that applies reasoning to enforce domain‑specific policies.

NeMo Guardrails Nemotron reasoning Safety and Moderation

nvidia nemoretriever-page-elements-v3

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion

deepseek-ai deepseek-v3.2

State-of-the-art 685B reasoning LLM with sparse attention, long context, and integrated agentic tools.

long context text-to-text chat reasoning

mistralai devstral-2-123b-instruct-2512

State-of-the-art open code model with deep reasoning, 256k context, and unmatched efficiency.

coding chat reasoning text-to-code agentic

stockmark stockmark-2-100b-instruct

Japanese-specialized large language model for enterprises to read and understand complex business documents.

sovereign ai japanese stockmark chat large language model

qwen qwen3-next-80b-a3b-thinking

80B-parameter AI model with hybrid reasoning, MoE architecture, and support for 119 languages.

Reasoning chat Text-to-Text

microsoft TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

text-to-3d Run-on-RTX image-to-3d

deepseek-ai deepseek-v3.1

DeepSeek V3.1 Instruct is a hybrid AI model with fast reasoning, 128K context, and strong tool use.

Reasoning chat Text-to-Text

stabilityai stable-diffusion-3.5-large

Stable Diffusion 3.5 is a popular text-to-image generation model

Image Generation Text-to-Image

openai gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

text-to-text chat reasoning math

openai gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within 80GB GPU.

text-to-text chat reasoning math

nvidia parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps

ASR English NVIDIA NIM NVIDIA Riva speech-to-text

opengpt-x teuken-7b-instruct-commercial-v0.4

Multilingual 7B LLM, instruction-tuned on all 24 EU languages for stable, culturally aligned output.

sovereign ai text-to-text chat european Multilingual

nvidia sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

autonomous vehicles bev av stack automotive

mistralai mixtral-8x22b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning chat Code Generation Text-to-Text Large Language Models

mistralai mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning chat Code Generation Text-to-Text Large Language Models

thudm chatglm3-6b

Supports Chinese and English languages to handle tasks including chatbot, content generation, coding, and translation.

Text Translation chat Code Generation Text-to-Text Regional Language Generation

nvidia magpie-tts-flow

Expressive and engaging text-to-speech, generated from a short audio sample.

TTS Text-to-Speech NVIDIA NIM NVIDIA Riva

nvidia nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Data ingestion Chart Detection nemo retriever Table Detection run-on-rtx extraction

meta llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

chat Code Generation Text-to-Text Language Generation Run-on-RTX

qwen qwen2.5-coder-32b-instruct

Advanced LLM for code generation, reasoning, and fixing across popular programming languages.

code completion code generation chat text-to-code

meta llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well as output responses.

LLM Multimodal Safety Content Safety Guardrail Content Moderator

nvidia cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

Synthetic Data Generation Autonomous Vehicles Physical AI robotics video-to-world

nvidia llama-3.2-nemoretriever-500m-rerank-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nemo retriever Retrieval Augmented Generation reranking

gotocompany gemma-2-9b-cpt-sahabatai-instruct

SOTA LLM pre-trained for instruction following and proficiency in Indonesian language and its dialects.

Sovereign AI chat Indonesian Text-to-Text Regional Language Generation

utter-project eurollm-9b-instruct

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

Sovereign AI chat Text-to-Text Multilingual European Regional Language Generation

nvidia audio2face-3d

Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.

Speech-to-Animation Digital Humans Audio-to-Face NVIDIA NIM

nvidia Background Noise Removal

Removes unwanted noise from audio, improving speech intelligibility.

Nvidia Maxine Speech-to-speech Digital Human Speech Enhancement

nvidia studiovoice

Enhance speech by correcting common audio degradations to create studio quality speech output.

Nvidia Maxine Speech-to-speech Digital Human Run-on-RTX Speech Enhancement

meta llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

Reasoning chat Code Generation Text-to-Text Instruction following Math

meta llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

chat Code Generation Text-to-Text Language Generation

nvidia magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

TTS Text-to-Speech NVIDIA NIM NVIDIA Riva

mistralai mistral-7b-instruct-v0.3

This LLM follows instructions, completes requests, and generates creative text.

chat Text-to-Text Language Generation

arc evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

DNA Generation biology nim Bionemo Drug Discovery

microsoft phi-3-mini-128k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat Code Generation Text-to-Text Language Generation Large Language Models

microsoft phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory/compute-constrained environments

chat Code Generation Text-to-Text Language Generation

nvidia llama3-chatqa-1.5-8b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text chat Non-Commercial Use Only

google gemma-2-9b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat Code Generation Text-to-Text Language Generation

mediatek breeze-7b-instruct

LLM for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

chat Text-to-Text Regional Language Generation

google gemma-2-27b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat Code Generation Text-to-Text Language Generation

google gemma-3-1b-it

A lightweight, multilingual, advanced SLM text model for edge computing and resource-constrained applications

Translation chat Text-to-Text Language Generation

google gemma-2-2b-it

Advanced small language generative AI model for edge applications

chat Code Generation Text-to-Text Language Generation

baichuan-inc baichuan2-13b-chat

Supports Chinese and English chat, coding, math, instruction following, and solving quizzes

Chinese Language Generation Text Translation chat Text-to-Text

abacusai dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

chat Code Generation Text-to-Text

rakuten rakutenai-7b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat Text-to-Text Language Generation Large Language Models

microsoft phi-3-small-128k-instruct

Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.

chat Code Generation Text-to-Text Language Generation Large Language Models

rakuten rakutenai-7b-chat

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat Text-to-Text Language Generation Large Language Models

microsoft phi-3-mini-4k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat Code Generation Text-to-Text Language Generation Large Language Models

qwen qwen2-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, etc.

Chinese Language Generation chat Text-to-Text Large Language Models

microsoft phi-3-small-8k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat Code Generation Text-to-Text Language Generation Large Language Models

qwen qwen2.5-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, etc.

Chinese Language Generation chat Text-to-Text Large Language Models

microsoft phi-3-medium-4k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat Code Generation Text-to-Text Language Generation Large Language Models

qwen qwen2.5-coder-7b-instruct

Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.

code completion code generation chat text-to-code

microsoft phi-3-medium-128k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat Code Generation Text-to-Text Language Generation Large Language Models

meta llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat Code Generation Text-to-Text Language Generation

nvidia nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for Hindi Language.

Indic chat Text-to-Text Language Generation

mistralai mistral-7b-instruct-v0.2

This LLM follows instructions, completes requests, and generates creative text.

chat Text-to-Text Language Generation NVIDIA NIM

meta llama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat Code Generation Text-to-Text Language Generation Large Language Models

meta llama3-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

chat Large Language models Code Generation Text-to-Text Language Generation

ai21labs jamba-1.5-mini-instruct

Cutting-edge MOE based LLM designed to excel in a wide array of generative AI tasks.

chat Language Generation Text-to-text

meta llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat Code Generation Text-to-Text Language Generation

google gemma-7b

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat Code Generation Text-to-Text Language Generation

nvidia canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

Automatic Speech Recognition Automatic Speech Translation NVIDIA NIM NVIDIA Riva

upstage solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

Non-Commercial Use Only chat Text-to-Text Language Generation Large Language Models

nvidia llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

nemo guardrails LLM safety Safety and moderation dialogue safety nemotron

nvidia eyecontact

Estimates the gaze angles of a person in a video and redirects the gaze to be frontal.

telepresence Nvidia Maxine Digital Human

nvidia cosmos-predict1-5b

Generates future frames of a physics-aware world state from just an image or short video prompt, for physical AI development.

Synthetic Data Generation Physical AI policy evaluation robotics video-to-world

nvidia nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion

nvidia nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion

nvidia nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection Chart Detection nemo retriever Table Detection data ingestion run-on-rtx

nvidia cosmos-nemotron-34b

Multi-modal vision-language model that understands text/img/video and creates informative responses

VLM Vision language model image caption image to text

microsoft phi-3.5-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory/compute-constrained environments

chat Code Generation Text-to-Text Language Generation Large Language Models

ibm granite-guardian-3.0-8b

Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior

Guardrail Text-to-text

google shieldgemma-9b

Guardrail model to ensure that responses from LLMs are appropriate and safe

Guardrail Text-to-Text

aisingapore sea-lion-7b-instruct

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia

Chat Text-to-Text Regional Language Generation Large Language Models

nvidia rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

Ranking Retrieval Augmented Generation

university-at-buffalo cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retriever Chart Element Detection Image-To-Text

nvidia llama-3.1-nemotron-70b-reward

Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.

Text-to-text Reward Model RLHF

stabilityai stable-diffusion-3-medium

Advanced text-to-image model for generating high quality images

Image Generation Text-to-Image

nvidia usdsearch

AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.

OpenUSD Synthetic Data Generation Digital Twin USD Text-to-3D

nvidia visual-changenet

Visual Changenet detects pixel-level change maps between two images and outputs a semantic change segmentation mask

image Image Generation cv Image Segmentation vlm computer vision TAO Toolkit video NVIDIA NIM

nvidia retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

Object Detection image cv vlm computer vision TAO Toolkit video NVIDIA NIM

nvidia ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition respectively.

Optical Character Recognition image Optical Character Detection cv vlm computer vision TAO Toolkit video

nvidia nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

chat Text-to-Text Language Generation

nvidia mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbot, virtual assistants, and content generation.

language generation text-to-text chat small language model

google paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image cv Vision Assistant vlm Visual Question Answering computer vision Language Generation Image-to-Text video