NVIDIA
Copyright © 2025 NVIDIA Corporation

Search Results

Searching for: TAO Toolkit
Sorting by Most Recent

nvidia / magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

tts, text-to-speech, nvidia nim, nvidia riva, nvidia

utter-project / eurollm-9b-instruct

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

sovereign ai, chat, text-to-text, multilingual, european, regional language generation, utter-project

gotocompany / gemma-2-9b-cpt-sahabatai-instruct

SOTA LLM pre-trained for instruction following and proficiency in the Indonesian language and its dialects.

sovereign ai, chat, indonesian, text-to-text, regional language generation, gotocompany

nvidia / Build an AI Agent for Research and Reporting

Create AI agents that reason, plan, reflect and refine to produce high-quality reports based on source materials of your choice.

nim, llama nemotron, reasoning, blueprint, enterprise, retrieval-augmented generation, nvidia ai, nemo retriever, nvidia

nvidia / Genomics Analysis

Easily run essential genomics workflows and save time by leveraging Parabricks.

parabricks, launchable, blueprint, genomics, biology, dna sequencing, nvidia ai, nvidia

nvidia / Synthetic Manipulation Motion Generation for Robotics

Generate exponentially large amounts of synthetic motion trajectories for robot manipulation from just a few human demonstrations.

nvidia omniverse, blueprint, synthetic data, enterprise, robotics, physical ai, robot learning, humanoids, nvidia isaac gr00t, text-to-world, image-to-world, teleop, nvidia

nvidia / cosmos-predict1-7b

Generates physics-aware video world states from text and image prompts for physical AI development.

synthetic data generation, physical ai, robotics, text-to-world, image-to-world, nvidia

nvidia / cosmos-predict1-5b

Generates future frames of a physics-aware world state from just an image or short video prompt for physical AI development.

synthetic data generation, physical ai, policy evaluation, robotics, video-to-world, nvidia

nvidia / sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

autonomous vehicles, bev, av stack, automotive, nvidia

nvidia / nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection, chart detection, nemo retriever, table detection, data ingestion, nvidia

nvidia / nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection, chart detection, nemo retriever, table detection, data ingestion, nvidia

nvidia / nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection, chart detection, nemo retriever, table detection, data ingestion, nvidia

google / gemma-3-1b-it

A lightweight, multilingual, advanced SLM text model for edge computing and resource-constrained applications.

translation, chat, text-to-text, language generation, google

nvidia / LLM Router

Route LLM requests to the best model for the task at hand.

launchable, blueprint, llm router, nvidia ai, nvidia

microsoft / phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments.

code generation, chat, text-to-text, language generation, microsoft

arc / evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

dna generation, biology, nim, bionemo, drug discovery, arc

nvidia / canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr, ast, streaming, speech-to-text, batch, spanish, multilingual, nvidia nim, nvidia riva, nvidia

nvidia / canary-0.6b-turbo-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr, ast, fast, speech-to-text, batch, multilingual, nvidia nim, nvidia riva, nvidia

nvidia / Build an Enterprise RAG pipeline

Connect AI applications to multimodal enterprise data with a scalable retrieval augmented generation (RAG) pipeline built on highly performant, industry-leading NIM microservices, for faster PDF data extraction and more accurate information retrieval.

nemo retriever, nim, launchable, blueprint, enterprise, retrieval-augmented generation, nvidia ai, nvidia

nvidia / llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

dialogue safety, llm safety, guard model, content safety, nvidia

qwen / qwen2.5-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and more.

chinese language generation, chat, text-to-text, large language models, qwen

nvidia / PDF to Podcast

Transform PDFs into AI podcasts for engaging on-the-go audio content.

blueprint, multi-modal, launchable, text-to-speech, conversational ai, pdf-to-podcast, nvidia ai, ai agent, nvidia

nvidia / cosmos-nemotron-34b

Multi-modal vision-language model that understands text, images, and video and creates informative responses.

vlm, vision language model, image caption, image to text, nvidia

qwen / qwen2.5-coder-32b-instruct

Advanced LLM for code generation, reasoning, and fixing across popular programming languages.

code completion, code generation, chat, text-to-code, qwen

qwen / qwen2.5-coder-7b-instruct

Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.

code completion, code generation, chat, text-to-code, qwen

writer / palmyra-creative-122b

Powerful LLM designed for creative thinking and writing.

content generation, chat, text-to-text, writer

meta / llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

reasoning, code generation, text-to-text, instruction following, math, meta

university-at-buffalo / cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retriever, chart element detection, image-to-text, university-at-buffalo

nvidia / nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection, data ingestion, chart detection, nemo retriever, table detection, run on rtx, extraction, nvidia

nvidia / audio2face-3d

Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.

speech-to-animation, digital humans, audio-to-face, nvidia nim, nvidia

nvidia / Build a Video Search and Summarization (VSS) Agent

Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A

vision, video-to-text, generative ai, launchable, blueprint, chat, enterprise, nvidia ai, nvidia

nvidia / nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

indic, chat, text-to-text, language generation, nvidia

ibm / granite-guardian-3.0-8b

Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior

guardrail, text-to-text, ibm

nvidia / llama-3.1-nemotron-70b-instruct

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA in order to improve the helpfulness of LLM generated responses.

chat, code generation, text-to-text, language generation, nvidia

zyphra / zamba2-7b-instruct

Efficient hybrid state-space model designed for conversational and reasoning tasks.

chat, language generation, text-to-text, zyphra

nvidia / studiovoice

Enhance speech by correcting common audio degradations to create studio-quality speech output.

run on rtx, nvidia maxine, speech-to-speech, digital human, speech enhancement, nvidia

nvidia / mistral-nemo-minitron-8b-8k-instruct

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

small language model, chat, code generation, text-to-text, language generation, nvidia

nvidia / llama-3.1-nemotron-70b-reward

Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.

text-to-text, reward model, rlhf, nvidia

meta / llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat, code generation, text-to-text, language generation, meta

meta / llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat, code generation, text-to-text, language generation, meta

nvidia / llama-3.1-nemotron-51b-instruct

Unique language model that delivers unmatched accuracy-efficiency performance.

chat, language generation, text-to-text, nvidia

qwen / qwen2-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and more.

chinese language generation, chat, text-to-text, large language models, qwen

abacusai / dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

chat, code generation, text-to-text, abacusai

nvidia / vila

Multi-modal vision-language model that understands text, images, and video and creates informative responses.

vlm, vision language model, image caption, image to text, nvidia

nvidia / Build a Digital Human

Create intelligent, interactive avatars for customer service across industries

digital humans, speech-to-text, nvidia omniverse, blueprint, enterprise, chat, audio-to-face, nvidia ai, nvidia

ai21labs / jamba-1.5-mini-instruct

Cutting-edge MOE-based LLM designed to excel in a wide array of generative AI tasks.

chat, language generation, text-to-text, ai21labs

ai21labs / jamba-1.5-large-instruct

Cutting-edge MOE-based LLM designed to excel in a wide array of generative AI tasks.

chat, language generation, text-to-text, ai21labs

nvidia / nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

chat, text-to-text, language generation, nvidia

nvidia / mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

language generation, text-to-text, chat, small language model, nvidia

microsoft / phi-3.5-moe-instruct

Advanced LLM based on Mixture of Experts architecture to deliver compute-efficient content generation

moe, chat, code generation, text-to-text, language generation, microsoft

microsoft / phi-3.5-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments.

code generation, chat, text-to-text, language generation, large language models, microsoft

rakuten / rakutenai-7b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat, text-to-text, language generation, large language models, rakuten

rakuten / rakutenai-7b-chat

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat, text-to-text, language generation, large language models, rakuten

briaai / BRIA-2.3

An enterprise-grade text-to-image model, trained on a compliant dataset, that produces high-quality images.

image generation, text-to-image, briaai

nvidia / radtts-hifigan-tts

Natural, high-fidelity, English voices for personalizing text-to-speech services and voiceovers

text-to-speech, nvidia nim, nvidia

writer / palmyra-fin-70b-32k

Specialized LLM for financial analysis, reporting, and data processing

chat, finance, text-to-text, writer

google / shieldgemma-9b

Guardrail model to ensure that responses from LLMs are appropriate and safe

guardrail, text-to-text, google

google / gemma-2-2b-it

Advanced small language generative AI model for edge applications

chat, code generation, text-to-text, language generation, google

nvidia / usdsearch

AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.

openusd, synthetic data generation, digital twin, usd, text-to-3d, nvidia nim, nvidia

nvidia / eyecontact

Estimate gaze angles of a person in a video and redirect to make it frontal.

telepresence, nvidia maxine, digital human, nvidia

thudm / chatglm3-6b

Supports Chinese and English languages to handle tasks including chatbot, content generation, coding, and translation.

text translation, chat, code generation, text-to-text, regional language generation, thudm

baichuan-inc / baichuan2-13b-chat

Supports Chinese and English chat, coding, math, instruction following, and quiz solving.

chinese language generation, text translation, chat, text-to-text, baichuan-inc

meta / llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

code generation, chat, text-to-text, language generation, meta

meta / llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

run on rtx, code generation, chat, text-to-text, language generation, meta

nv-mistralai / mistral-nemo-12b-instruct

Most advanced language model for reasoning, code, multilingual tasks; runs on a single GPU.

run on rtx, code generation, chat, language generation, text-to-text, nv-mistralai

microsoft / phi-3-medium-128k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat, code generation, text-to-text, language generation, large language models, microsoft

google / gemma-2-27b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat, code generation, text-to-text, language generation, google

google / gemma-2-9b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat, code generation, text-to-text, language generation, google

nvidia / llama3-chatqa-1.5-70b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text, chat, non-commercial use only, nvidia

nvidia / llama3-chatqa-1.5-8b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text, chat, non-commercial use only, nvidia

01-ai / yi-large

Powerful model trained on English and Chinese for diverse tasks including chatbot and creative writing.

chat, code generation, text-to-text, multilingual, 01-ai

mistralai / mistral-7b-instruct-v0.3

This LLM follows instructions, completes requests, and generates creative text.

chat, text-to-text, language generation, mistralai

stabilityai / stable-diffusion-3-medium

Advanced text-to-image model for generating high-quality images

image generation, text-to-image, stabilityai

nvidia / ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition respectively.

optical character recognition, image, optical character detection, cv, vlm, computer vision, tao toolkit, video, nvidia

writer / palmyra-med-70b-32k

Leading LLM for accurate, contextually relevant responses in the medical domain.

chat, text-to-text, healthcare, writer

writer / palmyra-med-70b

Leading LLM for accurate, contextually relevant responses in the medical domain.

chat, text-to-text, healthcare, writer

upstage / solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

non-commercial use only, chat, text-to-text, language generation, large language models, upstage

mediatek / breeze-7b-instruct

LLM for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

chat, text-to-text, regional language generation, mediatek

nvidia / visual-changenet

Visual ChangeNet detects pixel-level change maps between two images and outputs a semantic change segmentation mask.

image, image generation, cv, image segmentation, vlm, computer vision, tao toolkit, video, nvidia nim, nvidia

nvidia / retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

object detection, image, cv, vlm, computer vision, tao toolkit, video, nvidia nim, nvidia

microsoft / phi-3-small-8k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat, code generation, text-to-text, language generation, large language models, microsoft

microsoft / phi-3-small-128k-instruct

Long context cutting-edge lightweight open language model excelling in high-quality reasoning.

chat, code generation, text-to-text, language generation, large language models, microsoft

microsoft / phi-3-medium-4k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat, code generation, text-to-text, language generation, large language models, microsoft

google / paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image, cv, vision assistant, vlm, visual question answering, computer vision, language generation, image-to-text, video, google

aisingapore / sea-lion-7b-instruct

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia

chat, text-to-text, regional language generation, large language models, aisingapore

microsoft / phi-3-mini-4k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat, code generation, text-to-text, language generation, large language models, microsoft

databricks / dbrx-instruct

A general-purpose LLM with state-of-the-art performance in language understanding, coding, and RAG.

chat, text-to-text, language generation, large language models, databricks

microsoft / phi-3-mini-128k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat, code generation, text-to-text, language generation, large language models, microsoft

mistralai / mixtral-8x22b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

advanced reasoning, chat, code generation, text-to-text, large language models, mistralai

meta / llama3-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

chat, large language models, code generation, text-to-text, language generation, meta

meta / llama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat, code generation, text-to-text, language generation, large language models, meta

google / recurrentgemma-2b

Novel recurrent architecture based language model for faster inference when generating long sequences.

chat, code generation, text-to-text, language generation, google

google / codegemma-7b

Cutting-edge model built on Google's Gemma-7B specialized for code generation and code completion.

chat, code generation, language generation, text-to-code, google

google / gemma-2b

Lightweight language model deployable on laptop, desktop or the cloud for summarization and reasoning.

chat, code generation, text-to-text, language generation, google

nvidia / rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

ranking, retrieval augmented generation, nvidia

microsoft / kosmos-2

Groundbreaking multimodal model designed to understand and reason about visual elements in images.

image, cv, multimodal, vlm, visual question answering, computer vision, image understanding, image-to-text, video, microsoft

google / gemma-7b

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat, code generation, text-to-text, language generation, google

mistralai / mistral-7b-instruct-v0.2

This LLM follows instructions, completes requests, and generates creative text.

chat, text-to-text, language generation, nvidia nim, mistralai

stabilityai / stable-video-diffusion

Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences.

image generation, text-to-image, stabilityai

stabilityai / sdxl-turbo

A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation

image generation, text-to-image, stabilityai

mistralai / mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

advanced reasoning, chat, code generation, text-to-text, large language models, mistralai