Try NVIDIA NIM APIs

Search Results

Searching for: TAO Toolkit

Sorting by Most Recent

meta llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well as output responses.

llm multimodal safety content safety guardrail content moderator meta

nvidia llama-3.2-nemoretriever-500m-rerank-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nemo retriever retrieval augmented generation reranking nvidia

nvidia cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

synthetic data generation autonomous vehicles physical ai robotics video-to-world nvidia

nvidia bnr

Removes unwanted noises from audio improving speech intelligibility.

nvidia maxine speech-to-speech digital human speech enhancement nvidia

nvidia Bring LLMs to NIM

Use NIM to deploy a broad range of LLMs from Hugging Face.

blueprint nvidia ai nvidia

nvidia magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

tts text-to-speech nvidia nim nvidia riva nvidia

utter-project eurollm-9b-instruct

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

sovereign ai chat chat text-to-text multilingual european regional language generation utter-project

gotocompany gemma-2-9b-cpt-sahabatai-instruct

SOTA LLM pre-trained for instruction following and proficiency in Indonesian language and its dialects.

sovereign ai chat indonesian chat text-to-text regional language generation gotocompany

nvidia Build an AI Agent for Enterprise Research

Build AI agents that continuously process and synthesize multimodal enterprise data, enabling reasoning, planning, and refinement to generate comprehensive reports.

nim launchable llama nemotron reasoning blueprint enterprise retrieval-augmented generation nvidia ai nemo retriever nvidia

nvidia Genomics Analysis

Easily run essential genomics workflows with Parabricks to save time.

parabricks launchable blueprint genomics biology dna sequencing nvidia ai nvidia

nvidia Synthetic Manipulation Motion Generation for Robotics

Generate exponentially large amounts of synthetic motion trajectories for robot manipulation from just a few human demonstrations.

nvidia omniverse blueprint synthetic data enterprise robotics physical ai robot learning humanoids nvidia isaac gr00t text-to-world image-to-world teleop nvidia

nvidia cosmos-predict1-7b

Generalist model to generate future world state as videos from text and image prompts to create synthetic training data for robots and autonomous vehicles.

synthetic data generation autonomous vehicles physical ai robotics text-to-world image-to-world nvidia

nvidia cosmos-predict1-5b

Generates future frames of a physics-aware world state based on simply an image or short video prompt for physical AI development.

synthetic data generation physical ai policy evaluation robotics video-to-world nvidia

nvidia sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

autonomous vehicles bev av stack automotive nvidia

nvidia nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

nvidia nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

nvidia nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection chart detection nemo retriever table detection data ingestion nvidia

google gemma-3-1b-it

A lightweight, multilingual, advanced SLM text model for edge computing and resource-constrained applications.

translation chat chat text-to-text language generation google

nvidia LLM Router

Route LLM requests to the best model for the task at hand.

launchable blueprint llm router nvidia ai nvidia

microsoft phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments

code generation chat text-to-text language generation microsoft

arc evo2-40b

Evo 2 is a biological foundation model that is able to integrate information over long genomic sequences while retaining sensitivity to single-nucleotide changes.

dna generation biology nim bionemo drug discovery arc

nvidia canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr ast streaming speech-to-text batch spanish multilingual nvidia nim nvidia riva nvidia

nvidia canary-0.6b-turbo-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asr ast fast speech-to-text batch multilingual nvidia nim nvidia riva nvidia

nvidia Build an Enterprise RAG pipeline

Connect AI applications to multimodal enterprise data with a scalable retrieval augmented generation (RAG) pipeline built on highly performant, industry-leading NIM microservices, for faster PDF data extraction and more accurate information retrieval.

nemo retriever nim launchable blueprint enterprise retrieval-augmented generation nvidia ai nvidia

nvidia llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

dialogue safety llm safety guard model content safety nvidia

qwen qwen2.5-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and more.

chinese language generation chat text-to-text large language models qwen

nvidia PDF to Podcast

Transform PDFs into AI podcasts for engaging on-the-go audio content.

blueprint multi-modal launchable text-to-speech conversational ai pdf-to-podcast nvidia ai ai agent text-to-speech nvidia

nvidia cosmos-nemotron-34b

Multi-modal vision-language model that understands text, images, and video and creates informative responses

vlm vision language model image caption image to text nvidia

qwen qwen2.5-coder-32b-instruct

Advanced LLM for code generation, reasoning, and fixing across popular programming languages.

code completion code generation chat text-to-code qwen

qwen qwen2.5-coder-7b-instruct

Powerful mid-size code model with a 32K context length, excelling in coding in multiple languages.

code completion code generation chat text-to-code qwen

writer palmyra-creative-122b

Powerful LLM designed for creative thinking and writing.

content generation chat chat text-to-text writer

meta llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

reasoning code generation text-to-text instruction following math meta

university-at-buffalo cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retriever chart element detection image-to-text university-at-buffalo

nvidia nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

object detection data ingestion chart detection nemo retriever table detection run-on-rtx extraction nvidia

nvidia audio2face-3d

Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.

speech-to-animation digital humans audio-to-face nvidia nim nvidia

nvidia Build a Video Search and Summarization (VSS) Agent

Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A

vision video-to-text generative ai launchable blueprint chat enterprise nvidia ai nvidia

nvidia nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

indic chat chat text-to-text language generation nvidia

ibm granite-guardian-3.0-8b

Detects jailbreaking, bias, violence, profanity, sexual content, and unethical behavior

guardrail text-to-text ibm

nvidia llama-3.1-nemotron-70b-instruct

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA in order to improve the helpfulness of LLM generated responses.

chat code generation chat text-to-text language generation nvidia

zyphra zamba2-7b-instruct

Efficient hybrid state-space model designed for conversational and reasoning tasks.

chat chat language generation text-to-text zyphra

nvidia studiovoice

Enhance speech by correcting common audio degradations to create studio quality speech output.

nvidia maxine speech-to-speech digital human run-on-rtx speech enhancement nvidia

nvidia mistral-nemo-minitron-8b-8k-instruct

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

small language model chat code generation chat text-to-text language generation nvidia

nvidia llama-3.1-nemotron-70b-reward

Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.

text-to-text reward model rlhf nvidia

meta llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat code generation chat text-to-text language generation meta

meta llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chat code generation text-to-text language generation meta

nvidia llama-3.1-nemotron-51b-instruct

Unique language model that delivers unmatched accuracy-efficiency performance.

chat language generation chat text-to-text nvidia

qwen qwen2-7b-instruct

Chinese and English LLM targeting language, coding, mathematics, reasoning, and more.

chinese language generation chat chat text-to-text large language models qwen

abacusai dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

chat code generation text-to-text abacusai

nvidia vila

Multi-modal vision-language model that understands text, images, and video and creates informative responses

vlm vision language model image caption image to text nvidia

nvidia Build a Digital Human

Create intelligent, interactive avatars for customer service across industries

digital humans speech-to-text nvidia omniverse blueprint enterprise chat audio-to-face nvidia ai nvidia

ai21labs jamba-1.5-mini-instruct

Cutting-edge MoE-based LLM designed to excel in a wide array of generative AI tasks.

chat chat language generation text-to-text ai21labs

ai21labs jamba-1.5-large-instruct

Cutting-edge MoE-based LLM designed to excel in a wide array of generative AI tasks.

chat chat language generation text-to-text ai21labs

nvidia nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

chat text-to-text language generation nvidia

nvidia mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

language generation text-to-text chat small language model nvidia

microsoft phi-3.5-moe-instruct

Advanced LLM based on Mixture of Experts architecture to deliver compute-efficient content generation

moe chat code generation chat text-to-text language generation microsoft

microsoft phi-3.5-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments

code generation chat text-to-text language generation large language models microsoft

rakuten rakutenai-7b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat chat text-to-text language generation large language models rakuten

rakuten rakutenai-7b-chat

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat chat text-to-text language generation large language models rakuten

briaai BRIA-2.3

An enterprise-grade text-to-image model, trained on a compliant dataset, that produces high-quality images.

image generation text-to-image briaai

nvidia radtts-hifigan-tts

Natural, high-fidelity, English voices for personalizing text-to-speech services and voiceovers

text-to-speech text-to-speech nvidia nim nvidia

writer palmyra-fin-70b-32k

Specialized LLM for financial analysis, reporting, and data processing

chat finance text-to-text writer

google shieldgemma-9b

Guardrail model to ensure that responses from LLMs are appropriate and safe

guardrail text-to-text google

google gemma-2-2b-it

Advanced small language generative AI model for edge applications

chat code generation chat text-to-text language generation google

nvidia usdsearch

AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.

openusd synthetic data generation digital twin usd text-to-3d nvidia nim nvidia

nvidia eyecontact

Estimate gaze angles of a person in a video and redirect to make it frontal.

telepresence nvidia maxine digital human nvidia

thudm chatglm3-6b

Supports Chinese and English languages to handle tasks including chatbot, content generation, coding, and translation.

text translation chat code generation chat text-to-text regional language generation thudm

baichuan-inc baichuan2-13b-chat

Supports Chinese and English chat, coding, math, instruction following, and solving quizzes

chinese language generation text translation chat chat text-to-text baichuan-inc

meta llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

code generation chat text-to-text language generation meta

meta llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

code generation chat text-to-text language generation run-on-rtx meta

nv-mistralai mistral-nemo-12b-instruct

Most advanced language model for reasoning, code, multilingual tasks; runs on a single GPU.

code generation chat language generation text-to-text run-on-rtx nv-mistralai

microsoft phi-3-medium-128k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat code generation chat text-to-text language generation large language models microsoft

google gemma-2-27b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat code generation chat text-to-text language generation google

google gemma-2-9b-it

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat code generation text-to-text language generation google

nvidia llama3-chatqa-1.5-70b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text chat non-commercial use only chat nvidia

nvidia llama3-chatqa-1.5-8b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text chat non-commercial use only nvidia

01-ai yi-large

Powerful model trained on English and Chinese for diverse tasks including chatbot and creative writing.

chat code generation chat text-to-text multilingual 01-ai

mistralai mistral-7b-instruct-v0.3

This LLM follows instructions, completes requests, and generates creative text.

chat text-to-text language generation mistralai

stabilityai stable-diffusion-3-medium

Advanced text-to-image model for generating high quality images

image generation text-to-image stabilityai

nvidia ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition respectively.

optical character recognition image optical character detection cv vlm computer vision tao toolkit video nvidia

writer palmyra-med-70b-32k

Leading LLM for accurate, contextually relevant responses in the medical domain.

chat text-to-text healthcare writer

writer palmyra-med-70b

Leading LLM for accurate, contextually relevant responses in the medical domain.

chat text-to-text healthcare writer

upstage solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

non-commercial use only chat text-to-text language generation large language models upstage

mediatek breeze-7b-instruct

LLM for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

chat chat text-to-text regional language generation mediatek

nvidia visual-changenet

Visual Changenet detects pixel-level change maps between two images and outputs a semantic change segmentation mask

image image generation cv image segmentation vlm computer vision tao toolkit video nvidia nim nvidia

nvidia retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

object detection image cv vlm computer vision tao toolkit video nvidia nim nvidia

microsoft phi-3-small-8k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat code generation chat text-to-text language generation large language models microsoft

microsoft phi-3-small-128k-instruct

Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.

chat code generation chat text-to-text language generation large language models microsoft

microsoft phi-3-medium-4k-instruct

Cutting-edge lightweight open language model excelling in high-quality reasoning.

chat code generation chat text-to-text language generation large language models microsoft

google paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image cv vision assistant vlm visual question answering computer vision language generation image-to-text video google

aisingapore sea-lion-7b-instruct

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia

chat text-to-text regional language generation large language models aisingapore

microsoft phi-3-mini-4k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat code generation chat text-to-text language generation large language models microsoft

databricks dbrx-instruct

A general-purpose LLM with state-of-the-art performance in language understanding, coding, and RAG.

chat chat text-to-text language generation large language models databricks

microsoft phi-3-mini-128k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chat code generation chat text-to-text language generation large language models microsoft

mistralai mixtral-8x22b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

advanced reasoning chat code generation chat text-to-text large language models mistralai

meta llama3-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

chat large language models code generation chat text-to-text language generation meta

meta llama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chat code generation chat text-to-text language generation large language models meta

google recurrentgemma-2b

Novel recurrent architecture based language model for faster inference when generating long sequences.

chat code generation chat text-to-text language generation google

google codegemma-7b

Cutting-edge model built on Google's Gemma-7B specialized for code generation and code completion.

chat code generation chat language generation text-to-code google

google gemma-2b

Lightweight language model deployable on laptop, desktop or the cloud for summarization and reasoning.

chat code generation chat text-to-text language generation google

nvidia rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

ranking retrieval augmented generation nvidia

microsoft kosmos-2

Groundbreaking multimodal model designed to understand and reason about visual elements in images.

image cv multimodal vlm visual question answering computer vision image understanding image-to-text video microsoft

google gemma-7b

Cutting-edge text generation model for text understanding, transformation, and code generation.

chat code generation chat text-to-text language generation google

mistralai mistral-7b-instruct-v0.2

This LLM follows instructions, completes requests, and generates creative text.

chat text-to-text language generation nvidia nim mistralai

stabilityai stable-video-diffusion

Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences.

image generation text-to-image stabilityai

stabilityai sdxl-turbo

A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation

image generation text-to-image stabilityai

mistralai mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

advanced reasoning chat code generation chat text-to-text large language models mistralai