NVIDIA

Copyright © 2025 NVIDIA Corporation

Search Results

Searching for: NVIDIA AI
Sorting by Most Recent

nvidia / streampetr

StreamPETR offers efficient 3D object detection for autonomous driving by propagating sparse object queries temporally.

autonomous vehicles, bev, AV Stack, automotive

nvidia / Cosmos Dataset Search

Accelerate post-training of end-to-end autonomous vehicle stacks with vector search and retrieval for large video datasets.

blueprint, Autonomous Vehicles, data, Physical AI, Search, Enterprise, Cosmos, NVIDIA AI

nvidia / Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

agent blueprint, blueprint, nim, Launchable, nemo, llm, NVIDIA AI

nvidia / nemotron-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

text and table extraction, document parsing, supported language - english

nvidia / nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

language generation, chat, Image-to-Text, vision assistant, visual question answering

nvidia / llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

content moderation, llm safety, multilingual guard model, multilingual content safety, nemoguard

h2o / Flood Intelligence

H2O.ai Flood Intelligence provides real-time, scalable intelligence for AI-powered disaster management.

Launchable, Blueprint, Risk Analysis, Partner, NVIDIA AI

nvidia / parakeet-ctc-0.6b-zh-tw

Record-setting accuracy and performance for Mandarin Taiwanese English transcriptions.

ASR, Streaming, Taiwanese, Speech-to-Text, NVIDIA NIM

deepseek-ai / deepseek-v3.1-terminus

DeepSeek-V3.1: hybrid inference LLM with Think/Non-Think modes, stronger agents, 128K context, strict function calling.

tool calling, chat, advanced reasoning, agentic

nvidia / llama-3_2-nemoretriever-300m-embed-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Retrieval Augmented Generation, Text-to-Embedding, NeMo Retriever

stockmark / stockmark-2-100b-instruct

Japanese-specialized large-language-model for enterprises to read and understand complex business documents.

sovereign ai, japanese, stockmark, chat, large language model

qwen / qwen3-next-80b-a3b-instruct

Qwen3-Next Instruct blends hybrid attention, sparse MoE, and stability boosts for ultra-long context AI.

chat, text-generation, agentic

speakleash / bielik-11b-v2.6-instruct

State-of-the-art model for Polish language processing tasks such as text generation, Q&A, and chatbots.

Polish, Sovereign AI, chat, Chatbots, Summarization

qwen / qwen3-next-80b-a3b-thinking

80B parameter AI model with hybrid reasoning, MoE architecture, support for 119 languages.

Reasoning, chat, Text-to-Text

nvidia / parakeet-ctc-0.6b-zh-cn

Record-setting accuracy and performance for Mandarin-English transcriptions.

ASR, Streaming, Speech-to-Text, Mandarin, NVIDIA NIM

nvidia / parakeet-ctc-0.6b-es

Accurate and optimized Spanish-English transcriptions with punctuation and word timestamps.

ASR, Streaming, Speech-to-Text, Spanish, NVIDIA NIM

nvidia / parakeet-ctc-0.6b-vi

Accurate and optimized Vietnamese-English transcriptions with punctuation and word timestamps.

ASR, Streaming, Speech-to-Text, Vietnamese, NVIDIA NIM

nvidia / 3D Object Generation

Transform your scene idea into ready-to-use 3D assets using Llama 3.1 8B, NV SANA, and Microsoft TRELLIS

Blueprint, Run-on-RTX, NVIDIA AI

microsoft / TRELLIS

MSFT TRELLIS is a 3D AI model that generates high-quality 3D assets from text or image inputs.

text-to-3d, Run-on-RTX, image-to-3d

nvidia / Retail Shopping Assistant

Elevate Shopping Experiences Online and In Stores.

blueprint, nemo retriever, nim, Launchable, Retrieval-Augmented Generation, NVIDIA AI

deepseek-ai / deepseek-v3.1

DeepSeek V3.1 Instruct is a hybrid AI model with fast reasoning, 128K context, and strong tool use.

Reasoning, chat, Text-to-Text

nvidia / nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

thinking budget, chat, reasoning

nvidia / cosmos-reason1-7b

Reasoning vision language model (VLM) for physical AI and robotics.

video understanding, Synthetic Data Generation, autonomous vehicles, industrial, Physical AI, vision language model, reasoning, robotics, smart cities

nvidia / nemoretriever-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

Optical Character Recognition, Table Extraction, nemo retriever, data ingestion, extraction

openai / gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math

text-to-text, chat, reasoning, math

nvidia / parakeet-tdt-0.6b-v2

Accurate and optimized English transcriptions with punctuation and word timestamps

ASR, English, NVIDIA NIM, NVIDIA Riva, speech-to-text

nvidia / Streaming Data to RAG

Sensor-captured radio data enables real-time awareness and AI-driven analytics for actionable, searchable insights.

blueprint, NIM, Riva, Launchable, RAG, NVIDIA AI, NeMo Retriever

nvidia / llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

chat, math, advanced reasoning, instruction following, function calling

opengpt-x / teuken-7b-instruct-commercial-v0.4

Multilingual 7B LLM, instruction-tuned on all 24 EU languages for stable, culturally aligned output.

sovereign ai, text-to-text, chat, european, Multilingual

nvidia / llama-3_2-nemoretriever-300m-embed-v1

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

Retrieval Augmented Generation, Text-to-Embedding, NeMo Retriever

nvidia / nemoretriever-ocr

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

Optical Character Recognition, Table Extraction, nemo retriever, data ingestion, extraction

nvidia / magpie-tts-flow

Expressive and engaging text-to-speech, generated from a short audio sample.

TTS, Text-to-Speech, NVIDIA NIM, NVIDIA Riva

nvidia / riva-translate-4b-instruct

Translation model for 12 languages with few-shot example prompting capability.

Text Translation, chat

meta / llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well as output responses.

LLM Multimodal Safety, Content Safety, Guardrail, Content Moderator

nvidia / riva-translate-1.6b

Enable smooth global interactions in 36 languages.

Text Translation, Neural machine translation, NVIDIA NIM

google / gemma-3n-e4b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

language generation, speech recognition, Visual QA, chat

google / gemma-3n-e2b-it

An edge computing AI model which accepts text, audio and image input, ideal for resource-constrained environments

language generation, speech recognition, Visual QA, chat

nvidia / llama-3.2-nemoretriever-500m-rerank-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nemo retriever, Retrieval Augmented Generation, reranking

nvidia / cosmos-transfer1-7b

Generates physics-aware video world states for physical AI development using text prompts and multiple spatial control inputs derived from real-world data or simulation.

Synthetic Data Generation, Autonomous Vehicles, Physical AI, robotics, video-to-world

nvidia / Background Noise Removal

Removes unwanted noises from audio improving speech intelligibility.

Nvidia Maxine, Speech-to-speech, Digital Human, Speech Enhancement

nvidia / llama-3.2-nemoretriever-1b-vlm-embed-v1

Multimodal question-answer retrieval representing user queries as text and documents as images.

nemo retriever, embedding, Retrieval Augmented Generation, Text-to-Embedding

nvidia / Multi-LLM NIM

Use the multi-LLM compatible NIM container to deploy a broad range of LLMs from Hugging Face.

Blueprint, NVIDIA AI

nvidia / Biomedical AI-Q Research Agent Blueprint

Build advanced AI agents within the biomedical domain using the AI-Q Blueprint and the BioNeMo Virtual Screening Blueprint

Launchable, Agent Blueprint, Blueprint, Retrieval-augmented generation, llm

nvidia / Refine AI Agents through Continuous Model Distillation with Data Flywheels

Build a data flywheel, with NVIDIA NeMo microservices, that continuously optimizes AI agents for latency and cost — while maintaining accuracy targets.

NIM, Launchable, Data Flywheel, Blueprint, Enterprise, NeMo microservices, NVIDIA AI

wandb / AI Observability for Data Flywheel

Streamline evaluation, monitoring, and optimization of AI data flywheel with Weights & Biases.

Launchable, AI Agents, Data Flywheel, WandB, Blueprint, Observability, Partner, NVIDIA AI

iguazio / AI Orchestration for Data Flywheel

Orchestrate AI agents for data flywheel with MLRun and NVIDIA NeMo microservices.

Orchestration, Launchable, AI Agents, Data Flywheel, Blueprint, Partner, NVIDIA AI

nvidia / Safety for Agentic AI

Improve safety, security, and privacy of AI systems at build, deploy and run stages.

security, Launchable, Blueprint, safety, privacy, Nemo Guardrails, open models, NVIDIA AI

nvidia / AI Agent for Telecom Network Configuration Planning

Automate and optimize the configuration of radio access network (RAN) parameters using agentic AI and a large language model (LLM)-driven framework.

nim, Launchable, Blueprint, simulation, Telecommunications, NVIDIA AI

deepseek-ai / deepseek-r1-0528

Updated version of DeepSeek-R1 with enhanced reasoning, coding, math, and reduced hallucination.

coding, chat, math, advanced reasoning

nvidia / llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text/img and creates informative responses

doc intelligence, chat, multiple image understanding, OCR

nvidia / Financial Fraud Detection

Detect and prevent sophisticated fraudulent activities for financial services with high accuracy.

Financial Services, Launchable, Blueprint, GNN, Payments, NVIDIA AI, Fraud Detection

nvidia / llama-3.1-nemotron-nano-4b-v1.1

State-of-the-art open model for reasoning, code, math, and tool calling - suitable for edge agents

edge, tool calling, chat, reasoning, math

nvidia / magpie-tts-zeroshot

Expressive and engaging text-to-speech, generated from a short audio sample.

TTS, Text-to-Speech, NVIDIA NIM, NVIDIA Riva

qwen / qwen3-235b-a22b

Advanced reasoning MoE model excelling at reasoning, multilingual tasks, and instruction following

chat, complex math, advanced reasoning, instruction following

black-forest-labs / FLUX.1-schnell

FLUX.1-schnell is a distilled image generation model, producing high quality images at fast speeds

Image Generation, Text-to-Image, Run-on-RTX

nvidia / Build Digital Twins for AI Factory Design and Operations

Design, test, and optimize a new generation of intelligence manufacturing data centers using digital twins.

AI Factory, Industrial, NVIDIA Omniverse, Blueprint, simulation, Enterprise

utter-project / eurollm-9b-instruct

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

Sovereign AI, chat, Text-to-Text, Multilingual, European, Regional Language Generation

gotocompany / gemma-2-9b-cpt-sahabatai-instruct

SOTA LLM pre-trained for instruction following and proficiency in Indonesian language and its dialects.

Sovereign AI, chat, Indonesian, Text-to-Text, Regional Language Generation

mistralai / mistral-small-3.1-24b-instruct-2503

Efficient multimodal model excelling at multilingual tasks, image understanding, and fast-responses

language generation, chat, multimodal, image understanding

nvidia / parakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

ASR, Streaming, Speech-to-Text, Multilingual, NVIDIA NIM

nvidia / 3D Guided Generative AI

Create high quality images using Flux.1 in ComfyUI, guided by 3D.

Blueprint, Run-on-RTX, NVIDIA AI

nvidia / llama-3.1-nemotron-ultra-253b-v1

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

chat, math, advanced reasoning, instruction following, function calling

nvidia / Build an AI Agent for Enterprise Research

Build a custom enterprise research assistant powered by state-of-the-art models that process and synthesize multimodal data, enabling reasoning, planning, and refinement to generate comprehensive reports.

NIM, Launchable, Llama Nemotron, Reasoning, Blueprint, Enterprise, Retrieval-Augmented Generation, NVIDIA AI, NeMo Retriever

nvidia / AI Weather Analytics with Earth-2

Develop an AI-powered weather analysis and forecasting application that visualizes multi-layered geospatial data.

Blueprint, Climate Science, Enterprise, Weather Simulation, AI Weather Prediction, NVIDIA AI, Earth-2

nvidia / Single Cell Analysis

Investigate, understand, and interpret single cell data in minutes, not days by leveraging RAPIDS-singlecell, powered by NVIDIA RAPIDS

RAPIDS, RNA Sequencing, Launchable, Blueprint, Genomics, Single Cell, Biology, NVIDIA AI

nvidia / Genomics Analysis

Easily run essential genomics workflows to save time leveraging Parabricks

Parabricks, Launchable, Blueprint, Genomics, Biology, DNA Sequencing, NVIDIA AI

siemens / simcenter-star-ccm+

Run computational fluid dynamics (CFD) simulations

aerodynamics, cae, fluid-dynamics, simulation, heat-transfer, computer-aided engineering

cadence / fidelity

Run computational fluid dynamics (CFD) simulations

aerodynamics, cae, fluid-dynamics, simulation, heat-transfer, computer-aided engineering

ansys / fluent

Run computational fluid dynamics (CFD) simulations

aerodynamics, cae, fluid-dynamics, simulation, heat-transfer, computer-aided engineering

nvidia / Synthetic Manipulation Motion Generation for Robotics

Generate exponentially large amounts of synthetic motion trajectories for robot manipulation from just a few human demonstrations.

NVIDIA Omniverse, Blueprint, synthetic data, Enterprise, robotics, physical ai, robot learning, Humanoids, NVIDIA Isaac GR00T, text-to-world, image-to-world, teleop

nvidia / cosmos-predict1-5b

Generates future frames of a physics-aware world state based on simply an image or short video prompt for physical AI development.

Synthetic Data Generation, Physical AI, policy evaluation, robotics, video-to-world

nvidia / Test Multi-Robot Fleets for Industrial Automation

Simulate, test, and optimize physical AI and robotic fleets at scale in industrial digital twins before real-world deployment.

industrial, NVIDIA Omniverse, Blueprint, simulation, Enterprise, omniverse blueprint

nvidia / sparsedrive

End-to-end autonomous driving stack integrating perception, prediction, and planning with sparse scene representations for efficiency and safety.

autonomous vehicles, bev, av stack, automotive

nvidia / bevformer

Advanced transformer for multi-frame bird's-eye-view 3D perception in autonomous driving.

autonomous vehicles, bev, automotive, perception

nvidia / llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

chat, math, advanced reasoning, instruction following, function calling

nvidia / llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

chat, math, advanced reasoning, instruction following, function calling

nvidia / magpie-tts-multilingual

Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.

TTS, Text-to-Speech, NVIDIA NIM, NVIDIA Riva, multilingual

nvidia / nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

nemo retriever, Embedding, Retrieval Augmented Generation

deepseek-ai / deepseek-r1-distill-llama-8b

Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.

Distillation, coding, chat, reasoning, run-on-rtx, math

nvidia / nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection, Chart Detection, nemo retriever, Table Detection, data ingestion, run-on-rtx

nvidia / nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection, Chart Detection, nemo retriever, Table Detection, data ingestion, run-on-rtx

nvidia / nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection, Chart Detection, nemo retriever, Table Detection, data ingestion, run-on-rtx

nvidia / nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

optical character recognition, nemo retriever, data ingestion, table extraction, supported language - english

nvidia / LLM Router

Route LLM requests to the best model for the task at hand.

Launchable, Blueprint, LLM Router, NVIDIA AI

deepseek-ai / deepseek-r1-distill-qwen-32b

Distilled version of Qwen 2.5 32B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding, distillation, chat, reasoning, math

deepseek-ai / deepseek-r1-distill-qwen-14b

Distilled version of Qwen 2.5 14B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding, distillation, chat, reasoning, math

deepseek-ai / deepseek-r1-distill-qwen-7b

Distilled version of Qwen 2.5 7B using reasoning data generated by DeepSeek R1 for enhanced performance.

coding, distillation, chat, math

microsoft / phi-4-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments

chat, Code Generation, Text-to-Text, Language Generation

nvidia / Evo 2 Protein Design

This workflow shows how generative AI can generate DNA sequences that can be translated into proteins for bioengineering.

blueprint, NIM, biology, BioNemo, Drug Discovery, Protein Generation

openai / whisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

ASR, AST, Speech-to-Text, batch, whisper, OpenAI, Multilingual, NVIDIA NIM, NVIDIA Riva

nvidia / canary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

Automatic Speech Recognition, Automatic Speech Translation, NVIDIA NIM, NVIDIA Riva

deepseek-ai / deepseek-r1

State-of-the-art, high-efficiency LLM excelling in reasoning, math, and coding.

chat, Math, advanced reasoning

nvidia / Build an Enterprise RAG Pipeline Blueprint

Power fast, accurate semantic search across multimodal enterprise data with NVIDIA’s RAG Blueprint—built on NeMo Retriever and Nemotron models—to connect your agents to trusted, authoritative sources of knowledge.

NIM, Launchable, Nemotron, Blueprint, Enterprise, Retrieval-Augmented Generation, NVIDIA AI, NeMo Retriever

nvidia / llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

Dialogue Safety, LLM safety, Guard Model, Content safety

nvidia / nemoguard-jailbreak-detect

Industry leading jailbreak classification model for protection from adversarial attempts

LLM Security, Jailbreak Detection, Prompt Injection, NVIDIA NIM

nvidia / llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

LLM safety, content moderation, Guard model, Content safety

igenius / colosseum_355b_instruct_16k

NVIDIA DGX Cloud trained multilingual LLM designed for mission critical use cases in regulated industries including financial services, government, heavy industry

Heavy industry, Government, chat, Highly regulated use case support, Financial services

nvidia / Build A Generative Protein Binder Design Pipeline

This blueprint shows how generative AI and accelerated NIM microservices can design protein binders smarter and faster.

NVIDIA BioNemo, Blueprint, Enterprise, BioNemo, Biology, Drug Discovery, Protein Generation

nvidia / genmol

Fragment-Based Molecular Generation by Discrete Diffusion.

Chemistry, nim, BioNemo, Molecule Generation, Drug Discovery

nvidia / PDF to Podcast

Transform PDFs into AI podcasts for engaging on-the-go audio content.

blueprint, Multi-modal, Launchable, Text-to-Speech, Conversational AI, PDF-to-Podcast, NVIDIA AI, AI Agent

wandb / Traceability for Agentic AI

Trace and evaluate AI Agents with Weights & Biases.

Traceability, Launchable, AI Agents, WandB, Blueprint, Partner, NVIDIA AI

pipecat / Voice Agent Framework for Conversational AI

Automate voice AI agents with NVIDIA NIM microservices and Pipecat.

Pipecat, Launchable, AI Agents, Blueprint, Conversational AI, Partner, NVIDIA AI

langchain / Structured Report Generation

Generate detailed, structured reports on any topic using LangGraph and Llama3.3 70B NIM.

LangGraph, Report Generation, Launchable, AI Agents, Blueprint, Partner, NVIDIA AI

crewai / Code Documentation for Software Development

Document your GitHub repositories with AI Agents using CrewAI and Llama3.3 70B NIM.

Code Documentation, CrewAI, Launchable, AI Agents, Blueprint, Partner, NVIDIA AI

nvidia / cosmos-nemotron-34b

Multi-modal vision-language model that understands text/img/video and creates informative responses

VLM, Vision language model, image caption, image to text

nvidia / llama-3.2-nv-embedqa-1b-v2

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

nemo retriever, run-on-rtx, embedding, Retrieval Augmented Generation, Text-to-Embedding

nvidia / llama-3.2-nv-rerankqa-1b-v2

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

nemo retriever, run-on-rtx, Retrieval Augmented Generation, reranking

nvidia / usdcode

State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

OpenUSD, Synthetic Data Generation, Digital Twin, chat, Code Generation

university-at-buffalo / cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retriever, Chart Element Detection, Image-To-Text

nvidia / nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Object Detection, Data ingestion, Chart Detection, nemo retriever, Table Detection, run-on-rtx, extraction

baidu / paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

Optical Character Recognition, Table Extraction, Optical Character Detection, nemo retriever, data ingestion, run-on-rtx, extraction

nvidia / audio2face-3d

Converts streamed audio to facial blendshapes for realtime lipsyncing and facial performances.

Speech-to-Animation, Digital Humans, Audio-to-Face, NVIDIA NIM

nvidia / Build a Digital Twin for Interactive Fluid Simulation

This NVIDIA Omniverse™ Blueprint demonstrates how commercial software vendors can create interactive digital twins.

NVIDIA Omniverse, Blueprint, CAE, simulation, External Aerodynamics, Enterprise, Computer-aided-engineering

nvidia / corrdiff

Generative downscaling model for generating high resolution regional scale weather fields.

AI Weather prediction, Weather Simulation, Earth-2

nvidia / fourcastnet

FourCastNet predicts global atmospheric dynamics of various weather / climate variables.

Weather Simulation, AI Weather Prediction, Climate science, Earth-2

hive / deepfake-image-detection

Advanced AI model that detects faces and identifies deepfake images.

computer vision, AI safety, deep fake detection, Content moderation

nvidia / Build a Video Search and Summarization (VSS) Agent

Ingest massive volumes of live or archived videos and extract insights for summarization and interactive Q&A

vision, video-to-text, generative AI, Launchable, Blueprint, chat, Enterprise, NVIDIA AI

nvidia / 3D Conditioning for Precise Visual Generative AI

Enhance and modify high-quality compositions using real-time rendering and generative AI output without affecting a hero product asset.

visual design, NVIDIA Omniverse, Blueprint, simulation, Enterprise

nvidia / Build an AI Virtual Assistant

Create intelligent virtual assistants for customer service across every industry

Customer Service, Launchable, Blueprint, Retrieval-augmented generation, llm, contact center, NVIDIA AI

nvidia / nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

Indic, chat, Text-to-Text, Language Generation

nvidia / Vulnerability Analysis for Container Security

Rapidly identify and mitigate container security vulnerabilities with generative AI.

generative ai, Launchable, nv-embedqa-e5-v5, Blueprint, llama-3_1-70b-instruct, cybersecurity, NVIDIA AI

institute-of-science-tokyo / llama-3.1-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

Sovereign AI, Large Language Model, chat, Regional Language Generation

institute-of-science-tokyo / llama-3.1-swallow-8b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

Sovereign AI, Large Language Model, chat, Regional Language Generation

nvidia / studiovoice

Enhance speech by correcting common audio degradations to create studio quality speech output.

Nvidia Maxine, Speech-to-speech, Digital Human, Run-on-RTX, Speech Enhancement

nvidia / llama-3.1-nemotron-70b-reward

Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.

Text-to-text, Reward Model, RLHF

nvidia / vila

Multi-modal vision-language model that understands text/img/video and creates informative responses

VLM, Vision language model, image caption, image to text

hive / ai-generated-image-detection

Robust image classification model for detecting and managing AI-generated content.

image classification, computer vision, AI safety, Content moderation

yentinglin / llama-3-taiwan-70b-instruct

Sovereign AI model finetuned on Traditional Mandarin and English data using the Llama-3 architecture.

regional language generation, chat, Code Generation, Large Language Models

tokyotech-llm / llama-3-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

Large Language Model, chat, Regional Language Generation

nvidia / Build A Generative Virtual Screening Pipeline

This blueprint shows how generative AI and accelerated NIM microservices can design optimized small molecules smarter and faster.

Chemistry, NIM, NVIDIA BioNemo, Blueprint, Enterprise, BioNemo, Docking, Drug Discovery

ai21labs / jamba-1.5-mini-instruct

Cutting-edge MOE based LLM designed to excel in a wide array of generative AI tasks.

chat, Language Generation, Text-to-text

nvidia / nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

chat, Text-to-Text, Language Generation

nvidia / mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbot, virtual assistants, and content generation.

language generation, text-to-text, chat, small language model

microsoft / phi-3.5-mini-instruct

Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments

chat, Code Generation, Text-to-Text, Language Generation, Large Language Models

nvidia / nv-dinov2

NV-DINOv2 is a visual foundation model that generates vector embeddings for the input image.

Image-to-Embedding, computer vision, deepstream, NVIDIA NIM, object Classification

nvidia / nv-grounding-dino

Grounding DINO is an open-vocabulary zero-shot object detection model.

Object Detection, computer vision, deepstream, NVIDIA NIM

nvidia / megatron-1b-nmt

Enable smooth global interactions in 36 languages.

Text Translation, Neural machine translation, NVIDIA NIM

nvidia / parakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

ASR, Streaming, English, Speech-to-Text, batch, NVIDIA NIM

nvidia / parakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

ASR, Streaming, English, Batch, Speech-to-Text, Fast, NVIDIA NIM, Run-on-RTX

google / gemma-2-2b-it

Advanced small language generative AI model for edge applications

chat, Code Generation, Text-to-Text, Language Generation

nvidia / usdsearch

AI-powered search for OpenUSD data, 3D models, images, and assets using text or image-based inputs.

OpenUSD, Synthetic Data Generation, Digital Twin, USD, Text-to-3D

nvidia / eyecontact

Estimate gaze angles of a person in a video and redirect to make it frontal.

telepresence, Nvidia Maxine, Digital Human

nvidia / usdvalidate

Verify compatibility of OpenUSD assets with instant RTX render and rule-based validation.

Validation, OpenUSD, Synthetic Data Generation, Digital Twin, USD, Visualization 3D

nvidia / nv-rerankqa-mistral-4b-v3

Multilingual text reranking model.

nemo retriever, Reranking, Retrieval Augmented Generation

nvidia / nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

Embedding, run-on-rtx, Retrieval Augmented Generation, Nemo retriever, Text-to-Embedding

nvidia / nv-embedqa-mistral-7b-v2

Multilingual text question-answering retrieval, transforming textual information into dense vector representations.

nemo retriever, Embedding, Retrieval Augmented Generation

nvidia / maisi

MAISI is a pre-trained volumetric (3D) CT Latent Diffusion Generative Model.

Image Generation, Medical Imaging, NVIDIA NIM

nvidia / llama3-chatqa-1.5-8b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

text-to-text, chat, Non-Commercial Use Only

nvidia / nvclip

NV-CLIP is a multimodal embeddings model for image and text.

Computer vision, multimodal embeddings, text and image, Run-on-rtx

nvidia / ocdrnet

OCDNet and OCRNet are pre-trained models designed for optical character detection and recognition respectively.

Optical Character Recognition, image, Optical Character Detection, cv, vlm, computer vision, TAO Toolkit, video

nvidia / nv-embed-v1

Generates high-quality numerical embeddings from text inputs.

Non-Commercial Use Only, Retrieval Augmented Generation, Text-to-Embedding

nvidia / visual-changenet

Visual Changenet detects pixel-level change maps between two images and outputs a semantic change segmentation mask

image, Image Generation, cv, Image Segmentation, vlm, computer vision, TAO Toolkit, video, NVIDIA NIM

nvidia / retail-object-detection

EfficientDet-based object detection network to detect 100 specific retail objects from an input video.

Object Detection, image, cv, vlm, computer vision, TAO Toolkit, video, NVIDIA NIM

google / paligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

image, cv, Vision Assistant, vlm, Visual Question Answering, computer vision, Language Generation, Image-to-Text, video

aisingapore / sea-lion-7b-instruct

LLM to represent and serve the linguistic and cultural diversity of Southeast Asia

Chat, Text-to-Text, Regional Language Generation, Large Language Models

mistralai / mixtral-8x22b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning, chat, Code Generation, Text-to-Text, Large Language Models

nvidia / rerank-qa-mistral-4b

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

Ranking, Retrieval Augmented Generation

nvidia / vista-3d

VISTA-3D is a specialized interactive foundation model for segmenting and annotating human anatomies.

Interactive Annotation, Image Segmentation, Non-Commercial Use Only, Medical Imaging

mistralai / mistral-7b-instruct-v0.2

This LLM follows instructions, completes requests, and generates creative text.

chat, Text-to-Text, Language Generation, NVIDIA NIM

nvidia / molmim

MolMIM performs controlled generation, finding molecules with the right properties.

Chemistry, nim, BioNemo, Molecule Generation, Drug Discovery

mistralai / mixtral-8x7b-instruct-v0.1

An MOE LLM that follows instructions, completes requests, and generates creative text.

Advanced Reasoning, chat, Code Generation, Text-to-Text, Large Language Models

nvidia / cuopt

World-record accuracy and performance for complex route optimization.

Route Optimization