NVIDIA
Explore Models Blueprints GPUs Docs
Terms of Use

|

Privacy Policy

|

Manage My Privacy

|

Contact

Copyright © 2025 NVIDIA Corporation

Search Results

Searching for: AST
Sorting by Most Recent

metallama-guard-4-12b

Multi-modal model to classify safety for input prompts as well output responses.

llm multimodal safetycontent safetyguardrailcontent moderatormeta

nvidiallama-3.2-nemoretriever-1b-vlm-embed-v1

Multimodal question-answer retrieval representing user queries as text and documents as images.

nemo retrieverembeddingretrieval augmented generationtext-to-embeddingnvidia

nvidiaSafety for Agentic AI

Improve safety, security, and privacy of AI systems at build, deploy and run stages.

securitylaunchableblueprintsafetyprivacynemo guardrailsopen modelsnvidia ainvidia

speakleashbielik-11b-v2.3-instruct

State-of-the-art model for Polish language processing tasks such as text generation, Q&A, and chatbots.

polishsovereign aichatchatbotssummarizationspeakleash

nvidiallama-3.1-nemotron-nano-4b-v1.1

State-of-the-art open model for reasoning, code, math, and tool calling - suitable for edge agents

edgetool callingreasoningmathnvidia

marinmarin-8b-instruct

State-of-the-art open model trained on open datasets, excelling in reasoning, math, and science.

reasoningchatscienceopen modelmathmarin

qwenqwen3-235b-a22b

Advanced reasoing MOE mode excelling at reasoning, multilingual tasks, and instruction following

chatcomplex mathadvanced reasoninginstruction followingqwen

black-forest-labsFLUX.1-schnell

FLUX.1-schnell is a distilled image generation model, producing high quality images at fast speeds

image generationtext-to-imagerun-on-rtxblack-forest-labs

utter-projecteurollm-9b-instruct

State-of-the-art, multilingual model tailored to all 24 official European Union languages.

sovereign aichatchattext-to-textmultilingualeuropeanregional language generationutter-project

mistralaimistral-small-3.1-24b-instruct-2503

Efficient multimodal model excelling at multilingual tasks, image understanding, and fast-responses

language generationmultimodalimage understandingmistralai

nvidiaparakeet-1.1b-rnnt-multilingual-asr

High accuracy and optimized performance for transcription in 25 languages

asrstreamingspeech-to-textmultilingualnvidia nimnvidia

black-forest-labsFLUX.1-dev

FLUX.1 is a state-of-the-art suite of image generation models

image generationtext-to-imagerun-on-rtxblack-forest-labs

nvidiacosmos-predict1-7b

Generalist model to generate future world state as videos from text and image prompts to create synthetic training data for robots and autonomous vehicles.

synthetic data generationautonomous vehiclesphysical airoboticstext-to-worldimage-to-worldnvidia

nvidiaTest Multi-Robot Fleets for Industrial Automation

Simulate, test, and optimize physical AI and robotic fleets at scale in industrial digital twins before real-world deployment.

industrialnvidia omniverseblueprintsimulationenterpriseomniverse blueprintnvidia

nvidiaLLM Router

Route LLM requests to the best model for the task at hand.

launchableblueprintllm routernvidia ainvidia

openaiwhisper-large-v3

Robust Speech Recognition via Large-Scale Weak Supervision.

asrastspeech-to-textbatchwhisperopenaimultilingualnvidia nimnvidia rivaopenai

nvidiacanary-1b-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asraststreamingspeech-to-textbatchspanishmultilingualnvidia nimnvidia rivanvidia

nvidiacanary-0.6b-turbo-asr

Multi-lingual model supporting speech-to-text recognition and translation.

asrastfastspeech-to-textbatchmultilingualnvidia nimnvidia rivanvidia

deepseek-aideepseek-r1

State-of-the-art, high-efficiency LLM excelling in reasoning, math, and coding.

chatmathadvanced reasoningdeepseek-ai

nvidiaBuild an Enterprise RAG pipeline

Continuously extract, embed, and index multimodal data for fast, accurate semantic search. Built on world-class NeMo Retriever models, the RAG blueprint connects AI applications to multimodal enterprise data wherever it resides.

nimlaunchableblueprintenterpriseretrieval-augmented generationnvidia ainemo retrievernvidia

metasam2

SAM 2 is a segmentation model that enables fast, precise selection of any object in any video or image.

metacomputer visionsegmentationvideo

nvidiausdcode

State-of-the-art LLM that answers OpenUSD knowledge queries and generates USD-Python code.

openusdsynthetic data generationdigital twincode generationchatnvidia nimnvidia

university-at-buffalocached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nemo retrieverchart element detectionimage-to-textuniversity-at-buffalo

baidupaddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

optical character recognitiontable extractionoptical character detectionnemo retrieverdata ingestionrun-on-rtxextractionbaidu

nvidiaconformer-ctc-asr

Automatic speech recognition model that transcribes speech in lower case English with record-setting accuracy and performance

asrstreamingspeech-to-textspanishnvidia nimnvidia rivanvidia

nvidiamistral-nemo-minitron-8b-8k-instruct

State-of-the-art small language model delivering superior accuracy for chatbot, virtual assistants, and content generation.

small language modelchatcode generationchattext-to-textlanguage generationnvidia

metallama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chatcode generationchattext-to-textlanguage generationmeta

metallama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

chatcode generationtext-to-textlanguage generationmeta

nvidiamistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbot, virtual assistants, and content generation.

language generationtext-to-textchatsmall language modelnvidia

rakutenrakutenai-7b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chatchattext-to-textlanguage generationlarge language modelsrakuten

rakutenrakutenai-7b-chat

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chatchattext-to-textlanguage generationlarge language modelsrakuten

nvidiaparakeet-ctc-1.1b-asr

Record-setting accuracy and performance for English transcription.

asrstreamingenglishspeech-to-textbatchnvidia nimnvidia

nvidiaparakeet-ctc-0.6b-asr

State-of-the-art accuracy and speed for English transcriptions.

asrstreamingenglishbatchspeech-to-textfastnvidia nimrun-on-rtxnvidia

metallama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

code generationchattext-to-textlanguage generationrun-on-rtxmeta

googlepaligemma

Vision language model adept at comprehending text and visual inputs to produce informative responses

imagecvvision assistantvlmvisual question answeringcomputer visionlanguage generationimage-to-textvideogoogle

microsoftphi-3-mini-4k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chatcode generationchattext-to-textlanguage generationlarge language modelsmicrosoft

databricksdbrx-instruct

A general-purpose LLM with state-of-the-art performance in language understanding, coding, and RAG.

chatchattext-to-textlanguage generationlarge language modelsdatabricks

microsoftphi-3-mini-128k-instruct

Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.

chatcode generationchattext-to-textlanguage generationlarge language modelsmicrosoft

metallama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.

chatcode generationchattext-to-textlanguage generationlarge language modelsmeta

stabilityaistable-video-diffusion

Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences.

image generationtext-to-imagestabilityai

stabilityaisdxl-turbo

A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation

image generationtext-to-imagestabilityai