FLUX.1 is a state-of-the-art suite of image generation models
A general-purpose multimodal, multilingual 128-expert MoE model with 17B parameters.
A multimodal, multilingual 16-expert MoE model with 17B parameters.
Develop an AI-powered weather analysis and forecasting application that visualizes multi-layered geospatial data.
Investigate, understand, and interpret single-cell data in minutes, not days, by leveraging RAPIDS-singlecell, powered by NVIDIA RAPIDS
Easily run essential genomics workflows and save time by leveraging Parabricks
Generates physics-aware video world states from text and image prompts for physical AI development.
Generates future frames of a physics-aware world state from just an image or short video prompt for physical AI development.
Natural and expressive voices in multiple languages. For voice agents and brand ambassadors.
Generates a multiple sequence alignment from a query sequence and a protein sequence database search.
Cutting-edge open multimodal model excelling in high-quality reasoning from images.
Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments
Cutting-edge open multimodal model excelling in high-quality reasoning from image and audio inputs.
Robust Speech Recognition via Large-Scale Weak Supervision.
Multi-lingual model supporting speech-to-text recognition and translation.
Multi-lingual model supporting speech-to-text recognition and translation.
Multilingual LLM trained on NVIDIA DGX Cloud, designed for mission-critical use cases in regulated industries including financial services, government, and heavy industry
Instruction-tuned LLM achieving SoTA performance on reasoning, math, and general knowledge
Multilingual LLM with emphasis on European languages, supporting regulated use cases including financial services, government, and heavy industry
This blueprint shows how generative AI and accelerated NIM microservices can design protein binders smarter and faster.
Transform PDFs into AI podcasts for engaging on-the-go audio content.
Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.
Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.
Advanced LLM for reasoning, math, general knowledge, and function calling
Converts streamed audio to facial blendshapes for real-time lip-syncing and facial performances.
Automatic speech recognition model that transcribes speech in lowercase English with record-setting accuracy and performance
FourCastNet predicts global atmospheric dynamics of various weather / climate variables.
Shutterstock Generative 3D service for 360 HDRi generation. Trained on NVIDIA Edify using Shutterstock’s licensed creative libraries.
Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA in order to improve the helpfulness of LLM generated responses.
State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Cutting-edge vision-language model excelling in high-quality reasoning from images.
Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.
Predicts the 3D structure of a protein from its amino acid sequence.
Generates consistent characters across a series of images without requiring additional training.
Predicts the 3D structure of a protein from its amino acid sequence.
Sovereign AI model finetuned on Traditional Mandarin and English data using the Llama-3 architecture.
This blueprint shows how generative AI and accelerated NIM microservices can design optimized small molecules smarter and faster.
Cutting-edge open multimodal model excelling in high-quality reasoning from images.
Advanced LLM based on a Mixture of Experts architecture to deliver compute-efficient content generation
Lightweight multilingual LLM powering AI applications in latency-bound, memory- and compute-constrained environments
Grounding DINO is an open-vocabulary zero-shot object detection model.
Natural, high-fidelity, English voices for personalizing text-to-speech services and voiceovers
Enable smooth global interactions in 36 languages.
Expressive and engaging English voices for Q&A assistants, brand ambassadors, and service robots
Record-setting accuracy and performance for English transcription.
State-of-the-art accuracy and speed for English transcriptions.
ProteinMPNN is a deep learning model for predicting amino acid sequences for protein backbones.
Vision foundation model capable of performing diverse computer vision and vision language tasks.
Advanced small language generative AI model for edge applications
Create facial animations using a portrait photo and synchronize mouth movement with audio.
Verify compatibility of OpenUSD assets with instant RTX render and rule-based validation.
Model for writing and interacting with code across a wide range of programming languages and tasks.
Supports Chinese and English chat, coding, math, instruction following, and quiz solving
Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks.
Powers complex conversations with superior contextual understanding, reasoning and text generation.
Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.
Most advanced language model for reasoning, code, multilingual tasks; runs on a single GPU.
Multilingual text reranking model.
English text embedding model for question-answering retrieval.
Cutting-edge lightweight open language model excelling in high-quality reasoning.
Advanced programming model for code completion, summarization, and generation
Advanced programming model for code completion, summarization, and generation
Cutting-edge text generation model for text understanding, transformation, and code generation.
Cutting-edge text generation model for text understanding, transformation, and code generation.
Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.
Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.
Grades responses on five attributes: helpfulness, correctness, coherence, complexity, and verbosity.
Creates diverse synthetic data that mimics the characteristics of real-world data.
Advanced text-to-image model for generating high quality images
Generates high-quality numerical embeddings from text inputs.
Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.
Visual ChangeNet detects pixel-level changes between two images and outputs a semantic change segmentation mask
Advanced programming model for code generation, completion, reasoning, and instruction following.
Software programming LLM for code generation, completion, explanation, and multi-turn conversation.
Software programming LLM for code generation, completion, explanation, and multi-turn conversation.
EfficientDet-based object detection network to detect 100 specific retail objects from an input video.
A generative model of protein backbones for protein binder design.
Cutting-edge lightweight open language model excelling in high-quality reasoning.
Long-context, cutting-edge lightweight open language model excelling in high-quality reasoning.
Cutting-edge lightweight open language model excelling in high-quality reasoning.
Cutting-edge open multimodal model excelling in high-quality reasoning from images.
Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.
Optimized community model for text embedding.
Lightweight, state-of-the-art open LLM with strong math and logical reasoning skills.
An MoE LLM that follows instructions, completes requests, and generates creative text.
Powers complex conversations with superior contextual understanding, reasoning and text generation.
Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.
Language model based on a novel recurrent architecture for faster inference when generating long sequences.
Cutting-edge model built on Google's Gemma-7B specialized for code generation and code completion.
GPU-accelerated generation of text embeddings used for question-answering retrieval.
GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.
Generate images and stunning visuals with realistic aesthetics.
Run Google's DeepVariant optimized for GPU. Switch models for high accuracy on all major sequencers.
Stable Video Diffusion (SVD) is a generative diffusion model that leverages a single image as a conditioning frame to synthesize video sequences.
A fast generative text-to-image model that can synthesize photorealistic images from a text prompt in a single network evaluation
An MoE LLM that follows instructions, completes requests, and generates creative text.