
Cutting-edge vision-language model excelling in retrieving text and metadata from images.

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs.

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

High-efficiency LLM with hybrid Transformer-Mamba design, excelling in reasoning and agentic tasks.

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

High-efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

GPU-accelerated model optimized to score the probability that a given passage contains the information needed to answer a question.

Multimodal question-answer retrieval model that represents user queries as text and documents as images.

Built for agentic workflows, this model excels in coding, instruction following, and function calling.

Multimodal vision-language model that understands text and images and generates informative responses.

State-of-the-art open model for reasoning, code, math, and tool calling, suitable for edge agents.

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

Model with leading accuracy for reasoning and agentic AI on PC and edge devices.

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

Industry-leading jailbreak classification model for protection against adversarial attempts.

Leading content safety model for enhancing the safety and moderation capabilities of LLMs.

Multimodal vision-language model that understands text, images, and video and generates informative responses.

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

Context-aware chart extraction model that detects 18 classes of basic chart elements, excluding plot elements.

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.

Optimized SLM for on-device inference, fine-tuned for roleplay, RAG, and function calling.

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

Multilingual text reranking model.

English text embedding model for question-answering retrieval.

Multilingual text question-answering retrieval, transforming textual information into dense vector representations.

Optimized community model for text embedding.