Try NVIDIA NIM APIs

Search Results

Searching for: NeMo

Sorting by Most Recent

nvidia Domain-Specific Reasoning LLMs

Accelerate domain adaptation of LLMs.

nvidia llama-3_2-nemoretriever-300m-embed-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

nvidia Retail Shopping Assistant

Elevate Shopping Experiences Online and In Stores.

nvidia nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

nvidia nemoretriever-ocr-v1

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

nvidia Streaming Data to RAG

Sensor-captured radio data enables real-time awareness and AI-driven analytics for actionable, searchable insights.

nvidia llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

nvidia llama-3_2-nemoretriever-300m-embed-v1

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

nvidia nemoretriever-ocr

Powerful OCR model for fast, accurate real-world image text extraction, layout, and structure analysis.

nvidia llama-3.2-nemoretriever-500m-rerank-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nvidia llama-3.2-nemoretriever-1b-vlm-embed-v1

Multimodal question-answer retrieval representing user queries as text and documents as images.

mistralai mistral-nemotron

Built for agentic workflows, this model excels in coding, instruction following, and function calling.

nvidia Refine AI Agents through Continuous Model Distillation with Data Flywheels

Build a data flywheel, with NVIDIA NeMo microservices, that continuously optimizes AI agents for latency and cost — while maintaining accuracy targets.

iguazio AI Orchestration for Data Flywheel

Orchestrate AI agents for data flywheel with MLRun and NVIDIA NeMo microservices.

nvidia Safety for Agentic AI

Improve safety, security, and privacy of AI systems at build, deploy and run stages.

nvidia llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text and images and creates informative responses.

nvidia llama-3.1-nemotron-nano-4b-v1.1

State-of-the-art open model for reasoning, code, math, and tool calling, suitable for edge agents.

nvidia llama-3.1-nemotron-ultra-253b-v1

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

nvidia Build an AI Agent for Enterprise Research

Build a custom deep researcher powered by state-of-the-art models that continuously process and synthesize multimodal enterprise data, enabling reasoning, planning, and refinement to generate comprehensive reports.

nvidia llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

nvidia llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

nvidia nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

nvidia nemoretriever-table-structure-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

nvidia nemoretriever-graphic-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

nvidia nemoretriever-page-elements-v2

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

nvidia nemoretriever-parse

Cutting-edge vision-language model excelling in retrieving text and metadata from images.

nvidia Build an Enterprise RAG pipeline

Continuously extract, embed, and index multimodal data for fast, accurate semantic search. Built on world-class NeMo Retriever models, the RAG blueprint connects AI applications to multimodal enterprise data wherever it resides.

nvidia llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

nvidia nemoguard-jailbreak-detect

Industry-leading jailbreak classification model for protection from adversarial attempts.

nvidia llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs.

nvidia cosmos-nemotron-34b

Multi-modal vision-language model that understands text, images, and video and creates informative responses.

nvidia llama-3.2-nv-embedqa-1b-v2

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

nvidia llama-3.2-nv-rerankqa-1b-v2

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

university-at-buffalo cached

Context-aware chart extraction that can detect 18 classes for chart basic elements, excluding plot elements.

nvidia nv-yolox-page-elements-v1

Model for object detection, fine-tuned to detect charts, tables, and titles in documents.

baidu paddleocr

Model for table extraction that receives an image as input, runs OCR on the image, and returns the text within the image and its bounding boxes.

nvidia nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for the Hindi language.

nvidia llama-3.1-nemotron-70b-instruct

Llama-3.1-Nemotron-70B-Instruct is a large language model customized by NVIDIA in order to improve the helpfulness of LLM generated responses.

nvidia llama-3.1-nemotron-70b-reward

Leaderboard-topping reward model supporting RLHF for better alignment with human preferences.

nvidia llama-3.1-nemotron-51b-instruct

Unique language model that delivers an unmatched balance of accuracy and efficiency.

nvidia nemotron-mini-4b-instruct

Optimized SLM for on-device inference, fine-tuned for roleplay, RAG, and function calling.

nvidia mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

nv-mistralai mistral-nemo-12b-instruct

Most advanced language model for reasoning, code, and multilingual tasks; runs on a single GPU.

nvidia nv-rerankqa-mistral-4b-v3

Multilingual text reranking model.

nvidia nv-embedqa-e5-v5

English text embedding model for question-answering retrieval.

nvidia nv-embedqa-mistral-7b-v2

Multilingual text question-answering retrieval, transforming textual information into dense vector representations.

snowflake arctic-embed-l

Optimized community model for text embedding.

google deplot

Translate images of plots into tables with one-shot visual language understanding.