Try NVIDIA NIM APIs

Explore

Models

Blueprints

GPUs

Docs

Your Privacy Choices

Contact

Search Results

Searching for: Llama Nemotron

Sorting by Most Recent

deepseek-ai deepseek-r1-distill-llama-70b

Distilled version of Llama 3.3 70B using reasoning data generated by DeepSeek R1 for enhanced performance.

nvidia Ambient Healthcare Agents

Build advanced AI agents for providers and patients using this developer example powered by NeMo Microservices, NVIDIA Nemotron, Riva ASR and TTS, and NVIDIA LLM NIM

nvidia nemotron-parse

Cutting-edge vision-language model exceling in retrieving text and metadata from images.

nvidia nemotron-nano-12b-v2-vl

Nemotron Nano 12B v2 VL enables multi-image and video understanding, along with visual Q&A and summarization capabilities.

nvidia llama-3.1-nemotron-safety-guard-8b-v3

Leading multilingual content safety model for enhancing the safety and moderation capabilities of LLMs

nvidia llama-3_2-nemoretriever-300m-embed-v2

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

nvidia 3D Object Generation

Transform your scene idea into ready-to-use 3D assets using Llama 3.1 8B, NV SANA, and Microsoft TRELLIS

nvidia nvidia-nemotron-nano-9b-v2

High‑efficiency LLM with hybrid Transformer‑Mamba design, excelling in reasoning and agentic tasks.

nvidia llama-3.3-nemotron-super-49b-v1.5

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

nvidia llama-3_2-nemoretriever-300m-embed-v1

Multilingual, cross-lingual embedding model for long-document QA retrieval, supporting 26 languages.

meta llama-guard-4-12b

Multi-modal model to classify safety for input prompts as well output responses.

nvidia llama-3.2-nemoretriever-500m-rerank-v2

GPU-accelerated model optimized for providing a probability score that a given passage contains the information to answer a question.

nvidia llama-3.2-nemoretriever-1b-vlm-embed-v1

Multimodal question-answer retrieval representing user queries as text and documents as images.

mistralai mistral-nemotron

Built for agentic workflows, this model excels in coding, instruction following, and function calling

nvidia llama-3.1-nemotron-nano-vl-8b-v1

Multi-modal vision-language model that understands text/img and creates informative responses

nvidia llama-3.1-nemotron-nano-4b-v1.1

State-of-the-art open model for reasoning, code, math, and tool calling - suitable for edge agents

nvidia llama-3.1-nemotron-ultra-253b-v1

Superior inference efficiency with highest accuracy for scientific and complex math reasoning, coding, tool calling, and instruction following.

meta llama-4-maverick-17b-128e-instruct

A general purpose multimodal, multilingual 128 MoE model with 17B parameters.

meta llama-4-scout-17b-16e-instruct

A multimodal, multilingual 16 MoE model with 17B parameters.

nvidia Build an AI for Enterprise Research

Build a custom enterprise research assistant powered by state-of-the-art models that process and synthesize multimodal data, enabling reasoning, planning, and refinement to generate comprehensive reports.

nvidia llama-3.3-nemotron-super-49b-v1

High efficiency model with leading accuracy for reasoning, tool calling, chat, and instruction following.

nvidia llama-3.1-nemotron-nano-8b-v1

Leading reasoning and agentic AI accuracy model for PC and edge.

deepseek-ai deepseek-r1-distill-llama-8b

Distilled version of Llama 3.1 8B using reasoning data generated by DeepSeek R1 for enhanced performance.

nvidia Build an Enterprise RAG Pipeline Blueprint

Power fast, accurate semantic search across multimodal enterprise data with NVIDIA’s RAG Blueprint—built on NeMo Retriever and Nemotron models—to connect your agents to trusted, authoritative sources of knowledge.

nvidia llama-3.1-nemoguard-8b-topic-control

Topic control model to keep conversations focused on approved topics, avoiding inappropriate content.

nvidia llama-3.1-nemoguard-8b-content-safety

Leading content safety model for enhancing the safety and moderation capabilities of LLMs

langchain Structured Report Generation

Generate detailed, structured reports on any topic using LangGraph and Llama3.3 70B NIM.

crewai Code Documentation for Software Development

Document your github repositories with AI Agents using CrewAI and Llama3.3 70B NIM.

nvidia cosmos-nemotron-34b

Multi-modal vision-language model that understands text/img/video and creates informative responses

nvidia llama-3.2-nv-embedqa-1b-v2

Multilingual and cross-lingual text question-answering retrieval with long context support and optimized data storage efficiency.

nvidia llama-3.2-nv-rerankqa-1b-v2

Fine-tuned reranking model for multilingual, cross-lingual text question-answering retrieval, with long context support.

meta llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling

nvidia nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored specifically for Hindi Language.

nvidia Vulnerability Analysis for Container Security

Rapidly identify and mitigate container security vulnerabilities with generative AI.

institute-of-science-tokyo llama-3.1-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

institute-of-science-tokyo llama-3.1-swallow-8b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

nvidia llama-3.1-nemotron-70b-reward

Leaderboard topping reward model supporting RLHF for better alignment with human preferences.

meta llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

meta llama-3.2-11b-vision-instruct

Cutting-edge vision-language model exceling in high-quality reasoning from images.

meta llama-3.2-90b-vision-instruct

Cutting-edge vision-Language model exceling in high-quality reasoning from images.

meta llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

abacusai dracarys-llama-3.1-70b-instruct

Fine-tuned Llama 3.1 70B model for code generation, summarization, and multi-language tasks.

yentinglin llama-3-taiwan-70b-instruct

Sovereign AI model finetuned on Traditional Mandarin and English data using the Llama-3 architecture.

tokyotech-llm llama-3-swallow-70b-instruct-v0.1

Sovereign AI model trained on Japanese language that understands regional nuances.

nvidia nemotron-mini-4b-instruct

Optimized SLM for on-device inference and fine-tuned for roleplay, RAG and function calling

meta llama-3.1-405b-instruct

Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks.

meta llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

meta llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

nvidia llama3-chatqa-1.5-8b

Advanced LLM to generate high-quality, context-aware responses for chatbots and search engines.

meta llama3-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

meta llama3-8b-instruct

Advanced state-of-the-art LLM with language understanding, superior reasoning, and text generation.