Copyright © 2025 NVIDIA Corporation

Models

Deploy and scale models on your GPU infrastructure of choice with NVIDIA NIM inference microservices

mistralai/mistral-nemotron

Built for agentic workflows, this model excels at coding, instruction following, and function calling.

language generation, chat, instruction following, function calling

mistralai/mistral-small-3.1-24b-instruct-2503

Efficient multimodal model excelling at multilingual tasks, image understanding, and fast responses.

language generation, chat, multimodal, image understanding

mistralai/mistral-medium-3-instruct

Powerful, multimodal language model designed for enterprise applications, including software development, data analysis, and reasoning.

language generation, chat, Image-to-Text, multimodal, visual question answering

nvidia/nv-embedcode-7b-v1

The NV-EmbedCode model is a 7B Mistral-based embedding model optimized for code retrieval, supporting text, code, and hybrid queries.

nemo retriever, Embedding, Retrieval Augmented Generation

mistralai/mistral-small-24b-instruct

Latency-optimized language model excelling in code, math, general knowledge, and instruction-following.

code, chat, reasoning, agent-centric, multilingual

nvidia/mistral-nemo-minitron-8b-base

State-of-the-art small language model delivering superior accuracy for chatbots, virtual assistants, and content generation.

language generation, text-to-text, chat, small language model

nvidia/nv-rerankqa-mistral-4b-v3

Multilingual text reranking model.

nemo retriever, Reranking, Retrieval Augmented Generation

nvidia/nv-embedqa-mistral-7b-v2

Multilingual text embedding model for question-answering retrieval that transforms textual information into dense vector representations.

nemo retriever, Embedding, Retrieval Augmented Generation

mistralai/mistral-7b-instruct-v0.3

This LLM follows instructions, completes requests, and generates creative text.

chat, Text-to-Text, Language Generation

nvidia/rerank-qa-mistral-4b

GPU-accelerated model that scores the probability that a given passage contains the information needed to answer a question.

Ranking, Retrieval Augmented Generation

mistralai/mistral-7b-instruct-v0.2

This LLM follows instructions, completes requests, and generates creative text.

chat, Text-to-Text, Language Generation, NVIDIA NIM