
Copyright © 2025 NVIDIA Corporation

Search Results

Searching for: Inference
Sorting by Most Recent

nvidia/byom-llm

Deploy and test your custom HuggingFace models on NVIDIA infrastructure. This service enables rapid prototyping with any HuggingFace-compatible model, providing instant access to high-performance GPUs and a fully functional API endpoint.

deepseek-ai/deepseek-v3.1-terminus

DeepSeek-V3.1: a hybrid-inference LLM with Think/Non-Think modes, stronger agent capabilities, 128K context, and strict function calling.

nvidia/llama-3.1-nemotron-ultra-253b-v1

High inference efficiency with leading accuracy on scientific reasoning, complex math, coding, tool calling, and instruction following.

nvidia/nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored for the Hindi language.

nvidia/nemotron-mini-4b-instruct

An SLM optimized for on-device inference, fine-tuned for roleplay, RAG, and function calling.

meta/llama-3.1-405b-instruct

Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks.
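Models in this catalog are served behind NVIDIA's OpenAI-compatible chat-completions endpoint (`https://integrate.api.nvidia.com/v1/chat/completions`). A minimal sketch of invoking one of the listed models is shown below; the `NVIDIA_API_KEY` environment-variable name, the prompt, and the choice of `meta/llama-3.1-405b-instruct` are illustrative assumptions, not part of the listing itself.

```python
import json
import os
import urllib.request

# Invocation endpoint for models in the NVIDIA API catalog (OpenAI-compatible).
INVOKE_URL = "https://integrate.api.nvidia.com/v1/chat/completions"


def build_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat-completions payload for a catalog model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,
        "stream": False,
    }


def invoke(model: str, prompt: str, api_key: str) -> str:
    """POST the payload and return the assistant message text."""
    req = urllib.request.Request(
        INVOKE_URL,
        data=json.dumps(build_request(model, prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # NVIDIA_API_KEY is an assumed variable name for your catalog API key.
    key = os.environ.get("NVIDIA_API_KEY")
    if key:
        print(invoke("meta/llama-3.1-405b-instruct",
                     "Briefly explain KV caching in LLM inference.", key))
```

The same payload shape works for any entry above by swapping the `model` slug; only the request body's `model` field changes.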