
Copyright © 2025 NVIDIA Corporation

Search Results

Searching for: Inference
Sorting by Most Recent

nvidia/byom-llm

Deploy and test your custom HuggingFace models on NVIDIA infrastructure. This service enables rapid prototyping with any HuggingFace-compatible model, providing instant access to high-performance GPUs and a fully functional API endpoint.

deepseek-ai/deepseek-v3.1-terminus

DeepSeek-V3.1: a hybrid-inference LLM with Think/Non-Think modes, stronger agent capabilities, 128K context, and strict function calling.

nvidia/llama-3.1-nemotron-ultra-253b-v1

High inference efficiency with leading accuracy on scientific reasoning, complex math, coding, tool calling, and instruction following.

nvidia/nemotron-4-mini-hindi-4b-instruct

A bilingual Hindi-English SLM for on-device inference, tailored for the Hindi language.

nvidia/nemotron-mini-4b-instruct

An SLM optimized for on-device inference, fine-tuned for roleplay, RAG, and function calling.

meta/llama-3.1-405b-instruct

Advanced LLM for synthetic data generation, distillation, and inference for chatbots, coding, and domain-specific tasks.
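Models in this catalog are served behind NVIDIA's OpenAI-compatible chat-completions endpoint (`https://integrate.api.nvidia.com/v1/chat/completions`). A minimal sketch of invoking one of the listed models is shown below; the `NVIDIA_API_KEY` environment-variable name, the prompt, and the choice of `meta/llama-3.1-405b-instruct` are illustrative assumptions, not part of the listing itself.

```python
import json
import os
import urllib.request

# Invocation endpoint for models in the NVIDIA API catalog (OpenAI-compatible).
INVOKE_URL = "https://integrate.api.nvidia.com/v1/chat/completions"


def build_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Assemble an OpenAI-style chat-completions payload for a catalog model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.2,
        "stream": False,
    }


def invoke(model: str, prompt: str, api_key: str) -> str:
    """POST the payload and return the assistant message text."""
    req = urllib.request.Request(
        INVOKE_URL,
        data=json.dumps(build_request(model, prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )
    with urllib.request.urlopen(req, timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    # NVIDIA_API_KEY is an assumed variable name for your catalog API key.
    key = os.environ.get("NVIDIA_API_KEY")
    if key:
        print(invoke("meta/llama-3.1-405b-instruct",
                     "Briefly explain KV caching in LLM inference.", key))
```

The same payload shape works for any entry above by swapping the `model` slug; only the request body's `model` field changes.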