Try NVIDIA NIM APIs

Inkling is a multimodal (text + image) reasoning model from Thinking Machines — a Mamba-hybrid, 256-expert Mixture-of-Experts architecture with tool use and switchable reasoning.

Model

text-to-text

reasoning
image-to-text
multimodal

Last updated on July 16, 2026

Meta

Downloadable
Free Endpoint

llama-3.1-70b-instruct

Powers complex conversations with superior contextual understanding, reasoning and text generation.

Model

Chat

Text-to-Text
Language Generation
Code Generation

5M API calls in the last 30 days

Last updated on June 12, 2025

Meta

Downloadable
Free Endpoint

llama-3.2-1b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Model

chat

Text-to-Text
Language Generation
Code Generation

40K downloads in the last 30 days

545K API calls in the last 30 days

Last updated on May 21, 2025

Meta

Downloadable
Free Endpoint

llama-3.2-3b-instruct

Advanced state-of-the-art small language model with language understanding, superior reasoning, and text generation.

Model

Chat

Text-to-Text
Language Generation
Code Generation

27K downloads in the last 30 days

1M API calls in the last 30 days

Last updated on May 22, 2025

Minimaxai

Free Endpoint

minimax-m3

MiniMax M3 Preview is a multimodal MoE vision-language model with strong reasoning, coding, and tool-calling capabilities.

Model

coding

text-to-text
reasoning

10M API calls in the last 30 days

Last updated on June 12, 2026

Upstage

Deprecated
Free Endpoint

solar-10.7b-instruct

Excels in NLP tasks, particularly in instruction-following, reasoning, and mathematics.

Model

Non-Commercial Use Only

chat
Text-to-Text
Language Generation
Large Language Models

527K API calls in the last 30 days

Last updated on April 10, 2025

OpenAI

Downloadable
Free Endpoint

gpt-oss-120b

Mixture of Experts (MoE) reasoning LLM (text-only) designed to fit within a single 80 GB GPU.

Model

reasoning

text-to-text
chat
math

45M API calls in the last 30 days

Last updated on August 5, 2025

OpenAI

Downloadable
Free Endpoint

gpt-oss-20b

Smaller Mixture of Experts (MoE) text-only LLM for efficient AI reasoning and math.

Model

reasoning

text-to-text
chat
math

19M API calls in the last 30 days

Last updated on August 5, 2025

Meta

Downloadable
Free Endpoint

llama-3.1-8b-instruct

Advanced state-of-the-art model with language understanding, superior reasoning, and text generation.

Model

Chat

Text-to-Text
Language Generation
Run-on-RTX
Code Generation

19M API calls in the last 30 days

Last updated on July 9, 2025

Meta

Downloadable
Free Endpoint

llama-3.3-70b-instruct

Advanced LLM for reasoning, math, general knowledge, and function calling.

Model

Instruction following

Math
Reasoning
Text-to-Text
Code Generation

27M API calls in the last 30 days

Last updated on June 12, 2025

Mistral AI

Downloadable
Free Endpoint

mixtral-8x7b-instruct-v0.1

An MoE LLM that follows instructions, completes requests, and generates creative text.

Model

Advanced Reasoning

Chat
Text-to-Text
Large Language Models
Code Generation

1M API calls in the last 30 days

Last updated on July 18, 2025