---
title: "minimax-m2.1"
publisher: "minimaxai"
type: "endpoint"
updated: "2026-02-03T03:11:58.779Z"
description: "MiniMax M2.1 excels in multi-language coding, app/web dev, office AI, and agent integration"
canonical: "https://build.nvidia.com/minimaxai/minimax-m2_1"
---

# MiniMax-M2.1

## Description
MiniMax-M2.1 is a large language model optimized for agentic capabilities including coding, tool use, instruction following, and long-horizon planning. Its stated aim is to show that high-performance agent models need not be closed, so that developers can build autonomous applications for multilingual software development and complex multi-step workflows.

*This model is ready for commercial/non-commercial use.*

## Third-Party Community Consideration:
This model is not owned or developed by NVIDIA. It has been developed and built to a third party's requirements for this application and use case; see the non-NVIDIA [MiniMax-M2.1 Model Card](https://huggingface.co/MiniMaxAI/MiniMax-M2.1).

## License and Terms of Use:

GOVERNING TERMS: Your use of the service is governed by the [NVIDIA API Catalog Terms of Service](https://assets.ngc.nvidia.com/products/api-catalog/legal/NVIDIA%20API%20Trial%20Terms%20of%20Service.pdf). Your use of the model is governed by the [NVIDIA Open Model License Agreement](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/). ADDITIONAL INFORMATION: [Modified MIT License](https://github.com/MiniMax-AI/MiniMax-M2.1/blob/main/LICENSE).

## Deployment Geography:
Global

## Use Case:
Intended for developers and enterprises building autonomous AI agents for software engineering tasks, multilingual code development, automated workflows, tool calling, and long-horizon planning applications.

## Release Date:
**Build.NVIDIA.com:** 01/2026 via [link](https://build.nvidia.com/minimaxai/minimax-m2_1)  
**Hugging Face:** 12/20/2025 via [link](https://huggingface.co/MiniMaxAI/MiniMax-M2.1)

## Reference(s):
- [MiniMax-M2.1 on Hugging Face](https://huggingface.co/MiniMaxAI/MiniMax-M2.1)
- [MiniMax Open Platform API](https://platform.minimax.io/docs/guides/text-generation)
- [MiniMax Agent](https://agent.minimax.io/)
- [arXiv Paper: WebExplorer](https://arxiv.org/abs/2509.06501)
- [VIBE Benchmark](https://huggingface.co/datasets/MiniMaxAI/VIBE)

## Model Architecture:
**Architecture Type:** Transformer  
**Network Architecture:** Mixture-of-Experts Transformer  
**Total Parameters:** 230B

### Input:
**Input Types:** Text  
**Input Formats:** String  
**Input Parameters:** One Dimensional (1D)  
**Other Input Properties:** Input text is tokenized using the model's native tokenizer. Recommended inference parameters: temperature=1.0, top_p=0.95, top_k=40.

### Output:
**Output Types:** Text  
**Output Format:** String  
**Output Parameters:** One Dimensional (1D)  
**Other Output Properties:** Generated text responses with support for tool calling and structured outputs.

__Our AI models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.__

## Software Integration:
**Runtime Engines:**
- **SGLang:** Recommended for serving MiniMax-M2.1
- **vLLM:** Recommended for serving MiniMax-M2.1
- **Transformers:** Supported for local deployment
- **Other:** KTransformers

**Supported Hardware:**
- **NVIDIA Ampere:** A100, A6000, A40
- **NVIDIA Blackwell:** B200, B100, GB200
- **NVIDIA Hopper:** H100, H200
- **NVIDIA Lovelace:** L40S, L40, RTX 6000 Ada Generation

**Preferred/Supported Operating Systems:** Linux

__The integration of foundation and fine-tuned models into AI systems requires additional testing using use-case-specific data to ensure safe and effective deployment. Following the V-model methodology, iterative testing and validation at both unit and system levels are essential to mitigate risks, meet technical and functional requirements, and ensure compliance with safety and ethical standards before deployment.__

## Model Version(s)
MiniMax-M2.1 v2.1

## Training, Testing, and Evaluation Datasets:

### Training Dataset
**Data Modality:** Text  
**Training Data Collection:** Undisclosed  
**Training Labeling:** Undisclosed  
**Training Properties:** Undisclosed

### Testing Dataset
**Testing Data Collection:** Undisclosed  
**Testing Labeling:** Undisclosed  
**Testing Properties:** Undisclosed

### Evaluation Dataset
**Evaluation Benchmark Score:** MiniMax-M2.1 achieves 74.0% on SWE-bench Verified, 49.4% on Multi-SWE-bench, 72.5% on SWE-bench Multilingual, and 47.9% on Terminal-bench 2.0. The model demonstrates strong performance across coding, tool use, and full-stack development benchmarks.

<details>
<summary><strong>Detailed Benchmark Comparison Table</strong></summary>

| Benchmark | MiniMax-M2.1 | MiniMax-M2 | Claude Sonnet 4.5 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 (thinking) | DeepSeek V3.2 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SWE-bench Verified | 74.0 | 69.4 | 77.2 | 80.9 | 78.0 | 80.0 | 73.1 |
| Multi-SWE-bench | 49.4 | 36.2 | 44.3 | 50.0 | 42.7 | x | 37.4 |
| SWE-bench Multilingual | 72.5 | 56.5 | 68 | 77.5 | 65.0 | 72.0 | 70.2 |
| Terminal-bench 2.0 | 47.9 | 30.0 | 50.0 | 57.8 | 54.2 | 54.0 | 46.4 |

| Benchmark | MiniMax-M2.1 | MiniMax-M2 | Claude Sonnet 4.5 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 (thinking) | DeepSeek V3.2 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| SWE-bench Verified (Droid) | 71.3 | 68.1 | 72.3 | 75.2 | x | x | 67.0 |
| SWE-bench Verified (mini-swe-agent) | 67.0 | 61.0 | 70.6 | 74.4 | 71.8 | 74.2 | 60.0 |
| SWT-bench | 69.3 | 32.8 | 69.5 | 80.2 | 79.7 | 80.7 | 62.0 |
| SWE-Perf | 3.1 | 1.4 | 3.0 | 4.7 | 6.5 | 3.6 | 0.9 |
| SWE-Review | 8.9 | 3.4 | 10.5 | 16.2 | x | x | 6.4 |
| OctoCodingbench | 26.1 | 13.3 | 22.8 | 36.2 | 22.9 | x | 26.0 |

| Benchmark | MiniMax-M2.1 | MiniMax-M2 | Claude Sonnet 4.5 | Claude Opus 4.5 | Gemini 3 Pro |
| --- | --- | --- | --- | --- | --- |
| VIBE (Average) | 88.6 | 67.5 | 85.2 | 90.7 | 82.4 |
| VIBE-Web | 91.5 | 80.4 | 87.3 | 89.1 | 89.5 |
| VIBE-Simulation | 87.1 | 77.0 | 79.1 | 84.0 | 89.2 |
| VIBE-Android | 89.7 | 69.2 | 87.5 | 92.2 | 78.7 |
| VIBE-iOS | 88.0 | 39.5 | 81.2 | 90.0 | 75.8 |
| VIBE-Backend | 86.7 | 67.8 | 90.8 | 98.0 | 78.7 |

| Benchmark | MiniMax-M2.1 | MiniMax-M2 | Claude Sonnet 4.5 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 (thinking) | DeepSeek V3.2 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Toolathlon | 43.5 | 16.7 | 38.9 | 43.5 | 36.4 | 41.7 | 35.2 |
| BrowseComp | 47.4 | 44.0 | 19.6 | 37.0 | 37.8 | 65.8 | 51.4 |
| BrowseComp (context management) | 62.0 | 56.9 | 26.1 | 57.8 | 59.2 | 70.0 | 67.6 |
| AIME25 | 83.0 | 78.0 | 88.0 | 91.0 | 96.0 | 98.0 | 92.0 |
| MMLU-Pro | 88.0 | 82.0 | 88.0 | 90.0 | 90.0 | 87.0 | 86.0 |
| GPQA-D | 83.0 | 78.0 | 83.0 | 87.0 | 91.0 | 90.0 | 84.0 |
| HLE w/o tools | 22.2 | 12.5 | 17.3 | 28.4 | 37.2 | 31.4 | 22.2 |
| LCB | 81.0 | 83.0 | 71.0 | 87.0 | 92.0 | 89.0 | 86.0 |
| SciCode | 41.0 | 36.0 | 45.0 | 50.0 | 56.0 | 52.0 | 39.0 |
| IFBench | 70.0 | 72.0 | 57.0 | 58.0 | 70.0 | 75.0 | 61.0 |
| AA-LCR | 62.0 | 61.0 | 66.0 | 74.0 | 71.0 | 73.0 | 65.0 |
| τ²-Bench Telecom | 87.0 | 87.0 | 78.0 | 90.0 | 87.0 | 85.0 | 91.0 |

**Evaluation Methodology Notes:**
- **SWE-bench Verified:** Tested on internal infrastructure using [Claude Code](https://github.com/anthropics/claude-code), [Droid](https://factory.ai/), or [mini-swe-agent](https://github.com/SWE-agent/mini-SWE-agent) as scaffolding. Default system prompt was overridden. Results represent the average of 4 runs.
- **Multi-SWE-bench, SWE-bench Multilingual, SWT-bench, and SWE-Perf:** Tested using Claude Code as scaffolding, with the default system prompt overridden. Results represent the average of 4 runs.
- **Terminal-bench 2.0:** Tested using Claude Code. Full dataset verified and environmental issues fixed. Timeout limits removed, other configurations consistent with official settings. Average of 4 runs.
- **SWE-Review:** Internal benchmark for code defect review covering diverse languages and scenarios. It evaluates both defect recall and hallucination rate. A result counts as "correct" only if the model accurately identifies the target defect with no hallucinations. Average of 4 runs.
- **OctoCodingbench:** Internal benchmark for long-horizon instruction following in complex development scenarios. Uses "single-violation-failure" scoring mechanism. Average of 4 runs.
- **VIBE:** Uses Claude Code as scaffolding to automatically verify interactive logic and visual effects. Unified pipeline with containerized deployment and dynamic interaction environments. Average of 3 runs.
- **Toolathlon:** Evaluation protocol consistent with original paper.
- **BrowseComp:** Same agent framework as [WebExplorer](https://arxiv.org/pdf/2509.06501) with minor tool description fine-tuning. Uses 103-sample GAIA text-only validation subset.
- **BrowseComp (context management):** When token usage exceeds 30% of the maximum context window, the agent retains the first AI response, the last five AI responses, and the tool outputs.
- **AIME25 through τ²-Bench Telecom:** Based on the evaluation datasets and methodology of the [Artificial Analysis Intelligence Index](https://artificialanalysis.ai/).
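
The BrowseComp (context management) rule above can be sketched roughly as follows. This is a minimal illustration, not the evaluation harness itself: `count_tokens` is a hypothetical word-count stand-in for a real tokenizer, `prune_history` is an invented name, and the sketch assumes that "tool outputs" means every tool message is kept while intermediate assistant messages are dropped.

```python
from typing import Dict, List

# Each message is {"role": "system" | "user" | "assistant" | "tool", "content": str}.
Message = Dict[str, str]


def count_tokens(messages: List[Message]) -> int:
    """Hypothetical stand-in for a tokenizer: counts whitespace-separated words."""
    return sum(len(m["content"].split()) for m in messages)


def prune_history(
    messages: List[Message],
    max_context_tokens: int,
    threshold: float = 0.30,
    keep_last: int = 5,
) -> List[Message]:
    """Drop intermediate assistant messages once usage exceeds the threshold."""
    if count_tokens(messages) <= threshold * max_context_tokens:
        return messages  # under budget: leave the history untouched

    assistant_idx = [i for i, m in enumerate(messages) if m["role"] == "assistant"]
    # Keep the first assistant response and the last `keep_last` ones.
    keep = set(assistant_idx[:1]) | set(assistant_idx[-keep_last:])
    # Assumption: system, user, and tool messages are always retained.
    return [
        m for i, m in enumerate(messages)
        if m["role"] != "assistant" or i in keep
    ]


# Usage: eight assistant/tool rounds against a deliberately small context window.
history: List[Message] = [
    {"role": "system", "content": "sys"},
    {"role": "user", "content": "find x"},
]
for step in range(8):
    history.append({"role": "assistant", "content": f"step {step} reasoning text"})
    history.append({"role": "tool", "content": f"result {step}"})

pruned = prune_history(history, max_context_tokens=100)
print(len(history), "->", len(pruned))
```

A real agent would call such a function before each model request and would measure usage with the model's own tokenizer rather than a word count.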

</details>

**Evaluation Data Collection:** Hybrid: Automated, Human  
**Evaluation Labeling:** Hybrid: Automated, Human  
**Evaluation Properties:** See Evaluation Methodology Notes above for detailed testing conditions per benchmark.

## Inference
**Acceleration Engine:** SGLang  
**Test Hardware:** 4x H100

## Additional Details

### Recommended Inference Parameters
- **Temperature:** 1.0
- **Top-p:** 0.95
- **Top-k:** 40
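
To illustrate how these three parameters interact, here is a minimal pure-Python sketch of temperature, top-k, and top-p (nucleus) filtering over a toy logit table. The function names and the order of application (temperature, then top-k, then top-p, renormalizing between steps) are assumptions for illustration; serving engines may order or implement these filters differently.

```python
import math
import random
from typing import Dict, Optional


def filter_distribution(
    logits: Dict[str, float],
    temperature: float = 1.0,
    top_p: float = 0.95,
    top_k: int = 40,
) -> Dict[str, float]:
    """Return the renormalized distribution left after the three filters.

    Assumes temperature > 0.
    """
    # Temperature: divide logits before softmax (1.0 leaves them unchanged).
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(weights.values())
    ranked = sorted(((w / total, tok) for tok, w in weights.items()), reverse=True)

    # Top-k: keep only the k most likely tokens, then renormalize.
    ranked = ranked[:top_k]
    norm_k = sum(p for p, _ in ranked)
    ranked = [(p / norm_k, tok) for p, tok in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for p, tok in ranked:
        kept.append((p, tok))
        cumulative += p
        if cumulative >= top_p:
            break

    norm_p = sum(p for p, _ in kept)
    return {tok: p / norm_p for p, tok in kept}


def sample(logits: Dict[str, float], rng: Optional[random.Random] = None, **params) -> str:
    """Draw one token from the filtered distribution."""
    dist = filter_distribution(logits, **params)
    rng = rng or random.Random()
    return rng.choices(list(dist), weights=list(dist.values()), k=1)[0]


# Toy vocabulary: with the recommended settings the very unlikely token "d" is cut.
toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0, "d": -5.0}
print(sorted(filter_distribution(toy_logits)))
```

With a temperature of 1.0 the logits pass through unscaled, so in this sketch the recommended settings shape the output only through the top-k and top-p cutoffs.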

### Default System Prompt
```
You are a helpful assistant. Your name is MiniMax-M2.1 and is built by MiniMax.
```

### Tool Calling
MiniMax-M2.1 supports tool calling capabilities. Refer to the [Tool Calling Guide](https://huggingface.co/MiniMaxAI/MiniMax-M2.1) for implementation details.
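
As a rough sketch of what the client side of tool calling can look like, the example below defines one tool in the OpenAI-style function schema and dispatches a tool call locally. It assumes the endpoint accepts an OpenAI-compatible `tools` parameter and returns `tool_calls` in that format, which this card does not confirm. `get_weather`, `REGISTRY`, `run_tool_calls`, and the hand-written `mock_message` are hypothetical and exist only for illustration; no request is sent.

```python
import json
from typing import Any, Callable, Dict, List


def get_weather(city: str) -> str:
    """Hypothetical tool used only for illustration."""
    return f"Sunny in {city}"


# Tool description in the OpenAI-style function schema.
TOOLS: List[Dict[str, Any]] = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Maps tool names from the schema to local Python callables.
REGISTRY: Dict[str, Callable[..., str]] = {"get_weather": get_weather}


def run_tool_calls(assistant_message: Dict[str, Any]) -> List[Dict[str, str]]:
    """Execute each requested call and build the `tool` role replies."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        fn = REGISTRY[call["function"]["name"]]
        args = json.loads(call["function"]["arguments"])  # arguments arrive as a JSON string
        replies.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": fn(**args),
        })
    return replies


# Hand-written stand-in for an assistant message that requests one tool call.
mock_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
}
print(run_tool_calls(mock_message))
```

In a live exchange, `TOOLS` would be passed as the `tools` argument of `client.chat.completions.create`, and the replies would be appended to the message list before the next request. Consult the linked guide for the format the model actually expects.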

### Deployment Options
- **API Access:** Available via [MiniMax Open Platform](https://platform.minimax.io/docs/guides/text-generation)
- **MiniMax Agent:** Production deployment available at [agent.minimax.io](https://agent.minimax.io/)
- **Local Deployment:** Supported via SGLang, vLLM, or Transformers

### Known Capabilities
- Multilingual software development
- Complex multi-step office workflows
- Long-horizon planning
- Tool use and function calling
- Code generation and review
- Test case generation
- Code performance optimization

## Ethical Considerations

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Please report model quality, risk, security vulnerabilities or NVIDIA AI Concerns [here](https://www.nvidia.com/en-us/support/submit-security-vulnerability/).

## Prototype

```python
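import os

from openai import OpenAI

# Read the key from the NVIDIA_API_KEY environment variable.
client = OpenAI(
    base_url="https://integrate.api.nvidia.com/v1",
    api_key=os.environ["NVIDIA_API_KEY"],
)

completion = client.chat.completions.create(
    # Model ID assumed from this page's publisher and title; confirm it in the API catalog.
    model="minimaxai/minimax-m2.1",
    messages=[{"role": "user", "content": "Write a Python function that reverses a string."}],
    temperature=1.0,  # recommended value from this card
    top_p=0.95,       # recommended value from this card
    max_tokens=1024,  # illustrative limit; adjust as needed
    stream=False,     # return one complete response instead of chunks
)

print(completion.choices[0].message.content)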
```

```python
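import os

from langchain_nvidia_ai_endpoints import ChatNVIDIA

client = ChatNVIDIA(
    # Model ID assumed from this page's publisher and title; confirm it in the API catalog.
    model="minimaxai/minimax-m2.1",
    api_key=os.environ["NVIDIA_API_KEY"],  # read the key from the environment
    temperature=1.0,  # recommended value from this card
    top_p=0.95,       # recommended value from this card
    max_tokens=1024,  # illustrative limit; adjust as needed
)

response = client.invoke([{"role": "user", "content": "Write a Python function that reverses a string."}])
print(response.content)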
```

```javascript
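import OpenAI from 'openai';

const openai = new OpenAI({
  apiKey: process.env.NVIDIA_API_KEY, // read the key from the environment
  baseURL: 'https://integrate.api.nvidia.com/v1',
});

async function main() {
  const completion = await openai.chat.completions.create({
    // Model ID assumed from this page's publisher and title; confirm it in the API catalog.
    model: 'minimaxai/minimax-m2.1',
    messages: [{ role: 'user', content: 'Write a Python function that reverses a string.' }],
    temperature: 1.0, // recommended value from this card
    top_p: 0.95, // recommended value from this card
    max_tokens: 1024, // illustrative limit; adjust as needed
    stream: false, // return one complete response instead of chunks
  });

  // Fall back to an empty string so write() never receives undefined.
  process.stdout.write(completion.choices[0]?.message?.content ?? '');
}

main();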
```

```bash
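invoke_url='https://integrate.api.nvidia.com/v1/chat/completions'

# Expects the key in the NVIDIA_API_KEY environment variable.
authorization_header="Authorization: Bearer $NVIDIA_API_KEY"
accept_header='Accept: application/json'
content_type_header='Content-Type: application/json'

# Model ID assumed from this page's publisher and title; confirm it in the API catalog.
# temperature and top_p are the values recommended in this card; max_tokens is illustrative.
data=$'{
  "model": "minimaxai/minimax-m2.1",
  "messages": [
    {
      "role": "user",
      "content": "Write a Python function that reverses a string."
    }
  ],
  "temperature": 1.0,
  "top_p": 0.95,
  "max_tokens": 1024,
  "stream": false
}'

# -i includes the response headers; -w appends the HTTP status code on a final line.
response=$(curl --silent -i -w "\n%{http_code}" --request POST \
  --url "$invoke_url" \
  --header "$authorization_header" \
  --header "$accept_header" \
  --header "$content_type_header" \
  --data "$data"
)

echo "$response"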
```