
State-of-the-art open model trained on open datasets, excelling in reasoning, math, and science.
Marin 8B Instruct is a Transformer-style autoregressive language model, fine-tuned from marin-8b-base, designed to follow instructions and engage in dialogue. This model is intended for tasks such as question answering, summarization, code generation, and dialogue.
dlwh at stanford.eduThis model is ready for non-commercial/research use.
This model is not owned or developed by NVIDIA. This model has been developed and built to a third-party's requirements for this application and use case; see link to Non-NVIDIA marin-8b-instruct Model Card.
GOVERNING TERMS: This trial service is governed by the NVIDIA API Trial Terms of Service. Use of this model is governed by the NVIDIA Community Model License. Additional Information: Apache 2.0.
Global
The Marin 8B Instruct model is designed for tasks requiring instruction comprehension and generation, such as question answering, summarization, code generation, and dialogue. It is positioned as a research artifact or a foundational instruct model upon which others can build and implement their own safety protocols.
Our Al models are designed and/or optimized to run on NVIDIA GPU-accelerated systems. By leveraging NVIDIA's hardware (e.g. GPU cores) and software frameworks (e.g., CUDA libraries), the model achieves faster training and inference times compared to CPU-only solutions.
[Preferred/Supported] Operating System(s):
marin-8b-instruct v1.0
The Marin-8b-Instruct model was adapted from marin-8b-base through Supervised Fine-Tuning (SFT) for an additional 5.3 billion tokens.
A full report is available on our ReadTheDocs site.
Marin 8B Instruct is currently an SFT-only model. It was trained on the following datasets:
We ran a suite of standard benchmarks to compare our model with Llama 3.1 8B, and the open source 7-8B models Olmo 2 7B, and MAP NEO 7B. For all benchmarks, we used LM Eval Harness with the default setup for each task. (These numbers may differ from reported results due to differences in setup. LM Eval Harness is usually somewhat stricter than other harnesses.)
| Average | AGI Eval LSAT-AR | ARC Easy | ARC Challenge | BBH | BoolQ | CommonSense QA | COPA | GPQA | HellaSwag 0-shot | HellaSwag 10-shot | lambada_openai | MMLU 5-shot | MMLU 0-shot | MMLU Pro | OpenBookQA | PIQA | WinoGrande | WSC | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Marin 8B Base (Starling) | 68.3 | 20.9 | 86.5 | 63.1 | 50.6 | 85.9 | 79.1 | 92.0 | 30.3 | 82.3 | 83.6 | 74.7 | 67.6 | 65.9 | 36.5 | 44.2 | 84.4 | 74.5 | 82.1 |
| Llama 3.1 Base | 67.0 | 20.4 | 85.8 | 58.9 | 46.4 | 84.2 | 75.2 | 92.0 | 32.3 | 79.4 | 81.9 | 74.7 | 66.4 | 65.5 | 33.3 | 45.8 | 82.9 | 74.4 | 83.5 |
| OLMo 2 Base | 66.7 | 17.4 | 85.0 | 60.7 | 44.4 | 85.5 | 75.4 | 89.0 | 26.8 | 80.5 | 81.7 | 73.1 | 63.9 | 61.9 | 30.6 | 46.2 | 82.5 | 74.3 | 86.1 |
| MAP NEO 7B | 62.2 | 23.0 | 81.1 | 52.0 | 42.4 | 84.7 | 81.7 | 82.0 | 27.8 | 72.5 | 73.3 | 64.6 | 58.2 | 56.4 | TODO | 39.4 | 79.0 | 66.1 | 73.3 |
Marin 8B Base fares well on most tasks.
stanford-crfm/levanter training framework, which uses JAX and Named Tensors.stanford-crfm/marin-tokenizer (variant of Llama 3 tokenizer).Like any base language model or fine-tuned model without safety filtering, these models can easily be prompted by users to generate harmful and sensitive content. Such content may also be produced unintentionally, especially in cases involving bias, so we recommend that users consider the risks when applying this technology. Additionally, many statements from Marin or any LLM are often inaccurate, so responses should be verified.
Marin 8B has not undergone any safety tuning or evaluation. We strongly recommend that users use this model with caution and consider the risks when applying this technology. In particular, this model is not intended for fully autonomous use.
NVIDIA believes Trustworthy Al is a shared responsibility and we have established policies and practices to enable development for a wide array of Al applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse. Please report security vulnerabilities or NVIDIA AI Concerns here.