A state-of-the-art model for language understanding, superior reasoning, and text generation.

The following system requirements and supported features apply when self-hosting the llama-3.1-8b-instruct model.

| Precision | Minimum GPU Memory | Recommended GPU Memory |
|---|---|---|
| bf16 | 16 GB | 33 GB |
| fp8 | 8 GB | 16 GB |
Deploying this NIM with less than the recommended amount of GPU memory requires setting the environment variable `NIM_RELAX_MEM_CONSTRAINTS=1`.

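As a sketch, assuming the standard NIM container workflow, the variable can be passed when the container is started; the image tag, cache path, and port below are illustrative placeholders for your deployment:

```shell
# Illustrative: relax the GPU memory constraint when starting the NIM container.
# Image tag, cache mount, and port are placeholders, not prescribed values.
docker run -it --rm \
  --gpus all \
  -e NGC_API_KEY \
  -e NIM_RELAX_MEM_CONSTRAINTS=1 \
  -v "$HOME/.cache/nim:/opt/nim/.cache" \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama-3.1-8b-instruct:latest
```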
| Feature | Supported |
|---|---|
| LoRA Customization | ✅ |
| Fine-tuning Customization | ✅ |
| Tool Calling | ✅ |
| TensorRT-LLM Local Engine Building | ✅ |
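Tool calling with this NIM follows the OpenAI-compatible chat completions schema. As a minimal sketch, the request body below defines one callable tool; the function name, its parameters, and the endpoint URL in the comment are illustrative assumptions, not part of the model card:

```python
import json

# Sketch of an OpenAI-style tool-calling request body. The "get_weather"
# function and its schema are hypothetical examples for illustration.
payload = {
    "model": "meta/llama-3.1-8b-instruct",
    "messages": [
        {"role": "user", "content": "What is the weather in Paris?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",  # hypothetical tool
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"}
                    },
                    "required": ["city"],
                },
            },
        }
    ],
    "tool_choice": "auto",
}

# This body would typically be POSTed to the NIM server, e.g.:
#   POST http://localhost:8000/v1/chat/completions
body = json.dumps(payload)
```

When the model decides to call the tool, the response contains a `tool_calls` entry whose arguments match the declared JSON schema.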