llama-3.3-70b-instruct Model by Meta

The following are key system requirements and supported features to consider when self-hosting the llama-3.3-70b-instruct model.

GPU Memory Requirements

Precision	Minimum GPU Memory	Recommended GPU Memory
bf16	138 GB	180 GB
fp8	69 GB	90 GB

Deploying this NIM with less than the recommended amount of GPU memory requires setting the environment variable NIM_RELAX_MEM_CONSTRAINTS=1.
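The table and the note above can be read as a simple pre-deployment check. The following is a minimal sketch, not part of the NIM tooling: the helper names `memory_check` and `extra_env` are hypothetical, the thresholds are copied from the table, and it assumes the figures refer to total GPU memory across all GPUs used for the deployment. Treating anything below the minimum as unusable is also an assumption, since the source only states what is required below the recommended amount.

```python
# GPU memory thresholds in GB, copied from the table above.
GPU_MEMORY_GB = {
    "bf16": {"minimum": 138, "recommended": 180},
    "fp8": {"minimum": 69, "recommended": 90},
}


def memory_check(precision: str, available_gb: float) -> str:
    """Classify total available GPU memory against the table (hypothetical helper).

    Returns "insufficient" below the minimum (an assumption of this sketch),
    "relax" between the minimum and the recommended amount, and "ok" at or
    above the recommended amount.
    """
    limits = GPU_MEMORY_GB[precision]
    if available_gb < limits["minimum"]:
        return "insufficient"
    if available_gb < limits["recommended"]:
        return "relax"
    return "ok"


def extra_env(precision: str, available_gb: float) -> dict:
    """Return the environment variable needed below the recommended amount."""
    if memory_check(precision, available_gb) == "relax":
        return {"NIM_RELAX_MEM_CONSTRAINTS": "1"}
    return {}


if __name__ == "__main__":
    # Example: 160 GB of total GPU memory with bf16 falls between 138 and 180 GB.
    print(memory_check("bf16", 160))
    print(extra_env("bf16", 160))
```

With 160 GB available for bf16, the sketch reports that the deployment falls between the minimum and the recommended amount, so the variable from the note above would need to be set.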

Supported Features

Feature	Supported
LoRA Customization	✅
Fine-tuning Customization	✅
Tool Calling	✅
TensorRT-LLM Local Engine Building	✅