LLaMA Factory
Install and fine-tune models with LLaMA Factory
Verify system prerequisites
Check that your NVIDIA Spark system has the required components installed and accessible.
nvcc --version
docker --version
nvidia-smi
python --version
git --version
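The same checks can be scripted so that a missing tool is reported by name. The sketch below is a convenience and not part of LLaMA Factory: `find_missing_tools` is a hypothetical helper, and it only confirms that each command is on `PATH`, not that its version is adequate.

```python
import shutil

# Commands checked in the step above.
REQUIRED_TOOLS = ["nvcc", "docker", "nvidia-smi", "python", "git"]


def find_missing_tools(tools):
    """Return the subset of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All prerequisite commands were found.")
```

If anything is reported missing, install it or fix your `PATH` before continuing.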
Launch PyTorch container with GPU support
Start the NVIDIA PyTorch container with GPU access and mount your workspace directory.
NOTE
This NVIDIA PyTorch container supports CUDA 13.
docker run --gpus all --ipc=host --ulimit memlock=-1 -it --ulimit stack=67108864 --rm -v "$PWD":/workspace nvcr.io/nvidia/pytorch:25.09-py3 bash
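To make the flags in that command easier to read, the sketch below rebuilds the same argument list in Python with one comment per flag. It only prints the command and does not start a container. Using the current working directory in place of `$PWD` is an assumption about where you run it.

```python
import os
import shlex

# 67108864 bytes is exactly 64 MiB.
STACK_LIMIT_BYTES = 64 * 1024 * 1024

docker_cmd = [
    "docker", "run",
    "--gpus", "all",                          # expose all host GPUs to the container
    "--ipc=host",                             # share the host IPC namespace (shared memory)
    "--ulimit", "memlock=-1",                 # no limit on locked (pinned) memory
    "-it",                                    # interactive terminal
    "--ulimit", f"stack={STACK_LIMIT_BYTES}", # 64 MiB stack size limit
    "--rm",                                   # remove the container when it exits
    "-v", f"{os.getcwd()}:/workspace",        # mount the current directory at /workspace
    "nvcr.io/nvidia/pytorch:25.09-py3",
    "bash",
]

# Print a shell-safe version of the command for inspection.
print(shlex.join(docker_cmd))
```

Because of `--rm`, anything written outside the mounted `/workspace` directory is lost when the container exits.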
Clone LLaMA Factory repository
Download the LLaMA Factory source code from the official repository.
git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
Install LLaMA Factory with dependencies
Install the package in editable mode with metrics support for training evaluation.
pip install -e ".[metrics]"
Verify PyTorch CUDA support
PyTorch is pre-installed with CUDA support.
To verify installation:
python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda.is_available()}')"
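For a slightly more detailed report than the one-liner gives, a sketch along these lines also lists the CUDA version PyTorch was built against and the visible device names. The function name `cuda_report` is hypothetical. It falls back to an empty report when PyTorch is not importable, for example when run outside the container.

```python
def cuda_report():
    """Summarize the PyTorch build and the CUDA devices it can see."""
    try:
        import torch
    except ImportError:
        # Outside the container PyTorch may not be installed at all.
        return {"torch": None, "cuda_build": None, "cuda_available": False, "devices": []}

    available = torch.cuda.is_available()
    devices = []
    if available:
        devices = [torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())]
    return {
        "torch": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": available,
        "devices": devices,
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```

An empty `devices` list inside the container usually means the container was started without `--gpus all`.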
Prepare training configuration
Examine the provided LoRA fine-tuning configuration for Llama-3.
cat examples/train_lora/llama3_lora_sft.yaml
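If you prefer to inspect the settings programmatically, the sketch below reads the file into a dictionary. It assumes the configuration consists only of top-level `key: value` lines plus comments, and it keeps every value as a string. `read_flat_yaml` is a hypothetical helper. For nested or multi-line YAML, PyYAML's `yaml.safe_load` is the more robust choice.

```python
from pathlib import Path


def read_flat_yaml(path):
    """Parse top-level `key: value` lines, skipping blanks and comments.

    Assumption: no nesting, no multi-line values, no '#' inside values.
    """
    settings = {}
    for raw_line in Path(path).read_text().splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    config_path = Path("examples/train_lora/llama3_lora_sft.yaml")
    if config_path.exists():
        for key, value in read_flat_yaml(config_path).items():
            print(f"{key} = {value}")
    else:
        print(f"{config_path} not found; run this from the LLaMA-Factory directory.")
```

Printing the settings this way makes it easy to confirm the output directory and epoch count before a long training run.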
Launch fine-tuning training
NOTE
If the model is gated, log in to the Hugging Face Hub before training so that the model can be downloaded.
Execute the training process using the pre-configured LoRA setup.
huggingface-cli login # if the model is gated
llamafactory-cli train examples/train_lora/llama3_lora_sft.yaml
Example output:
***** train metrics *****
epoch = 3.0
total_flos = 22851591GF
train_loss = 0.9113
train_runtime = 0:22:21.99
train_samples_per_second = 2.437
train_steps_per_second = 0.306
Figure saved at: saves/llama3-8b/lora/sft/training_loss.png
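The throughput figures in that output can be cross-checked with simple arithmetic. The sketch below uses only the values from the example output. Because those values are rounded, the results are approximate. Reading "samples per step" as the effective batch size assumes the metrics follow the usual Hugging Face Trainer definitions.

```python
def runtime_to_seconds(runtime):
    """Convert an 'H:MM:SS.ss' string such as '0:22:21.99' to seconds."""
    hours, minutes, seconds = runtime.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


# Values copied from the example output above.
train_runtime = runtime_to_seconds("0:22:21.99")
samples_per_second = 2.437
steps_per_second = 0.306
epochs = 3.0

total_samples = train_runtime * samples_per_second
total_steps = train_runtime * steps_per_second
samples_per_step = samples_per_second / steps_per_second

print(f"runtime: {train_runtime:.2f} s")
print(f"samples processed: ~{total_samples:.0f} (~{total_samples / epochs:.0f} per epoch)")
print(f"training steps: ~{total_steps:.0f}")
print(f"samples per step: ~{samples_per_step:.2f}")
```

For the example run this works out to roughly 1342 seconds of training and about 8 samples per step. Your own numbers will differ with hardware, dataset size, and batch settings.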
Validate training completion
Verify that training completed successfully and checkpoints were saved.
ls -la saves/llama3-8b/lora/sft/
Expected output should show:
- Final checkpoint directory (checkpoint-21 or similar)
- Model configuration files (config.json, adapter_config.json)
- Training metrics showing decreasing loss values
- Training loss plot saved as PNG file
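The checklist above can also be checked by a script. This is a minimal sketch: `missing_training_outputs` is a hypothetical helper, and the default file names are taken from the list and the example output above, so adjust them if your run produces different artifacts.

```python
from pathlib import Path


def missing_training_outputs(output_dir,
                             required_files=("adapter_config.json", "training_loss.png")):
    """Return descriptions of expected artifacts not found in output_dir."""
    root = Path(output_dir)
    missing = [name for name in required_files if not (root / name).is_file()]
    # Checkpoints are saved as directories named checkpoint-<step>.
    if not any(p.is_dir() for p in root.glob("checkpoint-*")):
        missing.append("checkpoint-* directory")
    return missing


if __name__ == "__main__":
    problems = missing_training_outputs("saves/llama3-8b/lora/sft")
    if problems:
        print("Missing:", ", ".join(problems))
    else:
        print("All expected outputs found.")
```

An empty result only shows that the files exist. Check the loss plot or the training log to confirm that the loss actually decreased.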
Test inference with fine-tuned model
Test your fine-tuned model with custom prompts:
llamafactory-cli chat examples/inference/llama3_lora_sft.yaml
# Type: "Hello, how can you help me today?"
# Expect: Response showing fine-tuned behavior
For production deployment, export your model:
llamafactory-cli export examples/merge_lora/llama3_lora_sft.yaml
Cleanup and rollback
WARNING
This will delete all training progress and checkpoints.
To remove all generated files and free up storage space:
cd /workspace
rm -rf LLaMA-Factory/
docker system prune -f
To roll back Docker container changes:
exit # Exit container
docker container prune -f
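Since the cleanup above is irreversible, it can help to see how much data is about to be removed before running it. The sketch below totals the file sizes under the repository directory. `directory_size_bytes` is a hypothetical helper, and the `/workspace/LLaMA-Factory` path assumes the layout used in this guide.

```python
import os


def directory_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(path):
        for name in filenames:
            file_path = os.path.join(dirpath, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


if __name__ == "__main__":
    size = directory_size_bytes("/workspace/LLaMA-Factory")
    print(f"LLaMA-Factory/ currently holds {size / 1024**3:.2f} GiB")
```

Copy any checkpoints or exported models you want to keep out of that directory before deleting it.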