LLaMA Factory

Estimated time: 1 hour

Install and fine-tune models with LLaMA Factory

Verify system prerequisites

Check that your NVIDIA Spark system has the CUDA toolkit, the NVIDIA driver, Python 3, and Git installed and available on your PATH.

nvcc --version
nvidia-smi
python3 --version
git --version
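
The four checks above can also be run in one pass. The following is a minimal sketch, not part of LLaMA Factory: check_tools is a hypothetical helper name, and it only confirms that each command is on your PATH, not that its version is suitable.

```shell
# Report which of the given commands are on PATH.
# Returns 0 if all are found, 1 if any is missing.
check_tools() {
  local tool missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "ok       $tool"
    else
      echo "MISSING  $tool"
      missing=1
    fi
  done
  return "$missing"
}

# The four tools this guide relies on.
check_tools nvcc nvidia-smi python3 git || echo "Install the missing tools before continuing."
```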

Create and activate a Python virtual environment

Create a virtual environment and activate it for the LLaMA Factory installation.

python3 -m venv factoryEnv
source ./factoryEnv/bin/activate
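
To confirm that activation worked before installing anything, you can inspect the VIRTUAL_ENV variable, which the venv activate script sets. This is a small sketch; venv_status is a hypothetical helper name.

```shell
# Print the active virtual environment, or a notice if none is active.
# VIRTUAL_ENV is exported by the venv "activate" script.
venv_status() {
  if [ -n "${VIRTUAL_ENV:-}" ]; then
    echo "active: $VIRTUAL_ENV"
  else
    echo "no virtual environment is active"
  fi
}

venv_status
```

After a successful activation, the printed path should end in factoryEnv.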

Install PyTorch with CUDA 13 support

Install PyTorch, torchvision, and torchaudio with CUDA 13.0 support from the official PyTorch index.

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu130

Verify PyTorch CUDA support

Confirm that PyTorch can see the GPU. The command should print the installed PyTorch version followed by CUDA: True.

python -c "import torch; print(f'PyTorch: {torch.__version__}, CUDA: {torch.cuda.is_available()}')"

Clone LLaMA Factory repository

Download the LLaMA Factory source code from the official repository.

git clone --depth 1 https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory

Install LLaMA Factory with dependencies

Install LLaMA Factory in editable mode with metrics support.

pip install -e ".[metrics]"

Prepare training configuration

Examine the provided LoRA fine-tuning configuration for Qwen3.

cat examples/train_lora/qwen3_lora_sft.yaml
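
If you only want the settings that most affect a run, you can filter the file instead of reading all of it. The sketch below assumes the configuration uses top-level keys such as model_name_or_path and output_dir; those key names are assumptions, so adjust the pattern to match what cat shows. show_config_keys is a hypothetical helper name.

```shell
# Print selected top-level keys from a YAML config file.
# The key names in the pattern are assumptions; edit them to match your file.
show_config_keys() {
  grep -E '^(model_name_or_path|dataset|output_dir|learning_rate|num_train_epochs|per_device_train_batch_size|gradient_accumulation_steps):' "$1"
}

config=examples/train_lora/qwen3_lora_sft.yaml
if [ -f "$config" ]; then
  show_config_keys "$config"
else
  echo "config not found: $config (run this from the LLaMA-Factory directory)"
fi
```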

Launch fine-tuning training

NOTE

If the model is gated, log in to the Hugging Face Hub before training so that the model can be downloaded.

Run training with the provided LoRA configuration.

hf auth login   # if the model is gated
llamafactory-cli train examples/train_lora/qwen3_lora_sft.yaml

Example output:

***** train metrics *****
  epoch                    =        3.0
  total_flos               = 11076559GF
  train_loss               =     0.9993
  train_runtime            = 0:14:32.12
  train_samples_per_second =      3.749
  train_steps_per_second   =      0.471
Figure saved at: saves/qwen3-4b/lora/sft/training_loss.png
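
The example metrics can be cross-checked with simple arithmetic: runtime multiplied by steps per second approximates the total number of optimizer steps, and samples per second divided by steps per second approximates the number of samples per step. The sketch below uses only the values from the example output. Your own numbers will differ.

```shell
# Values copied from the example output.
runtime_s=$(awk 'BEGIN { print 14*60 + 32.12 }')   # 0:14:32.12 in seconds
steps_per_s=0.471
samples_per_s=3.749

# Estimated optimizer steps = runtime x steps per second.
total_steps=$(awk -v t="$runtime_s" -v r="$steps_per_s" 'BEGIN { printf "%.0f", t * r }')

# Estimated samples per step = samples per second / steps per second.
samples_per_step=$(awk -v a="$samples_per_s" -v b="$steps_per_s" 'BEGIN { printf "%.0f", a / b }')

echo "approx. total steps: $total_steps"            # prints 411
echo "approx. samples per step: $samples_per_step"  # prints 8
```

A result of about 8 samples per step suggests an effective batch size of 8 for this example run.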

Validate training completion

Verify that training completed successfully and checkpoints were saved.

ls -la saves/qwen3-4b/lora/sft/

The directory listing should include:

  • A final checkpoint directory (checkpoint-411 or similar)
  • The adapter configuration file (adapter_config.json)
  • Training metrics files (such as train_results.json)
  • The training loss plot (training_loss.png), which should show the loss decreasing over the run
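
The same check can be scripted instead of read off the listing. This is a minimal sketch that tests only for the items named in this guide: adapter_config.json, training_loss.png, and at least one checkpoint-* directory. check_run_dir is a hypothetical helper name.

```shell
# Check that the expected training artifacts exist in an output directory.
# Returns 0 only if every check passes.
check_run_dir() {
  local dir="$1" failed=0
  [ -f "$dir/adapter_config.json" ] || { echo "missing: adapter_config.json"; failed=1; }
  [ -f "$dir/training_loss.png" ]   || { echo "missing: training_loss.png"; failed=1; }
  # At least one checkpoint-* directory should be present.
  ls -d "$dir"/checkpoint-*/ >/dev/null 2>&1 || { echo "missing: checkpoint-* directory"; failed=1; }
  if [ "$failed" -eq 0 ]; then
    echo "all expected artifacts found in $dir"
  fi
  return "$failed"
}

check_run_dir saves/qwen3-4b/lora/sft || echo "training may not have completed"
```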

Test inference with fine-tuned model

Test your fine-tuned model with custom prompts:

llamafactory-cli chat examples/inference/qwen3_lora_sft.yaml
# Type: "Hello, how can you help me today?"
# Expect: Response showing fine-tuned behavior

For production deployment, export your model by merging the LoRA adapter into the base model:

llamafactory-cli export examples/merge_lora/qwen3_lora_sft.yaml

Cleanup and rollback

WARNING

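The commands below permanently delete the cloned repository, including all training progress and checkpoints under saves/, along with the virtual environment.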

To remove the virtual environment and cloned repository:

deactivate
cd ..
rm -rf LLaMA-Factory/
rm -rf factoryEnv/
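
If you might want the checkpoints later, archive them before running the removal commands above. The following is a sketch only: archive_dir is a hypothetical helper, the archive file name is an arbitrary example, and it assumes you run it from the directory that contains LLaMA-Factory/.

```shell
# Archive a directory into a gzip-compressed tarball.
# Usage: archive_dir <source_dir> <archive_path>
archive_dir() {
  local src="$1" dest="$2"
  if [ ! -d "$src" ]; then
    echo "nothing to archive: $src"
    return 1
  fi
  # -C keeps paths inside the archive relative to the parent directory.
  tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")"
  echo "archived $src to $dest"
}

# The archive name here is a hypothetical example.
archive_dir LLaMA-Factory/saves llamafactory-saves-backup.tar.gz || true
```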