alignment-handbook

wassname/alignment-handbook

mirror of https://github.com/wassname/alignment-handbook.git synced 2026-06-27 18:41:19 +08:00

Lewis Tunstall 2de17f5ba1 Add doc (2023-11-09 07:32:24 +00:00)

README.md: Add doc (2023-11-09 07:32:24 +00:00)
run_dpo.py: Make DPO work! (2023-11-08 22:58:34 +00:00)
run_sft.py: Make it work for realz (2023-11-08 22:20:17 +00:00)

README.md

Supervised Fine-Tuning (SFT)

We provide three main ways to train SFT models:

Distributed fine-tuning of all model weights with ZeRO-3
Fine-tuning with LoRA adapters and ZeRO-3
Fine-tuning with QLoRA adapters and DDP

# Full training with ZeRO-3
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml scripts/run_sft.py recipes/{model_name}/sft/config_full.yaml

# LoRA training with ZeRO-3
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml scripts/run_sft.py recipes/{model_name}/sft/config_16bit.yaml

# QLoRA training with DDP
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/multi_gpu.yaml scripts/run_sft.py recipes/{model_name}/sft/config_8bit.yaml
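
The three commands above differ only in the accelerate config and the recipe config they point at. As a rough sketch, that pairing can be written down as a lookup table. The names METHODS and build_launch_command and the model name "my-model" are hypothetical and not part of the handbook, and the ACCELERATE_LOG_LEVEL=info prefix is assumed to be set in the environment separately.

```python
import shlex

# Pairing copied from the commands above:
# method -> (accelerate config, recipe config).
METHODS = {
    "full": ("deepspeed_zero3.yaml", "config_full.yaml"),
    "lora": ("deepspeed_zero3.yaml", "config_16bit.yaml"),
    "qlora": ("multi_gpu.yaml", "config_8bit.yaml"),
}


def build_launch_command(model_name, method, task="sft"):
    """Return the accelerate launch command as an argv list (hypothetical helper)."""
    accelerate_config, recipe_config = METHODS[method]
    return [
        "accelerate",
        "launch",
        "--config_file",
        f"recipes/accelerate_configs/{accelerate_config}",
        f"scripts/run_{task}.py",
        f"recipes/{model_name}/{task}/{recipe_config}",
    ]


if __name__ == "__main__":
    # "my-model" is a placeholder for a directory under recipes/.
    print(shlex.join(build_launch_command("my-model", "qlora")))
```

Returning an argv list rather than a single string means the command can be passed to subprocess.run without shell quoting issues. Passing task="dpo" produces the corresponding scripts/run_dpo.py commands.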

You can override any parameter in a YAML config by appending it to the command as a --key=value argument, for example:

ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml scripts/run_sft.py recipes/{model_name}/sft/config_full.yaml --per_device_train_batch_size=2 --num_train_epochs=3
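
To illustrate the precedence this implies, with command-line values replacing the YAML values, here is a minimal sketch. It is not the parser the scripts actually use. The function apply_overrides is hypothetical, the dictionary stands in for an already-parsed YAML file, and its values are made up for the example.

```python
def apply_overrides(config, cli_args):
    """Return a copy of config with --key=value arguments applied on top.

    Each override is cast to the type of the value it replaces, so
    "--num_train_epochs=3" becomes the integer 3 when the YAML value is an int.
    """
    merged = dict(config)
    for arg in cli_args:
        if not arg.startswith("--") or "=" not in arg:
            raise ValueError(f"expected --key=value, got {arg!r}")
        key, raw = arg[2:].split("=", 1)
        if key not in merged:
            raise KeyError(f"unknown parameter: {key}")
        current = merged[key]
        if isinstance(current, bool):
            # bool("false") is True, so booleans are parsed by hand.
            merged[key] = raw.lower() in ("1", "true", "yes")
        elif current is None:
            # No type to copy, so keep the raw string.
            merged[key] = raw
        else:
            merged[key] = type(current)(raw)
    return merged


if __name__ == "__main__":
    # Stand-in for a parsed config_full.yaml; these values are illustrative only.
    yaml_config = {"per_device_train_batch_size": 8, "num_train_epochs": 1}
    overrides = ["--per_device_train_batch_size=2", "--num_train_epochs=3"]
    print(apply_overrides(yaml_config, overrides))
```

The sketch rejects unknown keys instead of silently adding them, which is one way to catch a mistyped parameter name before a long training run starts.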

Direct Preference Optimisation (DPO)

# Full training with ZeRO-3
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml scripts/run_dpo.py recipes/{model_name}/dpo/config_full.yaml

# LoRA training with ZeRO-3
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml scripts/run_dpo.py recipes/{model_name}/dpo/config_16bit.yaml

# QLoRA training with DDP
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/multi_gpu.yaml scripts/run_dpo.py recipes/{model_name}/dpo/config_8bit.yaml