Refactor imports

This commit is contained in:
Lewis Tunstall
2023-11-10 13:38:45 +00:00
parent 7f1a14e0d4
commit e2e8ab945d
6 changed files with 58 additions and 18 deletions
+39 -4
View File
@@ -1,7 +1,7 @@
## Scripts to Train and Evaluate Chat Models
# Scripts to Train and Evaluate Chat Models
### Fine-tuning
## Fine-tuning
In the handbook, we provide three main ways to align LLMs for chat:
@@ -47,7 +47,7 @@ By default all training metrics are logged with TensorBoard. If you have a [Weig
ACCELERATE_LOG_LEVEL=info accelerate launch --config_file recipes/accelerate_configs/deepspeed_zero3.yaml scripts/run_{task}.py recipes/{model_name}/{task}/config_full.yaml --report_to=wandb
```
### Launching jobs on a Slurm cluster
## Launching jobs on a Slurm cluster
If you have access to a Slurm cluster, we provide a `recipes/launch.slurm` script that will automatically queue training jobs for you. Here's how you can use it:
@@ -63,4 +63,39 @@ sbatch --job-name=handbook_sft --nodes=1 recipes/launch.slurm zephyr-7b-beta sft
You can scale the number of nodes by increasing the `--nodes` flag.
**⚠️ Note:** the configuration in `recipes/launch.slurm` is optimised for the Hugging Face Compute Cluster and may require tweaking to be adapted to your own compute nodes.
**⚠️ Note:** the configuration in `recipes/launch.slurm` is optimised for the Hugging Face Compute Cluster and may require tweaking to be adapted to your own compute nodes.
## Fine-tuning on custom datasets
Under the hood, each training script uses the `get_datasets()` function which allows one to easily combing multiple datasets with varying proportions. For instance, this is how one can specify multiple datasets and which splits to combine in one of the YAML configs:
```yaml
datasets_mixer:
dataset_1: 0.5 # Use 50% of the training examples
dataset_2: 0.66 # Use 66% of the training examples
dataset_3: 0.10 # Use 10% of the training examples
dataset_splits:
- train_x # Samples from each train split
- test_x # Test splits aren't sampled
```
If you want to fine-tune on your own datasets, the main thing to keep in mind is how the chat templates are applied to the dataset blend. Since each task (SFT, DPO, etc), requires a different format, we assume the datasets have the following columns:
**SFT**
* `messages`: A list of `dicts` in the form `{"role": "{role}", "content": {content}}`.
* See [ultrachat_200k](https://huggingface.co/datasets/HuggingFaceH4/ultrachat_200k) for an example.
**DPO**
* `chosen`: A list of `dicts` in the form `{"role": "{role}", "content": {content}}` corresponding to the preferred dialogue.
* `rejected`: A list of `dicts` in the form `{"role": "{role}", "content": {content}}` corresponding to the dispreferred dialogue.
* See [ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized) for an example.
We also find it useful to include dedicated splits per task in our datasets, so e.g. we have:
* `{train,test}_sft`: Splits for SFT training.
* `{train,test}_gen`: Splits for generation ranking like rejection sampling or PPO.
* `{train,test}_prefs`: Splits for preference modelling, like reward modelling or DPO.
If you format your dataset in the same way, our training scripts should work out of the box!
+12 -9
View File
@@ -18,7 +18,7 @@ import sys
import torch
import transformers
from transformers import set_seed
from transformers import AutoModelForCausalLM, set_seed
from accelerate import Accelerator
from alignment import (
@@ -32,11 +32,11 @@ from alignment import (
get_peft_config,
get_quantization_config,
get_tokenizer,
is_adapter_model,
)
from trl import DPOTrainer
from transformers import AutoModelForCausalLM
from alignment.model_utils import is_adapter_model
from peft import PeftConfig, PeftModel
from trl import DPOTrainer
logger = logging.getLogger(__name__)
@@ -114,15 +114,15 @@ def main():
device_map=get_kbit_device_map(),
quantization_config=get_quantization_config(model_args),
)
model = model_args.model_name_or_path
if is_adapter_model(model, model_args.model_revision):
# load the model, merge the adapter weights and unload the adapter
# Note: to run QLora, you will need to merge the based model separately as the merged model in 16bit
logger.info(f"Merging peft adapters for {model_args.model_name_or_path=}")
peft_config = PeftConfig.from_pretrained(model_args.model_name_or_path, revision=model_args.model_revision)
model_kwargs = dict(
revision=model_args.base_model_revision,
trust_remote_code=model_args.trust_remote_code,
@@ -131,9 +131,12 @@ def main():
use_cache=False if training_args.gradient_checkpointing else True,
)
base_model = AutoModelForCausalLM.from_pretrained(
peft_config.base_model_name_or_path, **model_kwargs,
peft_config.base_model_name_or_path,
**model_kwargs,
)
model = PeftModel.from_pretrained(
base_model, model_args.model_name_or_path, revision=model_args.model_revision
)
model = PeftModel.from_pretrained(base_model, model_args.model_name_or_path, revision=model_args.model_revision)
model.eval()
model = model.merge_and_unload()
model_kwargs = None