alignment-handbook

mirror of https://github.com/wassname/alignment-handbook.git synced 2026-06-27 18:22:17 +08:00

Author	SHA1	Message	Date
Kashif Rasul	444e0f8414	Update README.md (#184 ) fix formatting	2024-07-30 11:05:50 +02:00
Kashif Rasul	98563353d7	CITATION.cff and fix F401 warning (#183 ) * fix F401 warning * add CITATION.cff * update version in CITATION * update title * fix label * Update src/alignment/__init__.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * make style * add Alvaro Bartolome * update version in readme --------- Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>	2024-07-30 10:56:17 +02:00
Kashif Rasul	95dc47218c	update API to use latest TRL (#182 ) * update API * update deepspeed * update black * remove unused import * fix typos * fix typos in readmes * fix grammer * removed as it exists in superclass * fixes in readme * Update README.md Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * add back dataset_kwargs * use hub_model_revision in sft and dpo * fix duplicate --------- Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>	2024-07-30 09:16:25 +02:00
Chansung Park	606d2e954f	Add fsdp+qlora support (#160 )	2024-05-08 15:08:13 +02:00
Zizheng Yang	84f8c92820	Update README.md (#152 ) If use 2.3.6, there will be an error ImportError: /root/miniconda3/envs/handbook/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops9_pad_enum4callERKNS_6TensorEN3c108ArrayRefINS5_6SymIntEEElNS5_8optionalIdEE If we use the newest flash_attn version, there will be no trouble!	2024-04-25 10:36:22 +02:00
Alvaro Bartolome	cf1975a7cb	Add ORPO within `README.md` files (#154 ) * Add `ORPO` within `scripts/README.md` * Fix typo in `ModelArguments.base_model_revision` * Add `ORPO` within `README.md` * Add Zephyr 141B in "News" section	2024-04-25 10:35:45 +02:00
Alvaro Bartolome	70769f9e9b	Add `run_orpo.py` (#143 ) * Add `ORPOConfig` * Add `task=orpo` and support `(prompt,chosen,rejected)` datasets * Add missing `model_init_kwargs` and `dataset_num_proc` * Add `run_orpo.py` (WIP) * Update `trl` dependency from source * Add `setup_chat_format` before `apply_chat_template` * Add `config_full.yaml` for `mistral-7b-orpo` * Fix comment indentation * Use `chat_template=chatml` instead * Add `kaist-ai/mistral-orpo-capybara-7k` recipe * Rename `DPOTrainer` to `ORPOTrainer` in `config_full.yaml` files * Run `black --line-length 119 src` * Add `is_openai_format` to fix `(prompt,chosen,rejected)` formatting * Run `black --line-length 119 src` * Fix `isort` in `run_orpo.py` * Update `mistral-capybara/orpo/config_full.yaml` * Check if `test` is available split * Pin `trl` to `alvarobartt/trl` fork (debugging) * Add `qwen-capybara` recipe * Update `mistral-capybara` recipe * Set `add_generation_prompt=True` if `task="orpo"` * Reduce `logging_steps` to 10 * Unset `add_generation_prompt` when `task=orpo` * Add filtering based on prompt length Done similarly to the original implementation, in order to better reproduce their results * Fix prompt length filtering * Update `trl` pinned version * Remove extra outdate config files * Update `recipes/mistral-capybara/orpo/config_full.yaml` * Run `make style` * Activate BEAST MODE * Pin deps * Add readme * Fix dep --------- Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>	2024-04-11 16:02:20 +02:00
kykim0	a83b1f617f	Fix the logic that causes an issue with philschmid/gemma-tokenizer-chatml tokenizer (#146 ) The `setup_chat_format()` logic should not be applied to philschmid/gemma-tokenizer-chatml tokenizer, otherwise gemma models are trained w/o proper bos, eos tokens.	2024-04-09 17:02:21 +02:00
Qingqing Cao	8497caeaf1	fix trust_remote_code for tokenizer in model_utils.py (#140 ) `trust_remote_code` option is only added to models, adding it to tokenizers to be consistent, which will also fix the error when the tokenizer is loaded from the remote repo	2024-03-27 19:31:45 +01:00
Bram Vanroy	ba7e0e4fca	Fix dataloading for cpt (#137 ) * avpid mutable parameter * do not remove text_column for cpt * fix typo * add * remove constant KEEPCOLS * update tests with columns_to_keep	2024-03-21 20:05:53 +01:00
Sergei Bogdanov	c44cb1cd1d	fix: Zephyr LoRA fine-tuning fixed (#139 ) Co-authored-by: svbogdanov <sergei@numind.ai>	2024-03-21 19:28:31 +01:00
Bram Vanroy	595023faa4	Adding continued_pretraining task (#131 ) * add continued pretraining script * simplify config; add dataset_config option * add ds configs in data mixer creator * use extended sftconfig * add option to avoid setting chat template * fix data_configs bug * add continued pretraining info * add gpt2-nl recipe for continued pretraining example * add final newline * make style * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update recipes/gpt2-nl/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * rename continued pretraining to cpt * improve README --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-03-14 15:15:23 +01:00
lewtun	a9b8a50a27	🌟 (#135 ) * Add StarChat2 * Add DPO * Fix unit test * Typos * Typo	2024-03-12 17:22:21 +01:00
lewtun	ff618a4d13	🪁 (#129 ) * Add Gemma 7B recipe * Use Gemma template * Make it work for dolly lol * Enable cahce * Clean up * DPO to the max * DPO, DPO, DPO * Add openhermes * Add custom configs * Add kwargs * Fix config * Bump deps * Move old recipes * Add doc * Add norte * Renable cache * Nuke * Clean * Apply suggestions from code review Co-authored-by: Alvaro Bartolome <alvaro@argilla.io> * Fix isort * Update README.md * Update config_full.yaml --------- Co-authored-by: Alvaro Bartolome <alvaro@argilla.io> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2024-03-01 17:29:42 +01:00
Bram Vanroy	d17fd7cd3b	Add `auto_insert_empty_system_msg` config flag (#123 ) * Make system messages optional Also use the `maybe_insert_system_message` in dpo setting * add `auto_insert_empty_system_msg` flag * add `auto_insert_empty_system_msg` * add auto_insert_empty_system_msg * Update src/alignment/configs.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * make style --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-02-28 20:05:44 +01:00
lewtun	87cc800498	Apply quantization during DPO QLoRA (#115 ) * Add QLoRA fix * Update script	2024-02-05 16:50:17 +01:00
Ikko Eltociear Ashimine	d00e6f043e	Update README.md (#113 ) evalutions -> evaluations	2024-02-02 09:20:20 +01:00
Kosti	b4bd3a4984	Blog post url: "constitutional-ai" -> "constitutional_ai" (#112 )	2024-02-01 09:21:53 -08:00
lewtun	995d50912b	Update README.md (#111 ) * Update README.md * Update README.md	2024-02-01 17:02:43 +01:00
Costa Huang	8df2271324	Constitutional AI recipe (#108 ) * cai * add training configuration * update readme * Update recipes/cai/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update recipes/cai/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update recipes/cai/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update recipes/cai/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update recipes/cai/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * rename * update * rename * Quick change --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-02-01 07:02:19 -08:00
Traun Leyden	5ad6db0c79	Fixes #96 by handling RepositoryNotFoundError (#97 ) * Fixes #96 by handling RepositoryNotFoundError * Update src/alignment/model_utils.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Remove redundant code * Add unit test * Reformat file * make style --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-02-01 15:47:14 +01:00
Nathan Azrak	ad3d43aeea	Make peft `bnb_4bit_compute_dtype` consistent with `torch_dtype` (#107 ) Co-authored-by: Nathan Azrak <nazrak@atlassian.com>	2024-01-29 11:59:15 +01:00
Nathan Azrak	de7d8883cd	Add check before inserting system message (#106 ) * add check before inserting system message * change in-place for consistency * fix unit test --------- Co-authored-by: Nathan Azrak <nazrak@atlassian.com>	2024-01-29 11:56:24 +01:00
Edward Beeching	cbcb3f60fb	DPO/IPO/KTO ablations (#104 ) * adds configs and readme * cleaning config files * fix typos and removes things from config * updates text to use comparisons rather ablations * fix readme and adds launch script * fix launch script, adds blogpost link * bump release version, added missing dep, fixes configs * updates main readme file	2024-01-18 14:55:00 +01:00
lewtun	c74ed11171	Bump lower version of huggingface_hub (#95 ) * Bump lower version of huggingface_hub * Fix dep	2024-01-11 23:09:48 +11:00
lewtun	f0ffa0d7a6	Update Zephyr configs to account for UltraFeedback & TRL fixes (#88 ) * Add files * Add checkpointing * Add checkpointing to SFT * Add loss type * Fix setup\| * Clean SFT * Add lora config * Rename config * Remove max eval samples * Add kwargs tp push to hub * Add DPO configs * Fix dpo configs * Extend chat template test to multi-turn * Add warmup * Refactor * Fix LoRA -> QLoRA * Fix configs * Specify chat template * Add sample logging * Fix push to hub hanging * Add reentrant * Fix quality * Add transformer logging * Tweak grad acc * Add null type * Add doc	2024-01-10 17:42:24 +11:00
Nathan Azrak	c69ae4b8a5	Check that `default_chat_template` is also None (#83 ) * Check that `default_chat_template` is also None before overwriting chat template * add unit test to `get_tokenizer` to ensure default behaviour of chat template is not changed --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-01-08 17:54:23 +11:00
Kirill	98fe28fb14	Clean deprecated max_samples arguments (#89 )	2024-01-05 09:06:47 +11:00
Evgenii Zheltonozhskii	e316174e1c	Add warmup to config (#71 ) Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-01-04 16:04:46 +11:00
NielsRogge	57508b5c2d	Make SFT script consistent with DPO script (#86 ) * Add argument * Make scripts consistent * Fix style --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-01-04 15:55:58 +11:00
Lewis Tunstall	8f6e5b666b	Bump dev version v0.3.0.dev	2024-01-04 01:39:11 +00:00
Scott Fleming	61a11a5c7d	Update docstring for `data.py` to reflect true behavior of `shuffle` parameter (#60 ) * Update data.py The docs state that the `shuffle` parameter in `mix_datasets` from `data.py` controls `Whether to shuffle the training data`, but then in the code if `shuffle` is set to `True` it also shuffles the test data. This small change makes the functionality consistent with the docstring. (If you instead want to keep the functionality the same, then we should update the docstring). * Update data.py Reverted to the original code structure but updated the docstring. * Update docstring in `get_dataset` and `mix_datasets` Updated docstrings to reflect the fact that `shuffle` being set to `True` leads to shuffling of both the training and testing/validation data.	2023-12-06 10:44:17 +01:00
lewtun	1c06e4e5e1	Update doc CI (#64 )	2023-12-05 12:31:30 +01:00
Nathan Azrak	3f368a0748	Add check that parameters are not intended to be offloaded (#51 ) * Add check that parameters are not intended to be offloaded * Only push model to device if quantization config is set.	2023-12-04 09:10:41 +01:00
Dragan Milchevski	15279e7157	Allow loading datasets from disk using `load_from_disk` method. (#53 ) * feat: Allow loading datasets from disk using `load_from_disk` method. * Fixing the type of error being catched.	2023-12-01 11:05:35 +01:00
Dragan Milchevski	80e952ec47	Allow running DPO from a local model (#49 ) * Update model_utils.py Check if a model is adapter model when a local path is supplied instead of HF model * Cleaner solution, thanks to lewtun	2023-11-27 11:31:09 +01:00
Thomas Capelle	f025057ce4	Missing config params on SFT (#31 ) * fix warmup with total number of steps * Explicitely tell to use 80GB Gpus * Revert "fix warmup with total number of steps" This reverts commit 760e477efdbf7f67be766a0d43b0c3b2ac26947a.	2023-11-21 12:00:09 +01:00
Alvaro Bartolome	c9d9035f95	Fix `apply_chat_template` function for `dpo` and unknown `task` (#30 ) * Fix `apply_chat_template` function for `dpo` and unknown `task` * Remove duplicated `# coding=utf-8` * Manually run `black --line-length 119`	2023-11-21 11:47:21 +01:00
Girraj Jangid	7d6fe765ec	Update README.md (#35 ) update installation instruction. Added git cmd	2023-11-20 08:52:16 +01:00
Alvaro Bartolome	0e09b0c6ec	Fix note syntax highlighting in `README.md` (#20 )	2023-11-15 08:45:27 +01:00
lewtun	a1afb2bbd4	Fix image alignment (#19 )	2023-11-12 15:47:10 +01:00
lewtun	4c6226bc42	Add moar explanations (#18 )	2023-11-12 15:43:39 +01:00
Kashif Rasul	4b0c1fe170	fix typos (#17 )	2023-11-12 13:44:50 +01:00
lewtun	43f52224db	Merge pull request #14 from sebastianschramm/ses/fix_typos_zephyr_recipe Resolves #13 fix typo in zephyr recipe readme	2023-11-10 17:06:35 +01:00
Sebastian Schramm	d48a4a477b	Resolves #13 fix typo in zephyr recipe readme	2023-11-10 16:57:39 +01:00
lewtun	e4f98e7d8f	Merge pull request #11 from huggingface/zephyr-recipe Code release	2023-11-10 15:54:13 +01:00
lewtun	363e29ff95	Apply suggestions from code review Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2023-11-10 15:49:00 +01:00
Lewis Tunstall	f5e70fbf9e	Add licenses	2023-11-10 14:47:54 +00:00
Lewis Tunstall	5a630a1989	Add QLoRA command	2023-11-10 13:57:52 +00:00
Lewis Tunstall	e2e8ab945d	Refactor imports	2023-11-10 13:38:45 +00:00


98 Commits