alignment-handbook

mirror of https://github.com/wassname/alignment-handbook.git synced 2026-07-02 00:25:22 +08:00

Author	SHA1	Message	Date
Bram Vanroy	595023faa4	Adding continued_pretraining task (#131) * add continued pretraining script * simplify config; add dataset_config option * add ds configs in data mixer creator * use extended sftconfig * add option to avoid setting chat template * fix data_configs bug * add continued pretraining info * add gpt2-nl recipe for continued pretraining example * add final newline * make style * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update recipes/gpt2-nl/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * rename continued pretraining to cpt * improve README --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-03-14 15:15:23 +01:00
lewtun	a9b8a50a27	🌟 (#135) * Add StarChat2 * Add DPO * Fix unit test * Typos * Typo	2024-03-12 17:22:21 +01:00
lewtun	ff618a4d13	🪁 (#129) * Add Gemma 7B recipe * Use Gemma template * Make it work for dolly lol * Enable cahce * Clean up * DPO to the max * DPO, DPO, DPO * Add openhermes * Add custom configs * Add kwargs * Fix config * Bump deps * Move old recipes * Add doc * Add norte * Renable cache * Nuke * Clean * Apply suggestions from code review Co-authored-by: Alvaro Bartolome <alvaro@argilla.io> * Fix isort * Update README.md * Update config_full.yaml --------- Co-authored-by: Alvaro Bartolome <alvaro@argilla.io> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2024-03-01 17:29:42 +01:00
Bram Vanroy	d17fd7cd3b	Add `auto_insert_empty_system_msg` config flag (#123) * Make system messages optional Also use the `maybe_insert_system_message` in dpo setting * add `auto_insert_empty_system_msg` flag * add `auto_insert_empty_system_msg` * add auto_insert_empty_system_msg * Update src/alignment/configs.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * make style --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-02-28 20:05:44 +01:00
Nathan Azrak	de7d8883cd	Add check before inserting system message (#106) * add check before inserting system message * change in-place for consistency * fix unit test --------- Co-authored-by: Nathan Azrak <nazrak@atlassian.com>	2024-01-29 11:56:24 +01:00
lewtun	f0ffa0d7a6	Update Zephyr configs to account for UltraFeedback & TRL fixes (#88) * Add files * Add checkpointing * Add checkpointing to SFT * Add loss type * Fix setup\| * Clean SFT * Add lora config * Rename config * Remove max eval samples * Add kwargs tp push to hub * Add DPO configs * Fix dpo configs * Extend chat template test to multi-turn * Add warmup * Refactor * Fix LoRA -> QLoRA * Fix configs * Specify chat template * Add sample logging * Fix push to hub hanging * Add reentrant * Fix quality * Add transformer logging * Tweak grad acc * Add null type * Add doc	2024-01-10 17:42:24 +11:00
Scott Fleming	61a11a5c7d	Update docstring for `data.py` to reflect true behavior of `shuffle` parameter (#60) * Update data.py The docs state that the `shuffle` parameter in `mix_datasets` from `data.py` controls `Whether to shuffle the training data`, but then in the code if `shuffle` is set to `True` it also shuffles the test data. This small change makes the functionality consistent with the docstring. (If you instead want to keep the functionality the same, then we should update the docstring). * Update data.py Reverted to the original code structure but updated the docstring. * Update docstring in `get_dataset` and `mix_datasets` Updated docstrings to reflect the fact that `shuffle` being set to `True` leads to shuffling of both the training and testing/validation data.	2023-12-06 10:44:17 +01:00
Dragan Milchevski	15279e7157	Allow loading datasets from disk using `load_from_disk` method. (#53) * feat: Allow loading datasets from disk using `load_from_disk` method. * Fixing the type of error being catched.	2023-12-01 11:05:35 +01:00
Alvaro Bartolome	c9d9035f95	Fix `apply_chat_template` function for `dpo` and unknown `task` (#30) * Fix `apply_chat_template` function for `dpo` and unknown `task` * Remove duplicated `# coding=utf-8` * Manually run `black --line-length 119`	2023-11-21 11:47:21 +01:00
Lewis Tunstall	f5e70fbf9e	Add licenses	2023-11-10 14:47:54 +00:00
Lewis Tunstall	2ed5a45d25	Add model utils tests	2023-11-10 09:42:15 +00:00
Lewis Tunstall	967eab4cfb	Add skeleton	2023-11-08 13:21:57 +00:00

12 Commits