alignment-handbook

mirror of https://github.com/wassname/alignment-handbook.git synced 2026-06-27 18:57:58 +08:00

Author	SHA1	Message	Date
wassname	a264efaa4c	better formating	2025-06-03 22:21:18 +00:00
wassname	097e4e0b01	wip	2025-06-02 22:31:52 +00:00
wassname	880d4eda1e	chat template fix	2025-06-02 07:27:46 +00:00
Kashif Rasul	01f29c1325	remove revision (#186)	2024-07-31 21:23:10 +02:00
Kashif Rasul	95dc47218c	update API to use latest TRL (#182) * update API * update deepspeed * update black * remove unused import * fix typos * fix typos in readmes * fix grammer * removed as it exists in superclass * fixes in readme * Update README.md Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * add back dataset_kwargs * use hub_model_revision in sft and dpo * fix duplicate --------- Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>	2024-07-30 09:16:25 +02:00
kykim0	a83b1f617f	Fix the logic that causes an issue with philschmid/gemma-tokenizer-chatml tokenizer (#146) The `setup_chat_format()` logic should not be applied to philschmid/gemma-tokenizer-chatml tokenizer, otherwise gemma models are trained w/o proper bos, eos tokens.	2024-04-09 17:02:21 +02:00
Bram Vanroy	ba7e0e4fca	Fix dataloading for cpt (#137) * avpid mutable parameter * do not remove text_column for cpt * fix typo * add * remove constant KEEPCOLS * update tests with columns_to_keep	2024-03-21 20:05:53 +01:00
Bram Vanroy	595023faa4	Adding continued_pretraining task (#131) * add continued pretraining script * simplify config; add dataset_config option * add ds configs in data mixer creator * use extended sftconfig * add option to avoid setting chat template * fix data_configs bug * add continued pretraining info * add gpt2-nl recipe for continued pretraining example * add final newline * make style * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update recipes/gpt2-nl/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * rename continued pretraining to cpt * improve README --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-03-14 15:15:23 +01:00
lewtun	a9b8a50a27	🌟 (#135) * Add StarChat2 * Add DPO * Fix unit test * Typos * Typo	2024-03-12 17:22:21 +01:00
lewtun	ff618a4d13	🪁 (#129) * Add Gemma 7B recipe * Use Gemma template * Make it work for dolly lol * Enable cahce * Clean up * DPO to the max * DPO, DPO, DPO * Add openhermes * Add custom configs * Add kwargs * Fix config * Bump deps * Move old recipes * Add doc * Add norte * Renable cache * Nuke * Clean * Apply suggestions from code review Co-authored-by: Alvaro Bartolome <alvaro@argilla.io> * Fix isort * Update README.md * Update config_full.yaml --------- Co-authored-by: Alvaro Bartolome <alvaro@argilla.io> Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2024-03-01 17:29:42 +01:00
Bram Vanroy	d17fd7cd3b	Add `auto_insert_empty_system_msg` config flag (#123) * Make system messages optional Also use the `maybe_insert_system_message` in dpo setting * add `auto_insert_empty_system_msg` flag * add `auto_insert_empty_system_msg` * add auto_insert_empty_system_msg * Update src/alignment/configs.py Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * make style --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-02-28 20:05:44 +01:00
lewtun	f0ffa0d7a6	Update Zephyr configs to account for UltraFeedback & TRL fixes (#88) * Add files * Add checkpointing * Add checkpointing to SFT * Add loss type * Fix setup\| * Clean SFT * Add lora config * Rename config * Remove max eval samples * Add kwargs tp push to hub * Add DPO configs * Fix dpo configs * Extend chat template test to multi-turn * Add warmup * Refactor * Fix LoRA -> QLoRA * Fix configs * Specify chat template * Add sample logging * Fix push to hub hanging * Add reentrant * Fix quality * Add transformer logging * Tweak grad acc * Add null type * Add doc	2024-01-10 17:42:24 +11:00
Kirill	98fe28fb14	Clean deprecated max_samples arguments (#89)	2024-01-05 09:06:47 +11:00
NielsRogge	57508b5c2d	Make SFT script consistent with DPO script (#86) * Add argument * Make scripts consistent * Fix style --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-01-04 15:55:58 +11:00
Nathan Azrak	3f368a0748	Add check that parameters are not intended to be offloaded (#51) * Add check that parameters are not intended to be offloaded * Only push model to device if quantization config is set.	2023-12-04 09:10:41 +01:00
Lewis Tunstall	33a0ce3afd	Add more doc	2023-11-09 13:39:03 +00:00
Lewis Tunstall	e54e095978	Make it work for realz	2023-11-08 22:20:17 +00:00
Lewis Tunstall	d2900adc83	Make it work!	2023-11-08 16:31:57 +00:00
Lewis Tunstall	967eab4cfb	Add skeleton	2023-11-08 13:21:57 +00:00

19 Commits