alignment-handbook

mirror of https://github.com/wassname/alignment-handbook.git synced 2026-06-27 19:46:04 +08:00

Author	SHA1	Message	Date
Kashif Rasul	95dc47218c	update API to use latest TRL (#182) * update API * update deepspeed * update black * remove unused import * fix typos * fix typos in readmes * fix grammer * removed as it exists in superclass * fixes in readme * Update README.md Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * Update src/alignment/configs.py Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com> * add back dataset_kwargs * use hub_model_revision in sft and dpo * fix duplicate --------- Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>	2024-07-30 09:16:25 +02:00
Chansung Park	606d2e954f	Add fsdp+qlora support (#160)	2024-05-08 15:08:13 +02:00
Alvaro Bartolome	cf1975a7cb	Add ORPO within `README.md` files (#154) * Add `ORPO` within `scripts/README.md` * Fix typo in `ModelArguments.base_model_revision` * Add `ORPO` within `README.md` * Add Zephyr 141B in "News" section	2024-04-25 10:35:45 +02:00
Bram Vanroy	595023faa4	Adding continued_pretraining task (#131) * add continued pretraining script * simplify config; add dataset_config option * add ds configs in data mixer creator * use extended sftconfig * add option to avoid setting chat template * fix data_configs bug * add continued pretraining info * add gpt2-nl recipe for continued pretraining example * add final newline * make style * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * Update recipes/gpt2-nl/README.md Co-authored-by: lewtun <lewis.c.tunstall@gmail.com> * rename continued pretraining to cpt * improve README --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-03-14 15:15:23 +01:00
lewtun	f0ffa0d7a6	Update Zephyr configs to account for UltraFeedback & TRL fixes (#88) * Add files * Add checkpointing * Add checkpointing to SFT * Add loss type * Fix setup\| * Clean SFT * Add lora config * Rename config * Remove max eval samples * Add kwargs tp push to hub * Add DPO configs * Fix dpo configs * Extend chat template test to multi-turn * Add warmup * Refactor * Fix LoRA -> QLoRA * Fix configs * Specify chat template * Add sample logging * Fix push to hub hanging * Add reentrant * Fix quality * Add transformer logging * Tweak grad acc * Add null type * Add doc	2024-01-10 17:42:24 +11:00
lewtun	4c6226bc42	Add moar explanations (#18)	2023-11-12 15:43:39 +01:00
Kashif Rasul	4b0c1fe170	fix typos (#17)	2023-11-12 13:44:50 +01:00
Lewis Tunstall	5a630a1989	Add QLoRA command	2023-11-10 13:57:52 +00:00
Lewis Tunstall	e2e8ab945d	Refactor imports	2023-11-10 13:38:45 +00:00
Lewis Tunstall	54185783e0	Remove QLoRa for now	2023-11-10 11:20:39 +00:00
Lewis Tunstall	edf67d1d93	Tweaks	2023-11-10 11:15:45 +00:00
Lewis Tunstall	a0b8d49424	Rename recipe	2023-11-10 10:49:13 +00:00
Lewis Tunstall	756bb76d22	Fix Slurm opts	2023-11-09 14:09:52 +00:00
Lewis Tunstall	33a0ce3afd	Add more doc	2023-11-09 13:39:03 +00:00
Lewis Tunstall	2de17f5ba1	Add doc	2023-11-09 07:32:24 +00:00
Lewis Tunstall	d2900adc83	Make it work!	2023-11-08 16:31:57 +00:00

16 Commits