alignment-handbook

mirror of https://github.com/wassname/alignment-handbook.git synced 2026-06-27 17:29:09 +08:00

Author	SHA1	Message	Date
lewtun	c74ed11171	Bump lower version of huggingface_hub (#95 ) * Bump lower version of huggingface_hub * Fix dep	2024-01-11 23:09:48 +11:00
lewtun	f0ffa0d7a6	Update Zephyr configs to account for UltraFeedback & TRL fixes (#88 ) * Add files * Add checkpointing * Add checkpointing to SFT * Add loss type * Fix setup\| * Clean SFT * Add lora config * Rename config * Remove max eval samples * Add kwargs tp push to hub * Add DPO configs * Fix dpo configs * Extend chat template test to multi-turn * Add warmup * Refactor * Fix LoRA -> QLoRA * Fix configs * Specify chat template * Add sample logging * Fix push to hub hanging * Add reentrant * Fix quality * Add transformer logging * Tweak grad acc * Add null type * Add doc	2024-01-10 17:42:24 +11:00
Nathan Azrak	c69ae4b8a5	Check that `default_chat_template` is also None (#83 ) * Check that `default_chat_template` is also None before overwriting chat template * add unit test to `get_tokenizer` to ensure default behaviour of chat template is not changed --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-01-08 17:54:23 +11:00
Kirill	98fe28fb14	Clean deprecated max_samples arguments (#89 )	2024-01-05 09:06:47 +11:00
Evgenii Zheltonozhskii	e316174e1c	Add warmup to config (#71 ) Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-01-04 16:04:46 +11:00
NielsRogge	57508b5c2d	Make SFT script consistent with DPO script (#86 ) * Add argument * Make scripts consistent * Fix style --------- Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>	2024-01-04 15:55:58 +11:00
Lewis Tunstall	8f6e5b666b	Bump dev version v0.3.0.dev	2024-01-04 01:39:11 +00:00
Scott Fleming	61a11a5c7d	Update docstring for `data.py` to reflect true behavior of `shuffle` parameter (#60 ) * Update data.py The docs state that the `shuffle` parameter in `mix_datasets` from `data.py` controls `Whether to shuffle the training data`, but then in the code if `shuffle` is set to `True` it also shuffles the test data. This small change makes the functionality consistent with the docstring. (If you instead want to keep the functionality the same, then we should update the docstring). * Update data.py Reverted to the original code structure but updated the docstring. * Update docstring in `get_dataset` and `mix_datasets` Updated docstrings to reflect the fact that `shuffle` being set to `True` leads to shuffling of both the training and testing/validation data.	2023-12-06 10:44:17 +01:00
lewtun	1c06e4e5e1	Update doc CI (#64 )	2023-12-05 12:31:30 +01:00
Nathan Azrak	3f368a0748	Add check that parameters are not intended to be offloaded (#51 ) * Add check that parameters are not intended to be offloaded * Only push model to device if quantization config is set.	2023-12-04 09:10:41 +01:00
Dragan Milchevski	15279e7157	Allow loading datasets from disk using `load_from_disk` method. (#53 ) * feat: Allow loading datasets from disk using `load_from_disk` method. * Fixing the type of error being catched.	2023-12-01 11:05:35 +01:00
Dragan Milchevski	80e952ec47	Allow running DPO from a local model (#49 ) * Update model_utils.py Check if a model is adapter model when a local path is supplied instead of HF model * Cleaner solution, thanks to lewtun	2023-11-27 11:31:09 +01:00
Thomas Capelle	f025057ce4	Missing config params on SFT (#31 ) * fix warmup with total number of steps * Explicitely tell to use 80GB Gpus * Revert "fix warmup with total number of steps" This reverts commit 760e477efdbf7f67be766a0d43b0c3b2ac26947a.	2023-11-21 12:00:09 +01:00
Alvaro Bartolome	c9d9035f95	Fix `apply_chat_template` function for `dpo` and unknown `task` (#30 ) * Fix `apply_chat_template` function for `dpo` and unknown `task` * Remove duplicated `# coding=utf-8` * Manually run `black --line-length 119`	2023-11-21 11:47:21 +01:00
Girraj Jangid	7d6fe765ec	Update README.md (#35 ) update installation instruction. Added git cmd	2023-11-20 08:52:16 +01:00
Alvaro Bartolome	0e09b0c6ec	Fix note syntax highlighting in `README.md` (#20 )	2023-11-15 08:45:27 +01:00
lewtun	a1afb2bbd4	Fix image alignment (#19 )	2023-11-12 15:47:10 +01:00
lewtun	4c6226bc42	Add moar explanations (#18 )	2023-11-12 15:43:39 +01:00
Kashif Rasul	4b0c1fe170	fix typos (#17 )	2023-11-12 13:44:50 +01:00
lewtun	43f52224db	Merge pull request #14 from sebastianschramm/ses/fix_typos_zephyr_recipe Resolves #13 fix typo in zephyr recipe readme	2023-11-10 17:06:35 +01:00
Sebastian Schramm	d48a4a477b	Resolves #13 fix typo in zephyr recipe readme	2023-11-10 16:57:39 +01:00
lewtun	e4f98e7d8f	Merge pull request #11 from huggingface/zephyr-recipe Code release	2023-11-10 15:54:13 +01:00
lewtun	363e29ff95	Apply suggestions from code review Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2023-11-10 15:49:00 +01:00
Lewis Tunstall	f5e70fbf9e	Add licenses	2023-11-10 14:47:54 +00:00
Lewis Tunstall	5a630a1989	Add QLoRA command	2023-11-10 13:57:52 +00:00
Lewis Tunstall	e2e8ab945d	Refactor imports	2023-11-10 13:38:45 +00:00
edbeeching	7f1a14e0d4	adds auto adapter merge to dpo script	2023-11-10 14:15:44 +01:00
Lewis Tunstall	54185783e0	Remove QLoRa for now	2023-11-10 11:20:39 +00:00
Lewis Tunstall	edf67d1d93	Tweaks	2023-11-10 11:15:45 +00:00
Lewis Tunstall	551f901f95	Fix dep	2023-11-10 11:02:44 +00:00
Lewis Tunstall	a0b8d49424	Rename recipe	2023-11-10 10:49:13 +00:00
Lewis Tunstall	64f1834e01	Add config tests	2023-11-10 10:00:05 +00:00
Lewis Tunstall	8699f47bf3	Add jinja2 to req deps	2023-11-10 09:45:22 +00:00
lewtun	b1b0c1c8c0	Update setup.py Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>	2023-11-10 10:44:06 +01:00
Lewis Tunstall	2ed5a45d25	Add model utils tests	2023-11-10 09:42:15 +00:00
Lewis Tunstall	0af8011993	Bump deps	2023-11-10 08:41:17 +00:00
Lewis Tunstall	610a1a2de4	Add unit tests for data mixer	2023-11-10 08:37:53 +00:00
edbeeching	0f0b61c096	ups lora bs x grad_acc to 64	2023-11-10 09:30:54 +01:00
edbeeching	13141a4b0b	adds updated model paths, adds eval to sft scripts	2023-11-10 09:26:39 +01:00
Lewis Tunstall	4b0769d137	Fix links	2023-11-09 14:42:57 +00:00
Lewis Tunstall	89f58a043c	Add project structure	2023-11-09 14:40:23 +00:00
Lewis Tunstall	44b324487d	Bump bs	2023-11-09 14:20:43 +00:00
Lewis Tunstall	756bb76d22	Fix Slurm opts	2023-11-09 14:09:52 +00:00
Lewis Tunstall	33a0ce3afd	Add more doc	2023-11-09 13:39:03 +00:00
edbeeching	3a5430222e	removes need for yq dep	2023-11-09 13:04:34 +01:00
edbeeching	49da3ef739	adds configs and instructions for lora training	2023-11-09 10:56:25 +01:00
Lewis Tunstall	2de17f5ba1	Add doc	2023-11-09 07:32:24 +00:00
Lewis Tunstall	e2c19a0252	Tweak	2023-11-08 23:09:16 +00:00
Lewis Tunstall	ee10c4efd9	Make DPO work!	2023-11-08 22:58:34 +00:00
Lewis Tunstall	e54e095978	Make it work for realz	2023-11-08 22:20:17 +00:00

74 Commits