43 Commits

Author SHA1 Message Date
wassname 6d128ea986 wip 2025-06-04 05:37:07 +00:00
wassname a264efaa4c better formatting 2025-06-03 22:21:18 +00:00
wassname 097e4e0b01 wip 2025-06-02 22:31:52 +00:00
wassname 880d4eda1e chat template fix 2025-06-02 07:27:46 +00:00
wassname 2819dd46d0 fmt 2025-06-02 07:13:52 +00:00
wassname fc7d4ed451 configs 2025-06-02 06:20:04 +00:00
wassname 8708597941 wip 2025-06-02 05:51:13 +00:00
Loubna Ben Allal ae3f44fc7d Add Smollm2 pipeline (#205)
* add smollm2 pipeline

* update readme
2024-11-21 13:46:39 +01:00
Loubna Ben Allal 73dce0c35d Add Smollm (#194)
* add smollm

* add to news
2024-08-19 08:47:20 +02:00
Kashif Rasul 95dc47218c update API to use latest TRL (#182)
* update API

* update deepspeed

* update black

* remove unused import

* fix typos

* fix typos in readmes

* fix grammar

* removed as it exists in superclass

* fixes in readme

* Update README.md

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>

* Update src/alignment/configs.py

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>

* Update src/alignment/configs.py

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>

* Update src/alignment/configs.py

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>

* Update src/alignment/configs.py

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>

* add back dataset_kwargs

* use hub_model_revision in sft and dpo

* fix duplicate

---------

Co-authored-by: Alvaro Bartolome <36760800+alvarobartt@users.noreply.github.com>
2024-07-30 09:16:25 +02:00
Chansung Park 606d2e954f Add fsdp+qlora support (#160) 2024-05-08 15:08:13 +02:00
Alvaro Bartolome 70769f9e9b Add run_orpo.py (#143)
* Add `ORPOConfig`

* Add `task=orpo` and support `(prompt,chosen,rejected)` datasets

* Add missing `model_init_kwargs` and `dataset_num_proc`

* Add `run_orpo.py` (WIP)

* Update `trl` dependency from source

* Add `setup_chat_format` before `apply_chat_template`

* Add `config_full.yaml` for `mistral-7b-orpo`

* Fix comment indentation

* Use `chat_template=chatml` instead

* Add `kaist-ai/mistral-orpo-capybara-7k` recipe

* Rename `DPOTrainer` to `ORPOTrainer` in `config_full.yaml` files

* Run `black --line-length 119 src`

* Add `is_openai_format` to fix `(prompt,chosen,rejected)` formatting

* Run `black --line-length 119 src`

* Fix `isort` in `run_orpo.py`

* Update `mistral-capybara/orpo/config_full.yaml`

* Check if `test` is available split

* Pin `trl` to `alvarobartt/trl` fork (debugging)

* Add `qwen-capybara` recipe

* Update `mistral-capybara` recipe

* Set `add_generation_prompt=True` if `task="orpo"`

* Reduce `logging_steps` to 10

* Unset `add_generation_prompt` when `task=orpo`

* Add filtering based on prompt length

Done similarly to the original implementation, in order to better reproduce their results

* Fix prompt length filtering

* Update `trl` pinned version

* Remove extra outdated config files

* Update `recipes/mistral-capybara/orpo/config_full.yaml`

* Run `make style`

* Activate BEAST MODE

* Pin deps

* Add readme

* Fix dep

---------

Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
2024-04-11 16:02:20 +02:00
Sergei Bogdanov c44cb1cd1d fix: Zephyr LoRA fine-tuning fixed (#139)
Co-authored-by: svbogdanov <sergei@numind.ai>
2024-03-21 19:28:31 +01:00
Bram Vanroy 595023faa4 Adding continued_pretraining task (#131)
* add continued pretraining script

* simplify config; add dataset_config option

* add ds configs in data mixer creator

* use extended sftconfig

* add option to avoid setting chat template

* fix data_configs bug

* add continued pretraining info

* add gpt2-nl recipe for continued pretraining example

* add final newline

* make style

* Update README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update recipes/gpt2-nl/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* rename continued pretraining to cpt

* improve README

---------

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2024-03-14 15:15:23 +01:00
lewtun a9b8a50a27 🌟 (#135)
* Add StarChat2

* Add DPO

* Fix unit test

* Typos

* Typo
2024-03-12 17:22:21 +01:00
lewtun ff618a4d13 🪁 (#129)
* Add Gemma 7B recipe

* Use Gemma template

* Make it work for dolly lol

* Enable cache

* Clean up

* DPO to the max

* DPO, DPO, DPO

* Add openhermes

* Add custom configs

* Add kwargs

* Fix config

* Bump deps

* Move old recipes

* Add doc

* Add norte

* Re-enable cache

* Nuke

* Clean

* Apply suggestions from code review

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>

* Fix isort

* Update README.md

* Update config_full.yaml

---------

Co-authored-by: Alvaro Bartolome <alvaro@argilla.io>
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
2024-03-01 17:29:42 +01:00
lewtun 87cc800498 Apply quantization during DPO QLoRA (#115)
* Add QLoRA fix

* Update script
2024-02-05 16:50:17 +01:00
Costa Huang 8df2271324 Constitutional AI recipe (#108)
* cai

* add training configuration

* update readme

* Update recipes/cai/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update recipes/cai/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update recipes/cai/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update recipes/cai/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* Update recipes/cai/README.md

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>

* rename

* update

* rename

* Quick change

---------

Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2024-02-01 07:02:19 -08:00
Edward Beeching cbcb3f60fb DPO/IPO/KTO ablations (#104)
* adds configs and readme

* cleaning config files

* fix typos and removes things from config

* updates text to use comparisons rather than ablations

* fix readme and adds launch script

* fix launch script, adds blogpost link

* bump release version, added missing dep, fixes configs

* updates main readme file
2024-01-18 14:55:00 +01:00
lewtun f0ffa0d7a6 Update Zephyr configs to account for UltraFeedback & TRL fixes (#88)
* Add files

* Add checkpointing

* Add checkpointing to SFT

* Add loss type

* Fix setup

* Clean SFT

* Add lora config

* Rename config

* Remove max eval samples

* Add kwargs to push to hub

* Add DPO configs

* Fix dpo configs

* Extend chat template test to multi-turn

* Add warmup

* Refactor

* Fix LoRA -> QLoRA

* Fix configs

* Specify chat template

* Add sample logging

* Fix push to hub hanging

* Add reentrant

* Fix quality

* Add transformer logging

* Tweak grad acc

* Add null type

* Add doc
2024-01-10 17:42:24 +11:00
Evgenii Zheltonozhskii e316174e1c Add warmup to config (#71)
Co-authored-by: lewtun <lewis.c.tunstall@gmail.com>
2024-01-04 16:04:46 +11:00
Thomas Capelle f025057ce4 Missing config params on SFT (#31)
* fix warmup with total number of steps

* Explicitly tell to use 80GB GPUs

* Revert "fix warmup with total number of steps"

This reverts commit 760e477efdbf7f67be766a0d43b0c3b2ac26947a.
2023-11-21 12:00:09 +01:00
lewtun 4c6226bc42 Add moar explanations (#18) 2023-11-12 15:43:39 +01:00
Sebastian Schramm d48a4a477b Resolves #13 fix typo in zephyr recipe readme 2023-11-10 16:57:39 +01:00
Lewis Tunstall e2e8ab945d Refactor imports 2023-11-10 13:38:45 +00:00
Lewis Tunstall edf67d1d93 Tweaks 2023-11-10 11:15:45 +00:00
Lewis Tunstall a0b8d49424 Rename recipe 2023-11-10 10:49:13 +00:00
edbeeching 0f0b61c096 ups lora bs x grad_acc to 64 2023-11-10 09:30:54 +01:00
edbeeching 13141a4b0b adds updated model paths, adds eval to sft scripts 2023-11-10 09:26:39 +01:00
Lewis Tunstall 4b0769d137 Fix links 2023-11-09 14:42:57 +00:00
Lewis Tunstall 44b324487d Bump bs 2023-11-09 14:20:43 +00:00
Lewis Tunstall 756bb76d22 Fix Slurm opts 2023-11-09 14:09:52 +00:00
Lewis Tunstall 33a0ce3afd Add more doc 2023-11-09 13:39:03 +00:00
edbeeching 3a5430222e removes need for yq dep 2023-11-09 13:04:34 +01:00
edbeeching 49da3ef739 adds configs and instructions for lora training 2023-11-09 10:56:25 +01:00
Lewis Tunstall 2de17f5ba1 Add doc 2023-11-09 07:32:24 +00:00
Lewis Tunstall e2c19a0252 Tweak 2023-11-08 23:09:16 +00:00
Lewis Tunstall ee10c4efd9 Make DPO work! 2023-11-08 22:58:34 +00:00
Lewis Tunstall e54e095978 Make it work for realz 2023-11-08 22:20:17 +00:00
Lewis Tunstall d2900adc83 Make it work! 2023-11-08 16:31:57 +00:00
Lewis Tunstall 967eab4cfb Add skeleton 2023-11-08 13:21:57 +00:00
Lewis Tunstall 8197fe1b1e Update readme 2023-10-09 16:46:56 +02:00
Lewis Tunstall 8903d4aff8 Add skeleton structure 2023-08-29 09:33:26 +02:00