Alvaro Bartolome
|
70769f9e9b
|
Add run_orpo.py (#143)
* Add `ORPOConfig`
* Add `task=orpo` and support `(prompt,chosen,rejected)` datasets
* Add missing `model_init_kwargs` and `dataset_num_proc`
* Add `run_orpo.py` (WIP)
* Update `trl` dependency from source
* Add `setup_chat_format` before `apply_chat_template`
* Add `config_full.yaml` for `mistral-7b-orpo`
* Fix comment indentation
* Use `chat_template=chatml` instead
* Add `kaist-ai/mistral-orpo-capybara-7k` recipe
* Rename `DPOTrainer` to `ORPOTrainer` in `config_full.yaml` files
* Run `black --line-length 119 src`
* Add `is_openai_format` to fix `(prompt,chosen,rejected)` formatting
* Run `black --line-length 119 src`
* Fix `isort` in `run_orpo.py`
* Update `mistral-capybara/orpo/config_full.yaml`
* Check if `test` is an available split
* Pin `trl` to `alvarobartt/trl` fork (debugging)
* Add `qwen-capybara` recipe
* Update `mistral-capybara` recipe
* Set `add_generation_prompt=True` if `task="orpo"`
* Reduce `logging_steps` to 10
* Unset `add_generation_prompt` when `task=orpo`
* Add filtering based on prompt length
Done as in the original implementation, to better reproduce its results
* Fix prompt length filtering
* Update `trl` pinned version
* Remove extra outdated config files
* Update `recipes/mistral-capybara/orpo/config_full.yaml`
* Run `make style`
* Activate BEAST MODE
* Pin deps
* Add readme
* Fix dep
---------
Co-authored-by: Lewis Tunstall <lewis.c.tunstall@gmail.com>
|
2024-04-11 16:02:20 +02:00 |
|