Nathan Azrak
3f368a0748
Add check that parameters are not intended to be offloaded (#51)
...
* Add check that parameters are not intended to be offloaded
* Only push model to device if quantization config is set.
2023-12-04 09:10:41 +01:00
Dragan Milchevski
15279e7157
Allow loading datasets from disk using load_from_disk method. (#53)
...
* feat: Allow loading datasets from disk using `load_from_disk` method.
* Fixing the type of error being caught.
2023-12-01 11:05:35 +01:00
Dragan Milchevski
80e952ec47
Allow running DPO from a local model (#49)
...
* Update model_utils.py
Check if a model is an adapter model when a local path is supplied instead of an HF model
* Cleaner solution, thanks to lewtun
2023-11-27 11:31:09 +01:00
Thomas Capelle
f025057ce4
Missing config params on SFT (#31)
...
* fix warmup with total number of steps
* Explicitly tell to use 80GB GPUs
* Revert "fix warmup with total number of steps"
This reverts commit 760e477efdbf7f67be766a0d43b0c3b2ac26947a.
2023-11-21 12:00:09 +01:00
Alvaro Bartolome
c9d9035f95
Fix apply_chat_template function for dpo and unknown task (#30)
...
* Fix `apply_chat_template` function for `dpo` and unknown `task`
* Remove duplicated `# coding=utf-8`
* Manually run `black --line-length 119`
2023-11-21 11:47:21 +01:00
Girraj Jangid
7d6fe765ec
Update README.md (#35)
...
update installation instruction. Added git cmd
2023-11-20 08:52:16 +01:00
Alvaro Bartolome
0e09b0c6ec
Fix note syntax highlighting in README.md (#20)
2023-11-15 08:45:27 +01:00
lewtun
a1afb2bbd4
Fix image alignment (#19)
2023-11-12 15:47:10 +01:00
lewtun
4c6226bc42
Add moar explanations (#18)
2023-11-12 15:43:39 +01:00
Kashif Rasul
4b0c1fe170
fix typos (#17)
2023-11-12 13:44:50 +01:00
lewtun
43f52224db
Merge pull request #14 from sebastianschramm/ses/fix_typos_zephyr_recipe
...
Resolves #13 fix typo in zephyr recipe readme
2023-11-10 17:06:35 +01:00
Sebastian Schramm
d48a4a477b
Resolves #13 fix typo in zephyr recipe readme
2023-11-10 16:57:39 +01:00
lewtun
e4f98e7d8f
Merge pull request #11 from huggingface/zephyr-recipe
...
Code release
2023-11-10 15:54:13 +01:00
lewtun
363e29ff95
Apply suggestions from code review
...
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
2023-11-10 15:49:00 +01:00
Lewis Tunstall
f5e70fbf9e
Add licenses
2023-11-10 14:47:54 +00:00
Lewis Tunstall
5a630a1989
Add QLoRA command
2023-11-10 13:57:52 +00:00
Lewis Tunstall
e2e8ab945d
Refactor imports
2023-11-10 13:38:45 +00:00
edbeeching
7f1a14e0d4
adds auto adapter merge to dpo script
2023-11-10 14:15:44 +01:00
Lewis Tunstall
54185783e0
Remove QLoRa for now
2023-11-10 11:20:39 +00:00
Lewis Tunstall
edf67d1d93
Tweaks
2023-11-10 11:15:45 +00:00
Lewis Tunstall
551f901f95
Fix dep
2023-11-10 11:02:44 +00:00
Lewis Tunstall
a0b8d49424
Rename recipe
2023-11-10 10:49:13 +00:00
Lewis Tunstall
64f1834e01
Add config tests
2023-11-10 10:00:05 +00:00
Lewis Tunstall
8699f47bf3
Add jinja2 to req deps
2023-11-10 09:45:22 +00:00
lewtun
b1b0c1c8c0
Update setup.py
...
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
2023-11-10 10:44:06 +01:00
Lewis Tunstall
2ed5a45d25
Add model utils tests
2023-11-10 09:42:15 +00:00
Lewis Tunstall
0af8011993
Bump deps
2023-11-10 08:41:17 +00:00
Lewis Tunstall
610a1a2de4
Add unit tests for data mixer
2023-11-10 08:37:53 +00:00
edbeeching
0f0b61c096
ups lora bs x grad_acc to 64
2023-11-10 09:30:54 +01:00
edbeeching
13141a4b0b
adds updated model paths, adds eval to sft scripts
2023-11-10 09:26:39 +01:00
Lewis Tunstall
4b0769d137
Fix links
2023-11-09 14:42:57 +00:00
Lewis Tunstall
89f58a043c
Add project structure
2023-11-09 14:40:23 +00:00
Lewis Tunstall
44b324487d
Bump bs
2023-11-09 14:20:43 +00:00
Lewis Tunstall
756bb76d22
Fix Slurm opts
2023-11-09 14:09:52 +00:00
Lewis Tunstall
33a0ce3afd
Add more doc
2023-11-09 13:39:03 +00:00
edbeeching
3a5430222e
removes need for yq dep
2023-11-09 13:04:34 +01:00
edbeeching
49da3ef739
adds configs and instructions for lora training
2023-11-09 10:56:25 +01:00
Lewis Tunstall
2de17f5ba1
Add doc
2023-11-09 07:32:24 +00:00
Lewis Tunstall
e2c19a0252
Tweak
2023-11-08 23:09:16 +00:00
Lewis Tunstall
ee10c4efd9
Make DPO work!
2023-11-08 22:58:34 +00:00
Lewis Tunstall
e54e095978
Make it work for realz
2023-11-08 22:20:17 +00:00
Lewis Tunstall
d2900adc83
Make it work!
2023-11-08 16:31:57 +00:00
Lewis Tunstall
967eab4cfb
Add skeleton
2023-11-08 13:21:57 +00:00
Lewis Tunstall
b9d9aa0a29
Fix style
2023-10-30 10:00:43 +01:00
lewtun
3d8570af1e
Update README.md
2023-10-26 23:21:53 +02:00
Lewis Tunstall
da5dfbe9b6
Fix tests
2023-10-26 17:50:31 +00:00
Lewis Tunstall
a28b4cfc6e
Bump dev version
2023-10-26 10:17:28 +00:00
Lewis Tunstall
1ca8add5fa
Add release details
2023-10-26 10:10:57 +00:00
Lewis Tunstall
87033c09b3
Add tests folder
2023-10-26 09:54:20 +00:00
Lewis Tunstall
1bde6a7931
Add doc builder
2023-10-26 09:41:04 +00:00