Commit Graph

23 Commits

| Author | SHA1 | Message | Date |
|---|---|---|---|
| theblackcat102 | bcebbbc49c | [merge] Fix conflict | 2023-02-11 00:23:25 +00:00 |
| sanagnos | 4ba622de8e | Merge branch 'main' into sft-data-sampling | 2023-02-09 09:19:17 +01:00 |
| theblackcat102 | a1b90bf981 | Merge branch 'main' into add-dataset | 2023-02-09 01:28:42 +00:00 |
| Mark Worrall | 9faae250ce | minor tidy-up | 2023-02-08 20:57:13 +00:00 |
| Mark Worrall | 283df8ec84 | Get working on multi-gpu | 2023-02-08 20:49:25 +00:00 |
| Mark Worrall | e2caf53654 | First version of single GPU sampling working | 2023-02-08 08:01:04 +00:00 |
| hyunwoongko | cb722768f7 | Apply pre-commit | 2023-02-08 03:50:27 +09:00 |
| hyunwoongko | 44c555cad1 | Add gelu fusion | 2023-02-08 03:45:22 +09:00 |
| theblackcat102 | a39cbab524 | [fix] transformers import error | 2023-02-07 01:26:28 +00:00 |
| theblackcat102 | 8b2080559c | [fix] Custom collate_fn for training | 2023-02-03 06:08:01 +00:00 |
| Sotirios Anagnostidis | c8f47eef9f | precommits | 2023-01-11 22:58:17 +01:00 |
| Sotirios Anagnostidis | d46ff8c4ee | better logging with deepspeed | 2023-01-11 22:48:02 +01:00 |
| Sotirios Anagnostidis | 6438fdbe2c | quantization from #582 | 2023-01-11 22:44:20 +01:00 |
| Sotirios Anagnostidis | 4a3ea0b033 | refactoring, now running | 2023-01-11 22:42:04 +01:00 |
| ekurtulus | 5b77dd2e9f | better | 2023-01-11 11:37:27 +03:00 |
| mrcabbage972 | 67aeed2cd7 | Adding override of 32-bit optimization for embedding layer | 2023-01-09 23:03:29 -05:00 |
| mrcabbage972 | 08bdadf222 | Adding BNB 8-bit Adam | 2023-01-09 22:07:06 -05:00 |
| Sotirios Anagnostidis | d3952354e2 | pre commits | 2023-01-06 22:09:24 +01:00 |
| Sotirios Anagnostidis | 88ee3b3264 | merge deepspeed | 2023-01-06 21:28:26 +01:00 |
| Sotirios Anagnostidis | ef02693ac9 | quantization | 2023-01-06 18:24:28 +01:00 |
| Sotirios Anagnostidis | dfaa00dccc | gptj 8bit | 2023-01-05 00:33:16 +01:00 |
| Sotirios Anagnostidis | 3a10e9412d | Question-Answer special tokens | 2023-01-03 22:02:32 +01:00 |
| Sotirios Anagnostidis | 675b0866b0 | SFT training | 2023-01-03 01:31:56 +01:00 |