| theblackcat102 | 670be60ca8 | [fix] Fix config typo | 2023-01-14 12:17:58 +00:00 |
| theblackcat102 | 6f6c590e57 | [fix] Disable task specific evaluation | 2023-01-14 06:47:21 +00:00 |
| theblackcat102 | 1546111094 | [feature] added GSM8k and code refactoring | 2023-01-14 06:24:47 +00:00 |
| theblackcat102 | 3966024871 | [fix] Fix summarizer bug and QA typo issue | 2023-01-14 05:49:22 +00:00 |
| theblackcat102 | 9451aff6cc | [fix] @ekurtulus major logic bug in summarization | 2023-01-14 03:49:19 +00:00 |
| Sotirios Anagnostidis | c8f47eef9f | precommits | 2023-01-11 22:58:17 +01:00 |
| Sotirios Anagnostidis | d46ff8c4ee | better logging with deepspeed | 2023-01-11 22:48:02 +01:00 |
| Sotirios Anagnostidis | 6438fdbe2c | quantization from #582 | 2023-01-11 22:44:20 +01:00 |
| Sotirios Anagnostidis | 4a3ea0b033 | refactoring, now running | 2023-01-11 22:42:04 +01:00 |
| ekurtulus | 5b77dd2e9f | better | 2023-01-11 11:37:27 +03:00 |
| mrcabbage972 | d95c741ea0 | Fixing requirements file | 2023-01-10 20:16:02 -05:00 |
| mrcabbage972 | 67aeed2cd7 | Adding override of 32-bit optimization for embedding layer | 2023-01-09 23:03:29 -05:00 |
| mrcabbage972 | 08bdadf222 | Adding BNB 8-bit Adam | 2023-01-09 22:07:06 -05:00 |
| theblackcat102 | a1e1445de9 | [fix] evaluation dataset is incorrect in reward-model/trainer.py | 2023-01-08 03:42:40 +00:00 |
| theblackcat102 | 9d05b73efc | [fix] syntax error and some typing issue in py38 | 2023-01-08 02:15:08 +00:00 |
| theblackcat102 | 116045915e | [fix] syntax error and some typing issue in py38 | 2023-01-08 02:14:49 +00:00 |
| theblackcat102 | 9eb401c633 | [fix] resolve conflict from main | 2023-01-08 01:32:39 +00:00 |
| Szymon Ożóg | f304921bd5 | Added deberta configs | 2023-01-07 16:36:55 +01:00 |
| Szymon Ożóg | d0942a3256 | Added option to use cosine scheduler | 2023-01-07 16:36:38 +01:00 |
| theblackcat102 | 3625f39948 | [feature] Add GPTJ synthetic dataset, fix reference removal regex for webgpt | 2023-01-07 01:36:27 +00:00 |
| Sotirios Anagnostidis | d3952354e2 | pre commits | 2023-01-06 22:09:24 +01:00 |
| Sotirios Anagnostidis | 148244455c | refactor | 2023-01-06 21:29:38 +01:00 |
| Sotirios Anagnostidis | 88ee3b3264 | merge deepspeed | 2023-01-06 21:28:26 +01:00 |
| Sotirios Anagnostidis | f2b125cbe3 | merge | 2023-01-06 21:24:36 +01:00 |
| Sotirios Anagnostidis | 91853753a8 | conf | 2023-01-06 18:25:20 +01:00 |
| Sotirios Anagnostidis | ef02693ac9 | quantization | 2023-01-06 18:24:28 +01:00 |
| theblackcat102 | 50e7472ae6 | [fix] push fix by linter | 2023-01-06 17:14:19 +00:00 |
| theblackcat102 | 577b14a702 | [fix] push fix by linter | 2023-01-06 17:11:06 +00:00 |
| theblackcat102 | fc6eab9edc | [fix] new code complete answer and update readme for clarity | 2023-01-06 17:05:47 +00:00 |
| theblackcat102 | 8e30b419bf | [fix] new code complete answer | 2023-01-06 16:54:46 +00:00 |
| theblackcat102 | b67181776a | [feature] add deepspeed, rallio dialogue dataset and codegen parameters | 2023-01-06 16:47:58 +00:00 |
| theblackcat102 | 325c97857c | Merge pull request #313 from bth5032/bth5032/78-blackcat-trainer<br>Bth5032/78 blackcat trainer | 2023-01-05 10:50:04 +08:00 |
| theblackcat102 | 93b2be918e | Merge pull request #347 from LAION-AI/sft-gptjt-qa-labels<br>Sft gptjt qa labels | 2023-01-05 10:49:51 +08:00 |
| Bobak Hashemi | 061d621953 | removed old precommit pragma requirement | 2023-01-04 20:51:19 -05:00 |
| Sotirios Anagnostidis | dfaa00dccc | gptj 8bit | 2023-01-05 00:33:16 +01:00 |
| Bobak Hashemi | da79aa04a0 | Cleaned up default argument logic. | 2023-01-03 21:45:16 -05:00 |
| Bobak Hashemi | 4569bcf354 | fixed linting | 2023-01-03 20:47:33 -05:00 |
| theblackcat102 | f7bd22246e | [fix] rename old summary human feedback dataset to new one | 2023-01-04 01:00:37 +00:00 |
| Sotirios Anagnostidis | c20dfaad5b | pre-commits | 2023-01-03 22:45:34 +01:00 |
| Sotirios Anagnostidis | 525e6964e8 | requirements | 2023-01-03 22:06:59 +01:00 |
| Sotirios Anagnostidis | 3a10e9412d | Question-Answer special tokens | 2023-01-03 22:02:32 +01:00 |
| Bobak Hashemi | 45c147362e | added precommit hooks and cleaned up configs for rankgen | 2023-01-03 01:41:45 -05:00 |
| Bobak Hashemi | 568a42066a | FP32 Training Works | 2023-01-03 00:53:07 -05:00 |
| Sotirios Anagnostidis | 675b0866b0 | SFT training | 2023-01-03 01:31:56 +01:00 |
| Bobak Hashemi | 34ab948ade | testing rankgen integration into instructor trainer | 2023-01-01 23:30:12 -05:00 |
| Gareth Davidson | 7000e10bc0 | apply pre-commit rules | 2023-01-02 00:01:45 +00:00 |
| Yannic Kilcher | 4841550cd4 | Merge pull request #212 from bitplane/prettier-markdown<br>Format markdown with prettier --prose-wrap=always | 2023-01-01 22:11:11 +01:00 |
| Gareth Davidson | c3c7a1701a | run prettier with new params | 2023-01-01 20:57:35 +00:00 |
| Alexander Goryunov | e871de693c | A typo in import | 2023-01-01 21:52:38 +02:00 |
| theblackcat102 | 8f0028bc44 | [fix] Fix provider | 2023-01-01 13:28:48 +00:00 |