9 Commits

Author          SHA1        Message  Date
theblackcat102  bcebbbc49c  [merge] Fix conflict  2023-02-11 00:23:25 +00:00
jack.butler     24b07523aa  add test for tokenizer matching behaviour  2023-02-10 09:47:20 +00:00
jack.butler     090c5cbcc2  fix tokenizer matching and add tests  2023-02-09 18:47:38 +00:00
theblackcat102  1041564db7  [feature] mix generation from different tasks  2023-02-03 00:15:29 +00:00
theblackcat102  62a203fd8c  [feature] move data formatting into dataset, instead of collator  2023-01-21 03:31:35 +00:00
theblackcat102  aca3e9de89  [fix] wait it pass?  2023-01-20 07:26:26 +00:00
theblackcat102  74cb9aaa5a  [feature] added translation, rallio instruct tuning dataset, prosocial for safety, new summary dataset  2023-01-20 03:02:07 +00:00
theblackcat102  1546111094  [feature] added GSM8k and code refactoring  2023-01-14 06:24:47 +00:00
theblackcat102  3966024871  [fix] Fix summarizer bug and QA typo issue  2023-01-14 05:49:22 +00:00