fix formatting to make linter happy

2026-06-29 16:30:24 +08:00 · 2023-01-03 21:22:54 +01:00
parent 1750cecb54
commit 35cc68d7f7
1 changed file with 7 additions and 9 deletions
@@ -2,7 +2,6 @@

 For discussion about usage of supervised data see issue <https://github.com/LAION-AI/Open-Assistant/issues/186>.

-
 ## Motivation

 An important part of making the assistant useful is to teach it to understand and follow instructions, and to perform large set of tasks well.
@@ -20,6 +19,7 @@ There are also supervised dialog datasets such as Blended Skill Talk or SODA. In
 is not as focused on "academic tasks" or correctness, but encourage the model to respond naturally like a person would.

 ### Promptsource
+
 - GitHub: <https://github.com/bigscience-workshop/promptsource>
 - paper: [Multitask Prompted Training Enables Zero-Shot Task Generalization](https://arxiv.org/abs/2110.08207)
 - project for preparing templates and working with them
@@ -29,31 +29,29 @@ is not as focused on "academic tasks" or correctness, but encourage the model to
  - <https://huggingface.co/datasets/bigscience/xP3mt> (with multilingual data and machine-translated prompt)
 - they trained zero-shot models (= models for following instructions in the input)
  - based on T5 architecture (encoder-decoder) called T0 family (and MT0 for multilingual)
-  - and based on GPT architecture (decoder-only) called BloomZ family 
-  - Huggingface demo: [T0](https://huggingface.co/bigscience/T0pp), [MT0](https://huggingface.co/bigscience/mt0-large), [BloomZ](https://huggingface.co/bigscience/bloomz), 
+  - and based on GPT architecture (decoder-only) called BloomZ family
+  - Huggingface demo: [T0](https://huggingface.co/bigscience/T0pp), [MT0](https://huggingface.co/bigscience/mt0-large), [BloomZ](https://huggingface.co/bigscience/bloomz),
  - GitHub repo for T0: <https://github.com/bigscience-workshop/t-zero>
  - GitHub repo for BloomZ and MT0: <https://github.com/bigscience-workshop/xmtf>

-
 ### Natural instructions
+
 - GitHub: <https://github.com/allenai/natural-instructions>
 - paper: [Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks](https://arxiv.org/abs/2204.07705)
 - they crowdsource directly the data prepared for instruction following (and learning from a few examples)
-- the GitHub repo = the dataset. It contains jsons 
+- the GitHub repo = the dataset. It contains jsons
 - they trained zero-shot and in-context few-shot models (in multiple sizes):
  - mT5 architecture (encoder-decoder, multilingual pretraining)
  - Huggingface demo few-shot: <https://huggingface.co/allenai/tk-instruct-3b-def-pos>
  - Huggingface demo zero-shot: <https://huggingface.co/allenai/tk-instruct-3b-def>

-
 ### Blended Skill Talk
+
 - used by Facebook in Blenderbot project
 - HuggingFace dataset: <https://huggingface.co/datasets/blended_skill_talk>
 - example model trained on it: <https://huggingface.co/facebook/blenderbot_small-90M>

-
 ### SODA
+
 - GitHub: <https://github.com/skywalker023/sodaverse>
 - paper: <https://arxiv.org/abs/2212.10465>
-
-