Files
Open-Assistant/model/supervised_finetuning/README.md
T
Sotirios Anagnostidis c20dfaad5b pre-commits
2023-01-03 22:45:34 +01:00

39 lines
685 B
Markdown

# Train using supervised examples
Requirements
```
wandb
evaluate
datasets
transformers
torch
```
Start training reward model
```bash
python trainer.py --configs defaults galactica-125
```
## Dataset
For now we only support webgpt and summary dataset from OpenAI. Once
open-asisstant dataset are available it will be added here.
## Model
TBD
## Results
Experimental results in wandb
[here](https://wandb.ai/sanagnos/supervised-finetuning?workspace=user-sanagnos).
## TODOS
- decide on a model
- Merge utils etc with reward model
- Casual Modelling for GPT-JT does not leverage the bidirectional mask for the
prompt? (https://huggingface.co/togethercomputer/GPT-JT-6B-v1)