Train using supervised examples
Requirements
wandb
evaluate
datasets
transformers
torch
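A quick way to confirm the packages listed above are importable before launching a run is sketched below. This is only a convenience check, not part of the repository; the helper name `missing_packages` is made up for illustration.

```python
from importlib.util import find_spec

# Package names copied from the Requirements list above.
REQUIRED = ["wandb", "evaluate", "datasets", "transformers", "torch"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Anything reported as missing can then be installed with your usual package manager.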
Start training
python trainer.py --configs defaults galactica-125
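The `--configs defaults galactica-125` argument reads as a list of named config sections applied in order, with later sections overriding earlier ones. The sketch below illustrates that idea only; it is an assumption about how `trainer.py` behaves, and the section contents, keys, and the `merge_configs` and `parse_args` helpers are hypothetical.

```python
import argparse

# Hypothetical config sections for illustration; the real keys and values
# live in the repository's own config file and may differ.
CONFIGS = {
    "defaults": {"learning_rate": 1e-5, "batch_size": 8, "model_name": None},
    "galactica-125": {"model_name": "facebook/galactica-125m", "batch_size": 16},
}


def merge_configs(names, registry=CONFIGS):
    """Merge the named sections left to right; later names override earlier ones."""
    merged = {}
    for name in names:
        if name not in registry:
            raise KeyError(f"unknown config section: {name}")
        merged.update(registry[name])
    return merged


def parse_args(argv=None):
    """Parse a --configs flag that accepts one or more section names."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--configs", nargs="+", default=["defaults"])
    return parser.parse_args(argv)


if __name__ == "__main__":
    args = parse_args()
    print(merge_configs(args.configs))
```

Under this reading, `defaults` supplies the shared hyperparameters and `galactica-125` only needs to list what differs for that model.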
Dataset
For now we only support the webgpt and summary datasets from OpenAI. Once the Open-Assistant dataset is available, it will be added here.
Model
TBD
Results
Experimental results in wandb here.
TODOS
- Decide on a model
- Merge utils etc with reward model
- Check whether causal language modelling for GPT-JT fails to leverage the bidirectional attention mask over the prompt (https://huggingface.co/togethercomputer/GPT-JT-6B-v1)
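To make the last item concrete, here is a minimal sketch of the difference between a plain causal mask and a prefix-style mask in which prompt tokens attend to each other bidirectionally. It is written in pure Python for illustration and is not taken from this repository or from the GPT-JT code; the function names are hypothetical, and `mask[i][j]` is `True` when position `i` may attend to position `j`.

```python
def causal_mask(seq_len):
    """Standard causal mask: each position sees itself and earlier positions."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def prefix_lm_mask(seq_len, prompt_len):
    """Prefix-style mask: the first prompt_len positions are visible to every
    position (so the prompt is bidirectional), and the rest stay causal."""
    return [
        [j <= i or j < prompt_len for j in range(seq_len)]
        for i in range(seq_len)
    ]


if __name__ == "__main__":
    # Two prompt tokens followed by two completion tokens.
    for row in prefix_lm_mask(4, 2):
        print("".join("1" if allowed else "0" for allowed in row))
```

If the trainer only ever builds the first kind of mask, a model trained with bidirectional prompt attention would see its prompts differently during fine-tuning than during its original training, which is the concern the TODO raises.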