mirror of
https://github.com/wassname/Open-Assistant.git
synced 2026-06-27 16:10:30 +08:00
Train using supervised examples
Requirements
wandb
evaluate
datasets
transformers
torch
Start training reward model
python trainer.py --configs defaults galactica-125
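The `--configs` flag takes more than one name. A minimal sketch of how such layered configs could be resolved, assuming the named sections are merged left to right so that later names override earlier ones. The section contents and the `resolve_configs` helper below are hypothetical and not taken from this repository:

```python
# Hypothetical config sections; the real keys and values live in the repo's config file.
CONFIGS = {
    "defaults": {"learning_rate": 1e-5, "batch_size": 8, "model_name": None},
    "galactica-125": {"model_name": "facebook/galactica-125m", "batch_size": 4},
}


def resolve_configs(names, configs=CONFIGS):
    """Merge named sections left to right; later sections override earlier ones."""
    merged = {}
    for name in names:
        if name not in configs:
            raise KeyError(f"unknown config section: {name}")
        merged.update(configs[name])
    return merged


# Mirrors `--configs defaults galactica-125` under the assumption above.
settings = resolve_configs(["defaults", "galactica-125"])
```

Under this assumption, `defaults` supplies the shared values and `galactica-125` only needs to list what differs.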
Dataset
For now we only support the webgpt and summary datasets from OpenAI. Once the Open-Assistant dataset is available, it will be added here.
Model
TBD
Results
Experimental results in wandb here.
TODOS
- Decide on a model
- Add special tokens to mark the prompt and the reply. Do not freeze the weights for these
- Merge utils etc. with the reward model
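The special-token item could look roughly like the sketch below, which wraps each training example in marker strings so the model can tell the prompt from the reply. The marker names `<prompt>` and `<reply>` and the helper functions are assumptions for illustration only; the TODO does not specify them:

```python
# Hypothetical marker strings; the actual token names are still undecided.
PROMPT_TOKEN = "<prompt>"
REPLY_TOKEN = "<reply>"


def format_example(prompt: str, reply: str) -> str:
    """Join a prompt and a reply into one training string with explicit markers."""
    return f"{PROMPT_TOKEN}{prompt.strip()}{REPLY_TOKEN}{reply.strip()}"


def split_example(text: str) -> tuple[str, str]:
    """Recover (prompt, reply) from a string produced by format_example."""
    if not text.startswith(PROMPT_TOKEN) or REPLY_TOKEN not in text:
        raise ValueError("text is missing the prompt/reply markers")
    body = text[len(PROMPT_TOKEN):]
    prompt, reply = body.split(REPLY_TOKEN, 1)
    return prompt, reply


example = format_example("What is 2 + 2?", " 4 ")
```

With a Hugging Face tokenizer, markers like these would typically be registered through `tokenizer.add_special_tokens` and the model's embedding matrix grown with `model.resize_token_embeddings(len(tokenizer))`. The new embedding rows start untrained, which is presumably why the TODO asks to keep them unfrozen.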