mirror of https://github.com/wassname/Open-Assistant.git
synced 2026-07-04 17:20:19 +08:00
222059b1b2
Add Anthropic RLHF dataset & DeepSpeed support for reward model