mirror of https://github.com/wassname/Open-Assistant.git
synced 2026-07-03 17:10:10 +08:00
222059b1b2
Add Anthropic RLHF dataset & DeepSpeed support for reward model