Mirror of https://github.com/wassname/Open-Assistant.git
synced 2026-07-01 16:50:12 +08:00
222059b1b2
Add Anthropic RLHF dataset & DeepSpeed support for reward model