mirror of https://github.com/wassname/Open-Assistant.git
synced 2026-07-05 17:30:48 +08:00
222059b1b2
Add Anthropic RLHF dataset & DeepSpeed support for reward model