Mirror of https://github.com/wassname/evil_MoE.git (synced 2026-06-27 17:15:58 +08:00)
0973f9ba7c
The cross-scale comparison (their converged full-env vs. our 60-step fast surrogate) made the paper comparison directional-only and unfair on one axis. Show vanilla GRPO as the red floor anchor instead; the paper numbers stay in the extracted table.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>