mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 19:15:20 +08:00
0ea751c5bc
New scripts/plot_substrate.py parses the hk_<mode> cumulative columns from a multi-loophole substrate run (one log, K interleaved modes) and draws one learning curve per mode with first_step onset dots and direct end-labels. plot_emergence.py can't do this (it groups logs by a single --env-mode).

Figure shows the headline: vanilla GRPO learns file_marker/run_tests/stdout_marker/sentinel, eq_override flat at 0 (never).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
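
For readers without the script at hand, the parse-then-plot flow described above might look roughly like the sketch below. It is not the contents of scripts/plot_substrate.py. The log layout (a whitespace-separated header row with a `step` column and one `hk_<mode>` column per mode), the function names, and the reading of "onset" as the first step whose cumulative count is above zero are all assumptions made for illustration.

```python
def parse_hack_columns(lines):
    """Return {mode: [(step, cumulative_count), ...]} from a log.

    ASSUMED layout (hypothetical): a whitespace-separated header row that
    contains `step` and one `hk_<mode>` column per mode, followed by
    numeric rows. Lines that do not match are skipped.
    """
    header = None
    step_idx = 0
    mode_idx = {}
    curves = {}
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if header is None:
            # Look for the header row before reading any data.
            if "step" in fields and any(f.startswith("hk_") for f in fields):
                header = fields
                step_idx = header.index("step")
                mode_idx = {f[3:]: i for i, f in enumerate(header)
                            if f.startswith("hk_")}
                curves = {mode: [] for mode in mode_idx}
            continue
        if len(fields) != len(header):
            continue  # not a data row
        try:
            step = int(fields[step_idx])
            row = {mode: float(fields[i]) for mode, i in mode_idx.items()}
        except ValueError:
            continue  # non-numeric line interleaved in the log
        for mode, value in row.items():
            curves[mode].append((step, value))
    return curves


def first_onset(points):
    """First step whose cumulative count is above zero, else None (assumed definition)."""
    for step, value in points:
        if value > 0:
            return step
    return None


def plot_curves(curves, out_path):
    """Draw one curve per mode, with an onset dot and a label at the curve's end."""
    import matplotlib
    matplotlib.use("Agg")  # render to file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for mode, points in curves.items():
        if not points:
            continue
        xs, ys = zip(*points)
        (line,) = ax.plot(xs, ys)
        onset = first_onset(points)
        if onset is not None:
            ax.plot([onset], [dict(points)[onset]], "o", color=line.get_color())
        # Direct end-label instead of a legend.
        ax.annotate(mode, (xs[-1], ys[-1]), xytext=(4, 0),
                    textcoords="offset points", va="center",
                    color=line.get_color())
    ax.set_xlabel("step")
    ax.set_ylabel("cumulative count")
    fig.savefig(out_path)
```

Under this onset definition, a mode that stays flat at 0 gets a curve and an end-label but no onset dot.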