mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 18:04:59 +08:00
53d88bc9ee
External review (Claude + deepseek-v4-pro) converged on the threshold being circular (c_rej > c_cho holds by construction, since vec = mean(g_rej - g_cho)) and scale-mismatched to live rollouts.

Decisions added:
- leave-one-pair-out as the real vec-generalizes diagnostic
- quantile-tau, to match the flagged fraction in the real-vs-random control
- route the vec-component (erase-style), not the whole rollout
- degeneracy diagnostic (hkgap collapse)
- pre-register the science UAT (n>=3 seeds, effect > random-baseline std)

teacher_off_step now defaults to 30 on the base Config, so every arm runs pure on-policy past step 30 (apples-to-apples deploy numbers; job 87 showed hacking self-sustains after the cut).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
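
The circularity argument and two of the decisions above can be illustrated with a small NumPy sketch. This is not the repository's code. It assumes that c is the dot product of a per-sample vector g with vec, which the message does not spell out, and the function names `in_sample_gaps`, `leave_one_pair_out_gaps` and `quantile_tau` are hypothetical. Under that assumption the in-sample mean gap equals ||vec||^2, so it is non-negative even for pure noise, while the leave-one-pair-out gap is only clearly positive when the direction is shared across pairs.

```python
import numpy as np


def in_sample_gaps(g_rej: np.ndarray, g_cho: np.ndarray) -> np.ndarray:
    """Per-pair gap c_rej - c_cho when vec is fit on the same pairs.

    Assumption: c = g @ vec (a plain projection). The mean of these gaps
    is ||vec||^2, hence >= 0 regardless of the data.
    """
    d = g_rej - g_cho              # (n_pairs, dim) per-pair differences
    vec = d.mean(axis=0)           # vec = mean(g_rej - g_cho)
    return d @ vec


def leave_one_pair_out_gaps(g_rej: np.ndarray, g_cho: np.ndarray) -> np.ndarray:
    """Per-pair gap where pair i is scored against a vec fit without pair i."""
    d = g_rej - g_cho
    n = d.shape[0]
    # Row i holds the mean of all differences except d[i].
    loo_vecs = (d.sum(axis=0)[None, :] - d) / (n - 1)
    return np.einsum("ij,ij->i", d, loo_vecs)


def quantile_tau(scores: np.ndarray, flagged_fraction: float) -> float:
    """Threshold chosen so that about `flagged_fraction` of scores exceed it."""
    return float(np.quantile(scores, 1.0 - flagged_fraction))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, dim = 32, 256

    # Pure noise: no shared direction between pairs.
    g_rej = rng.normal(size=(n, dim))
    g_cho = rng.normal(size=(n, dim))
    print("noise  in-sample mean gap:", in_sample_gaps(g_rej, g_cho).mean())
    print("noise  LOPO mean gap:     ", leave_one_pair_out_gaps(g_rej, g_cho).mean())

    # Shared direction added to the rejected side of every pair.
    g_rej_sig = g_rej + 0.5
    print("signal LOPO mean gap:     ", leave_one_pair_out_gaps(g_rej_sig, g_cho).mean())

    # Same flagged fraction for a real and a random control, via quantile thresholds.
    scores = rng.normal(size=1000)
    tau = quantile_tau(scores, 0.1)
    print("flagged fraction:", (scores > tau).mean())
```

Matching the flagged fraction with a quantile rather than a fixed tau keeps the real-vs-random comparison from being driven by a difference in score scale.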