mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 17:30:41 +08:00
6df80ac246
CRIT (gpt-5.4 review): _gt_correct keyed correctness on exit-code-0, so a wrong solution with os._exit(0) (uncatchable, bypasses the SystemExit guard) read gt_correct=True in every mode -- breaking the strict oracle AND non-overlap (a hard-exit hack looked genuinely correct everywhere). Verified the hole, then fixed: correctness now requires REACHING a post-assert sentinel in stdout; any early termination (sys.exit/os._exit/raise) or failing assert skips it. +3 verify cases (os_exit @ exit_code/run_tests/sentinel), 25/25 pass. IMPORTANT: build_substrate greedy round-robin could starve a mode when an even assignment existed -> replaced with exact Kuhn bipartite matching, decrement per_mode until all modes saturate, fail loud otherwise. IMPORTANT: teacher rows stored foolable gt_pass (True on exit/eq exploits) -> inflated teacher gt_t/PASS_RATE. Now store strict gt_correct. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>