Files
evil_MoE/scripts
wassname 4c9071cca0 A5: build held-out-mode (hack,clean) pairs from student rollouts
scripts/pairs_from_rollouts.py mirrors pairs_from_pool but sources the
student's own rollouts.jsonl and splits hack/clean by env_mode+exploited
(the per-mode weak detector). Same-prompt pairing, asserts prompt equality.
Smoke-validated: parse + classify + loud-fail paths green on smoke rollouts
(0 hacks -> 0 pairs, as expected). Unblocks A5 once job 95 harvest lands.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-03 00:59:07 +00:00
..