mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 16:45:42 +08:00
test: no-cheat partition + teacher-pool composition gate (verify_partition.py)
The other half of the no-cheat family (sibling of the gate-anchor leak). Asserts
on the real out/pools/substrate/partition.json: (1) partition is a clean function
into the 4 distinct substrate modes, each populated; (2) under teacher_modes={run_tests}
the kept teacher pool is ALL known-mode -- held-out modes get ZERO demos and are
genuinely held out (>0 problems). Vibe-check, not a theorem; wired into just smoke.
6/6 pass.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
This commit is contained in:
@@ -28,6 +28,7 @@ results:
|
||||
smoke *ARGS:
|
||||
uv run python scripts/verify_rewards.py # grader gate: 3 env_modes x clean/hack
|
||||
uv run python scripts/verify_gate_anchor.py # route2 no-cheat gate: teacher-only anchor zeroes held-out labels
|
||||
uv run python scripts/verify_partition.py # no-cheat: partition clean + teacher_modes hands gate only known-mode demos
|
||||
BEARTYPE=1 {{ TRAIN }} smoke --intervention=erase \
|
||||
--v-hack-path=out/vhack/v_hack_smoke.safetensors \
|
||||
--teacher-pool-dir=out/pools/teacher_pool --mix-ratio=0.5 {{ ARGS }}
|
||||
|
||||
Reference in New Issue
Block a user