feat: per-mode deploy JSON artifact for every arm + queue-substrate recipe

#164: the final eval now runs for ALL arms (not just route/route2) on the
same fixed eval subset, so the all-arms overlay reads identical per-mode
numbers. vanilla/erase have no quarantine -> deploy == train (one eval);
route/route2 also run the knob-off (ablated) eval. Writes a single
per_mode_deploy.json into run_dir (arm, mask, refresh, seed + per-mode
train/deploy hack+solve) as the canonical source for the #162 overlay plot.

justfile: replace the parametrized run-substrate (which re-passed seed/steps/
refresh/mask defaults every invocation) with one explicit queue-substrate that
queues the fixed 5-arm overlay set, each arm passing ONLY its non-default flags.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
This commit is contained in:
wassname
2026-05-31 14:10:20 +00:00
parent dd3b5af3db
commit 6b22dc5055
2 changed files with 61 additions and 26 deletions
+11 -4
View File
@@ -165,10 +165,17 @@ build-substrate MODES="run_tests,exit_code,sentinel":
# (per-mode hacks>0 + finite first_step) + the per-step hk_<mode> columns. mix=0.125
# is the locked default (omit to inherit it). Vanilla needs no v_hack; for an
# erase/route substrate run, add --v-hack-path explicitly.
run-substrate INTERV="none" SEED="41" STEPS="60" REFRESH="5" MASK="act":
{{ TRAIN }} fast --intervention={{ INTERV }} \
--vhack-refresh-every={{ REFRESH }} --route2-mask={{ MASK }} \
--seed={{ SEED }} --steps={{ STEPS }} --out-tag=_sub4_{{ INTERV }}_{{ MASK }}_rf{{ REFRESH }}_s{{ SEED }}
# Queue the full 5-arm substrate overlay sweep (the all-arms per-mode deploy plot,
# #162). The arm set is FIXED -- no params, no defaults repeated. seed/steps/refresh/
# mask all inherit FastConfig defaults (seed41 steps60 rf5 mask=act); each arm passes
# ONLY what differs from default (route2-grad: --route2-mask=grad). out-tag distinguishes
# the runs for the plot glob. Every arm emits out/runs/<ts>_<tag>/per_mode_deploy.json.
queue-substrate:
pueue add -w "$PWD" -o 5 -l "why: vanilla emergence reference (4-mode substrate); resolve: per-mode deploy-hack baseline for the overlay" -- {{ TRAIN }} fast --intervention=none --out-tag=_sub4_vanilla
pueue add -w "$PWD" -o 5 -l "why: erase arm (one-sided projection); resolve: per-mode deploy hack vs vanilla at matched solve" -- {{ TRAIN }} fast --intervention=erase --out-tag=_sub4_erase
pueue add -w "$PWD" -o 5 -l "why: route arm (shared-basis quarantine, rf5); resolve: deploy hack on held-out modes vs vanilla at matched solve" -- {{ TRAIN }} fast --intervention=route --out-tag=_sub4_route
pueue add -w "$PWD" -o 5 -l "why: route2 act-mask (distinct-basis quarantine); resolve: held-out deploy hack suppressed vs vanilla at matched solve" -- {{ TRAIN }} fast --intervention=route2 --out-tag=_sub4_route2_act
pueue add -w "$PWD" -o 5 -l "why: route2 grad-mask (distinct-basis quarantine); resolve: held-out deploy hack suppressed vs vanilla at matched solve" -- {{ TRAIN }} fast --intervention=route2 --route2-mask=grad --out-tag=_sub4_route2_grad
# CANONICAL plotting entrypoint for the substrate sweep. One command, four figures
# (per-mode by-method + by-hack, and the aggregate "total hacks per arm" + overlay,