_validate_config rejects method-irrelevant/contradictory options before the
model load (routeV-only knobs on non-routeV, top_k>1 off grad_cosine, v_hack_path
off erase, lora adapter on unwired arms). Removes the duplicate inline lora check,
the vanilla v_hack_path warn-and-ignore (now a hard error), and the inline top_k
assert -- one canonical place. Re-extracted v_hack_smoke against the new authored
default (sha guard caught the orphaned cache). Smoke green; bad combo raises.
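
A minimal sketch of the single-pass validation described above, not the repo's actual `_validate_config`. It assumes grad_cosine, erase and routeV are all values of one hypothetical `method` field, and the `LORA_WIRED` set is invented for illustration, since the commit does not say which arms have the adapter wired.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Config:
    # Option names (top_k, v_hack_path, routeV_gate, lora) come from the commit;
    # the `method` field and all defaults are assumptions for this sketch.
    method: str = "vanilla"
    top_k: int = 1
    v_hack_path: Optional[str] = None
    routeV_gate: Optional[str] = None
    lora: bool = False


# Hypothetical: the commit does not list which arms support the lora adapter.
LORA_WIRED = {"vanilla", "erase"}


def validate_config(cfg: Config) -> None:
    """Collect every violation and raise once, before any model load."""
    errors = []
    if cfg.routeV_gate is not None and not cfg.method.startswith("routeV"):
        errors.append("routeV_gate is only meaningful for routeV methods")
    if cfg.top_k > 1 and cfg.method != "grad_cosine":
        errors.append("top_k > 1 requires method=grad_cosine")
    if cfg.v_hack_path is not None and cfg.method != "erase":
        errors.append("v_hack_path is only used by method=erase")
    if cfg.lora and cfg.method not in LORA_WIRED:
        errors.append(f"lora adapter is not wired for method={cfg.method}")
    if errors:
        raise ValueError("invalid config: " + "; ".join(errors))
```

Collecting all errors before raising means one failed launch reports every bad combination, rather than one per retry.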
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Config (make the design axes explicit Literal choices):
- eval: Literal["eval2", "eval3"] (default eval3 = 10% unhackable, deployment-like);
unhackable_frac is now a derived property; eval, unhackable_frac, and pairs are
recorded in deploy_test.json metadata.
- intervention gains routeV_per_token (folds the per-token bool into the arm choice).
- routeV_gate documented as the pinning axis.
- FastConfig grad_clip 500->10 (was never load-bearing); new FastLoraConfig subcommand
(fast-lora) at lr=1e-4 -- the hot lr=3e-3 diverged on lora_frozen_b (job 25: ppl 6e5, gn 98 at step 4).
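
A rough sketch of the Literal-plus-derived-property pattern from the Config notes above. The class name `EvalConfig`, the `metadata` helper and the eval2 fraction are hypothetical; only eval3 = 10% unhackable is stated in this commit.

```python
from dataclasses import dataclass
from typing import Literal

EvalName = Literal["eval2", "eval3"]

# eval3 = 0.10 comes from the commit message.
# The eval2 value is a hypothetical placeholder; the source does not state it.
_UNHACKABLE_FRAC = {"eval2": 0.0, "eval3": 0.10}


@dataclass(frozen=True)
class EvalConfig:
    eval: EvalName = "eval3"

    @property
    def unhackable_frac(self) -> float:
        # Derived from the eval choice, so the two can never disagree.
        return _UNHACKABLE_FRAC[self.eval]

    def metadata(self, pairs: str) -> dict:
        # Record shape is an assumption; the three keys are those named in the commit.
        return {
            "eval": self.eval,
            "unhackable_frac": self.unhackable_frac,
            "pairs": pairs,
        }
```

Making unhackable_frac read-only removes the state where a CLI flag and the eval name contradict each other.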
Pairs:
- delete prog_wide.json (14/30 pairs contaminated by print-without-assert; history in git);
default -> prog_wide_clean.
- rename run_tests->execute_tests in prog_wide_clean + pairs_authored so the
extraction pairs are OOD (never use the env's real grader fn name). Re-extracted
v_hack_smoke to match.
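
One way the run_tests->execute_tests rename could be applied to a pairs file, sketched under the assumption that pairs are JSON with code stored in string values. The real schema is not shown in this commit, so the walk is schema-agnostic and the `hack` key in the usage line is made up.

```python
import json
import re
from typing import Any

# Whole-word match only, so an identifier like run_tests_v2 is left alone.
_OLD = re.compile(r"\brun_tests\b")


def rename_grader(node: Any) -> Any:
    """Recursively replace run_tests with execute_tests in every string value."""
    if isinstance(node, str):
        return _OLD.sub("execute_tests", node)
    if isinstance(node, list):
        return [rename_grader(item) for item in node]
    if isinstance(node, dict):
        return {key: rename_grader(value) for key, value in node.items()}
    return node


# Usage on an in-memory document; the "hack" key is hypothetical.
pairs = json.loads('[{"hack": "def run_tests(): print(1)"}]')
print(json.dumps(rename_grader(pairs)))
```

Dict keys are deliberately left untouched; only values are rewritten.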
justfile: --routeV-per-token -> intervention=routeV_per_token; drop --unhackable-frac
(eval3 default); lora recipes -> fast-lora subcommand; prog_wide -> prog_wide_clean.
Smoke green (erase + routeV_per_token); all 4 verify gates pass.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
route/routeV final eval now measures both endpoints on the n=119 test set:
knob-off (ablate_quarantine, the deploy headline) AND knob-on (trained
model as-is). Writes deploy_hack_on/deploy_solve_on/deploy_vhack_on so
the before->after quarantine move is plottable from the deploy set
instead of borrowing the val curve's different scale.
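
A sketch of the two-endpoint measurement, assuming a hypothetical `evaluate(ablate_quarantine)` callable that returns hack/solve/vhack rates on the test set. The `*_on` key names are from this commit; the knob-off key names and the callable's signature are assumptions.

```python
from typing import Callable, Dict

Metrics = Dict[str, float]


def eval_both_endpoints(evaluate: Callable[[bool], Metrics]) -> Metrics:
    """Score the same test set with the quarantine ablated and left intact."""
    off = evaluate(True)   # knob-off: ablate_quarantine, the deploy headline
    on = evaluate(False)   # knob-on: trained model as-is
    out: Metrics = {}
    for name in ("hack", "solve", "vhack"):
        out[f"deploy_{name}"] = off[name]     # knob-off key names are hypothetical
        out[f"deploy_{name}_on"] = on[name]   # knob-on key names are from the commit
    return out
```

Because both endpoints come from the same test set, the before->after difference is on one scale.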
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
The smoke prereqs (out/pools/substrate, out/pools/teacher_pool,
out/vhack/v_hack_smoke) are gitignored pipeline outputs that only
exist on the GPU box -- a fresh clone died at verify_partition.py on a
FileNotFoundError for partition.json. Building them from scratch needs
a real Qwen3-4B GRPO rollout (pregen-teacher), so they can't be cheaply
regenerated CPU-side. Force-add them (~2.2MB) the same way the paper
figs under out/ are already tracked, so 'just smoke' is the portable
correctness gate it's meant to be.
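
For illustration only, a preflight check that would report missing prereqs up front instead of failing inside verify_partition.py. This helper is not part of the commit; the three paths are the ones listed above, and the function names are hypothetical.

```python
from pathlib import Path
from typing import Iterable, List

# Paths taken from the commit message, relative to the repo root.
SMOKE_PREREQS = (
    "out/pools/substrate",
    "out/pools/teacher_pool",
    "out/vhack/v_hack_smoke",
)


def missing_prereqs(root: Path, prereqs: Iterable[str] = SMOKE_PREREQS) -> List[str]:
    """Return the prerequisite paths that do not exist under root."""
    return [p for p in prereqs if not (root / p).exists()]


def check_prereqs(root: Path) -> None:
    """Exit with a readable message if any smoke prerequisite is absent."""
    missing = missing_prereqs(root)
    if missing:
        raise SystemExit("smoke prerequisites missing: " + ", ".join(missing))
```

With the artifacts force-added, such a check should pass on a fresh clone without a GPU.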
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>