Caption now states v is run_tests-only, teacher is run_tests-only, held-out
modes have hacked_E=0 so the gate is blind, they emerge on knob-on but deploy~0,
and the placebo caveat (suppression is the direction-agnostic quarantine, not v
specificity). Bar plot tags invisible zero-height bars with ≡0.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Groups per_mode_deploy.json by arm into a list, plots mean+/-std across seeds.
At n=1 (current A5: seed 41 only) no bar appears; TODO in code points at the
queued a5 seeds 42/43 (jobs 107-110) that will populate it. Bar labels show n.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Fill route2 column of tab:generalisation from job 104 per_mode_deploy.json;
regen A5 figure (add routing2 arm key to plot_deploy_overlay). All three
held-out modes drop near zero at knob-off deploy while emerging on the
knob-on path -- routing, not non-emergence. #185.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Audit of all 4 plot scripts (plot_dynamics/substrate/emergence/deploy_overlay):
- One save_fig(fig, path) helper in figs.py writes png+svg+pdf (vector for the
paper, png for the blog). All scripts call it.
- arm_label() map: reader-facing names only -- route2->route, drop 'knob'/'the
cheat' from titles and the train-vs-deploy story (adapter on/off, reward hack).
- Titles off by default (the paper/blog caption carries it); --title re-enables
for standalone research use.
- dump_data CSV now carries every plotted series; plot_dynamics --from-csv
re-renders the three figures from the committed CSV with no logs (logs/ and
out/runs/ are gitignored; out/figs/*.csv is tracked). Round-trip verified.
- Commit the regenerated dyn_sub4 figures in all 3 formats + the CSV.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
route2-act diverged (run 43): 33M kaiming A_q/B_q at delta_S's lr=3e-3 blew up
(gn 0.3->7.5 step 8, generations -> token salad, lp_t -11). Fixes:
- #167 separate quarantine lr (route2_quar_lr_scale=0.1) so the 60x-bigger fresh
LoRA isn't trained at the main-knob lr.
- #168 divergence tripwire on teacher ppl (lp_t high-water mark; abort if it
drops >5 nats for 2 steps). Relative so tiny-random smoke (flat lp_t~-11.9)
doesn't false-trip.
- #165 act-path was silent: stash cos(a,v_act) + fired-fraction in the forward,
surface as act_cos/act_fire columns (route2-act). smoke shows act_fire=0.64 =>
the cos>0 sign test over-routes (fires on most tokens, not just hack ones).
- #166 print last train generation before FINAL EVAL (coherence eyeball).
- route2 v_act/v_grad refresh was firing but silent -- now announced.
- #162 plot_deploy_overlay.py: per-mode DEPLOY overlay from per_mode_deploy.json
(honest shipped-model numbers, route2-safe). just plot-deploy.
- just plot/results hardened: parse by header name, skip non-substrate logs,
non-fatal aggregate delegation.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>