Commit Graph

  • 4a65eedc92 chore: memory updates, diag_pairs_compare script wassname 2026-06-09 02:42:56 +00:00
  • ec88ba3e42 merge: resolve RESEARCH_JOURNAL conflict (keep both HEAD + remote Modal-port entry) wassname 2026-06-09 02:27:08 +00:00
  • 0f59b1351b feat: online_stats gate for routeV -- live q5/q95 band calibration wassname 2026-06-09 02:25:37 +00:00
  • 0412dc56d1 results.md: fix regenerate ref (just results-deploy -> just results) wassname 2026-06-09 01:51:28 +00:00
  • 5007c9757a results: just results = eval2 deploy table (time/headline/deploy/arm/pair/seed/train/argv); hard eval2 cutoff; archive eval1 (Q1-Q13 + 352 old logs) wassname 2026-06-09 01:50:42 +00:00
  • 824b7eb623 results: Q14 complete eval2 deploy table (4 done: per-token/authored/prog_wide/random-V; via just results-deploy). Corrects earlier claim that job8 prog_wide had no eval2 deploy wassname 2026-06-08 23:57:42 +00:00
  • e26f5fe08c results: add Q14 -- routeV deploy on recency-clean eval2 (job 15 in; vanilla/act_vote/lora/random-V pending) wassname 2026-06-08 22:58:34 +00:00
  • c721c460a4 journal: act_vote routes late where grad-cosine cliffs (killed-run partial data) wassname 2026-06-08 22:50:09 +00:00
  • 1cb5900de5 journal: pairs comparison (job 17) -- authored_all precision-best (p@10 0.70 vs prog_wide 0.20) wassname 2026-06-08 19:44:00 +00:00
  • 744d851861 journal: job 15 precision-best authored per-rollout finished, deploy hack 0.076 (vanilla pending) wassname 2026-06-08 19:38:48 +00:00
  • d497bfd161 feat: act_vote routeV gate (global activation-vote routing arm) wassname 2026-06-08 15:08:28 +00:00
  • eedf9efb51 pairs: de-confound v2 (print(==) vs assert, line-matched) + intent designs (think/funcname/concept) wassname 2026-06-08 13:08:47 +00:00
  • 35be877fc0 pairs: v2 (harder/verbose) + --pairs option; NEGATIVE -- better pairs don't close the 0.67->0.84 gap wassname 2026-06-08 11:53:48 +00:00
  • 9c630b83c7 agents: no-cheat #4 (on-distribution pairs = labeling live rollouts = cheating); journal ideal-ceiling tables wassname 2026-06-08 11:39:27 +00:00
  • c29016079a diag: add top15/top05 filters, module-vote (per-space cos*|D_m|), ideal-direction ceiling wassname 2026-06-08 11:35:50 +00:00
  • c0a4e4e060 diag: 3 filter levels (all/keep75/top25); act-cosine improves monotonically (top25 AUROC 0.72, p@10 0.50) wassname 2026-06-08 11:16:48 +00:00
  • 5fe22a1973 journal: separability sweep (act>grad AUROC, grad-cos best tail p@10=0.70, magnitude inverted, distshift root cause) wassname 2026-06-08 11:13:02 +00:00
  • 80e82f0b29 diag: pinning separability sweep (grad/act x cos/proj/mag x filter), AUROC+p@k, notebook wassname 2026-06-08 11:11:55 +00:00
  • b28b1a5e88 results: deploy-eval table (eval2 headline=solve_dep-hack_dep); journal interim read wassname 2026-06-08 10:47:38 +00:00
  • fcac80c4bb journal: random-V control matches real-V at per-rollout (0.101==0.101) -- H2 absorption lead wassname 2026-06-08 08:26:26 +00:00
  • cf05310130 journal: dir6 real-V arms land (margin band) -- both suppress, per-token>per-rollout wassname 2026-06-08 02:08:02 +00:00
  • 826aa911f7 Update README.md wassname (Michael J Clark) 2026-06-08 07:48:19 +08:00
  • 0d22ee6476 writeup: fill contrastive pairs TODO with actual pair examples + loophole hacks wassname 2026-06-08 07:02:38 +08:00
  • 376dccdd7f writeup: add main.qmd (Quarto draft) + nips-template.tex; update human journal wassname 2026-06-08 07:00:54 +08:00
  • 012983fb8d docs: journal entry 2026-06-07 -- Modal routeV deadlock was stdout buffering artifact wassname 2026-06-08 06:50:16 +08:00
  • 34ba631e7d journal: deferred idea -- half-solve teacher pool to decouple off-policy/teacher-forcing confound; first-15-step gating wassname 2026-06-07 22:39:01 +00:00
  • caa0d09472 broad: TEACHER_RT -> dense pool (was sparse, under-seeds); log: rename table cols train/deploy (drop 'knob') wassname 2026-06-07 22:12:00 +00:00
  • 484305d7b4 config+log: fast defaults (dense pool, grad_clip=500); end-of-run tail = argv + hack/solve table + solve-hack objective wassname 2026-06-07 22:05:46 +00:00
  • eeee7db65c journal: routeV margin band (p75/p75) verified routing 28.7% on real 4B; dir6 restarted on it wassname 2026-06-07 14:15:12 +00:00
  • d9ea20baa4 routeV: margin (p75 clean / p75 hack) routing band, route the confident tail wassname 2026-06-07 13:42:20 +00:00
  • 25ac3fc5e3 log: routeV routing as keep/resid/rout zones x unit+energy views; drop dead hk_abl/slv_abl wassname 2026-06-07 13:13:01 +00:00
  • b170b969e2 log: surface absolute band edges (mean lower/upper), not just width wassname 2026-06-07 12:43:34 +00:00
  • 041f9319f9 fix: hkgap legend said 'mean' but band uses max-hack/min-clean (train.py:345) wassname 2026-06-07 12:41:05 +00:00
  • 5fd980244b docs: note SGTM is the latest gradient-routing paper (same authors) wassname 2026-06-07 11:56:58 +00:00
  • 637f9388c8 docs: cite SGTM paper in AGENTS.md (absorption/leakage vocab source) wassname 2026-06-07 11:40:40 +00:00
  • c449273357 log: rename routeV gauges to paper vocab (qE->absorb, resid->leak), drop 'FREE' aside wassname 2026-06-07 11:25:27 +00:00
  • 3200771042 fix: dense run_tests teacher pool (6 -> 215 prompts) so the hack seeds in 60 steps wassname 2026-06-07 10:54:32 +00:00
  • 89eaa0866b paper: record in-sample teacher-seeding method in setup section wassname 2026-06-07 10:37:28 +00:00
  • 52619519dc docs: drop dead refs (spec.md link, verify_gate_anchor.py paragraph) wassname 2026-06-07 10:20:27 +00:00
  • 1228e1b784 refactor: drop shadowed-import + duplicate-definition cruft (-91 LOC) wassname 2026-06-07 08:57:07 +00:00
  • 15a796c542 chore: gitignore modal/results; point AFK_CHECK at requeued task #1 wassname 2026-06-07 08:44:05 +00:00
  • cc8db051ab fix: seeded-shuffle train pool (was first-200-by-id = easy/memorized); add queue-dir6/queue-broad recipes wassname 2026-06-07 08:27:39 +00:00
  • ea01267cd8 fix: eval on paper test set, not contaminated holdout (base solve 0.94->0.094) wassname 2026-06-07 08:18:31 +00:00
  • a776db0ec0 vscode: drop peacock color customizations block wassname 2026-06-07 11:12:03 +08:00
  • 7da54f1967 eval+env: single-mode run_tests, held-out val/test eval, both hack metrics wassname 2026-06-07 03:07:14 +00:00
  • 7195d19f90 docs wassname 2026-06-07 02:04:27 +00:00
  • 5419771d70 modal: there was no routeV hang -- it was local stdout buffering wassname 2026-06-07 10:39:41 +08:00
  • d96367ca5d modal: mount leetcode data from image; correct 2873b37 hang claim wassname 2026-06-07 09:45:17 +08:00
  • 2873b37842 modal: flash_attention_2 + transformers==5.10.2, drop sdpa workaround wassname 2026-06-07 08:41:11 +08:00
  • 54a4298a35 modal: pin transformers to released >=5.8.0 (no floating @ main) wassname 2026-06-07 08:14:22 +08:00
  • 2f91561269 modal/train: VGROUT_ATTN attn-impl override (NOT a fix for the modal hang) wassname 2026-06-06 16:42:12 +00:00
  • 98ceb38815 modal: rename launch entrypoint main->fanout (collides with app.py::main) wassname 2026-06-06 14:09:35 +00:00
  • 6567f6c60a modal: launch.py -> 15-run v2 keynote set (5 arms x seeds 42/41/43) wassname 2026-06-06 14:07:47 +00:00
  • a3ac381724 memory: correct pi --mode json gotcha (blocks on stdin, fix is </dev/null) wassname 2026-06-06 13:48:59 +00:00
  • b8efd42d2f eval: train/test token gap for all 4 modes (lenient disjoint families) wassname 2026-06-06 13:48:59 +00:00
  • dcd1b18303 eval: train/test token gap for all 4 modes (paper memorization control) wassname 2026-06-06 13:03:37 +00:00
  • ba46e85f55 eval: 1 sample/prompt, periodic 32 distinct, final on whole pool wassname 2026-06-06 12:46:59 +00:00
  • 70aa6aa96b modal: parallel GRPO sweep port (image, volume, fan-out launcher) wassname 2026-06-06 20:30:19 +08:00
  • bcf09dd742 docs wassname 2026-06-06 12:27:26 +00:00
  • 842a373ebc seed periodic deploy eval too (common random numbers, RNG save/restore) wassname 2026-06-06 12:25:25 +00:00
  • 73936c822f rename route2->routeV; heavy seeded final eval; save delta_S_hack wassname 2026-06-06 12:08:28 +00:00
  • 9c76584970 track pairsets in git (hand-authored supervision source) wassname 2026-06-06 08:11:01 +00:00
  • 4b9545c59a spec: route2b is the method, drop erase; workshop = 1 method + vanilla baseline + random-V ablation wassname 2026-06-06 05:20:00 +00:00
  • 69f8bc208d justfile: erase recipes use the prog_wide default (drop pinned --v-hack-path) wassname 2026-06-06 05:10:29 +00:00
  • f22b69d1d3 config: make prog_wide (30 pairs) the default vhack_pairs_path wassname 2026-06-06 05:02:08 +00:00
  • dd922d8793 route2: add per-token routing granularity (route2_per_token), default per-rollout wassname 2026-06-06 04:52:30 +00:00
  • aca045ec99 route2: surface routed-fraction (frout) col + fix stale tau/hkgap legends wassname 2026-06-06 04:48:17 +00:00
  • d159d4c0f2 route2: fail loud if real v_grad band collapses (extraction broken) wassname 2026-06-06 03:35:33 +00:00
  • 485839d7b1 route2: pair-calibrated banded gate, drop live-detector tau + force-route wassname 2026-06-06 03:27:24 +00:00
  • d131323a8d spec: full rewrite as self-contained handoff (main.tex jargon, complete pseudocode) wassname 2026-06-06 03:05:08 +00:00
  • 83cae4ef72 docs: reframe no-cheat in VECTOR terms; move it README->AGENTS.md wassname 2026-06-06 02:39:48 +00:00
  • a83953131e spec: drop live-detector validation; per-rollout granularity (paper-backed) + cheap label-free diagnostics wassname 2026-06-06 02:23:58 +00:00
  • 180d3e862c spec: banded cosine gate (lower/upper from pair clean/hack cosines) + live-A calibration validation wassname 2026-06-06 02:16:38 +00:00
  • 53d88bc9ee spec: fold external-review into pair-routing plan; default teacher_off_step=30 wassname 2026-06-06 01:03:02 +00:00
  • dfdc538428 spec: pair-routing impl plan + resume-after-compaction state wassname 2026-06-06 00:10:23 +00:00
  • 68b0624733 backup: pueue job manifest (94 jobs, id/status/label/argv) at routing-refactor wassname 2026-06-06 00:01:58 +00:00
  • 0fa250b193 handoff: pre-routing-refactor snapshot + diagnosis wassname 2026-06-05 23:58:35 +00:00
  • f82a4f034d paper: interim directionality fig (app:directionality) + confound TODO wassname 2026-06-05 23:40:02 +00:00
  • 329066e99b paper: teacher-off control appendix (app:teacher) -- teacher seeds not sustains wassname 2026-06-05 12:30:49 +00:00
  • ac418a54ce journal: #186 teacher-off vanilla hacking self-sustaining (job 87, 0.36->0.58 on-policy) wassname 2026-06-05 12:07:41 +00:00
  • 6dd6b74e73 afk: lite hourly check (one cron at :23, no deep dive unless broken) wassname 2026-06-05 10:35:58 +00:00
  • 7eac7750dc afk: add docs/AFK_CHECK.md (scopes hourly check to directionality mystery) wassname 2026-06-05 09:46:38 +00:00
  • d2b0fcb255 afk: scope hourly check to directionality mystery (docs/AFK_CHECK.md); drop routine no-finding journal entry (h) wassname 2026-06-05 09:46:24 +00:00
  • 6f60ebafa1 journal (h): AFK check -- no-cheat E-by-mode table re-confirmed on job 95; directionality framing corrected wassname 2026-06-05 09:35:27 +00:00
  • a3a3f09824 retract 'null_city contaminated' framing -> in/out-of-subspace + cosine-is-correlational wassname 2026-06-05 09:21:41 +00:00
  • e5295dc07b feat: route2 Haar-random v_grad directionality control (H2 vs H4) + semantic placebo fleet wassname 2026-06-05 08:43:54 +00:00
  • ec00bc4383 docs: A5 leak is double-hacks (not detector FP); placebo non-directionality measured via hkgap wassname 2026-06-05 08:23:49 +00:00
  • 8249a9691e fix: ship smoke fixtures so the gate runs on a fresh clone wassname 2026-06-05 07:13:33 +00:00
  • 55937a86fb rename python package projected_grpo -> vgrout wassname 2026-06-05 14:51:02 +08:00
  • 03693e4f30 name the method vGROUT (vector gradient routing) wassname 2026-06-05 14:45:11 +08:00
  • 07e1eb8753 paper: fix build, vector figs, +2 plots, de-jargon prose wassname 2026-06-05 14:25:03 +08:00
  • 04562c5226 doc: fix stale tab:ablation provenance — random-V is job 106 not 87 wassname 2026-06-05 05:59:28 +00:00
  • 08ed96292f fig: point keynote includegraphics at tracked out/figs PNG (drop gitignored symlink) wassname 2026-06-05 05:20:55 +00:00
  • 3ae1e8376d journal: close (a) WATCH — placebo endpoint refutes route directionality (job 86) wassname 2026-06-05 05:01:18 +00:00
  • 273c9ae4aa Merge branch 'probe/distill-cosine' of https://github.com/wassname/projected_grpo into probe/distill-cosine wassname 2026-06-05 04:52:47 +00:00
  • 562832acec test: no-cheat partition + teacher-pool composition gate (verify_partition.py) wassname 2026-06-05 04:36:03 +00:00
  • 5242f66b7e figs: a5 dedup title->axis arrow + CSV, overlay onset dot->labeled vline wassname 2026-06-05 04:13:37 +00:00
  • 8daf58d25e figs: a5 vanilla->route arrows, equiv0->approx0, skip degenerate train_deploy, prune orphans wassname 2026-06-05 04:08:58 +00:00
  • f0cbbacaf0 save per-eval deploy-adapter ckpts (rescore w/o retrain) + CLAUDE.md test lesson wassname 2026-06-05 03:58:26 +00:00
  • 7b08a7ede9 journal: A5 gate leak fixed (teacher-only anchor) + airtight rerun queued (job 111) wassname 2026-06-05 03:54:09 +00:00