paper: keynote A1/A2 to n=3 (route hack -0.292 vs vanilla, paired p~=0.013)

Job 77 (vanilla s41) landed -> both arms n=3.

Fill tab:keynote + fig:keynote caption, add paired t-test, pin the exact 6-log regen command (just dyn --latest-per-arm clobbers the band). Regenerated dyn_sub4 figure from the 6 explicit seed logs, fixing the 87cca9a clobber. Journal entry 2026-06-03(a).

Also: README points to main.tex and drops the stale n=1 findings block; record two OpenReview URLs as a TODO in related work (mine reviews for shared critiques).

Closes A1/A2 (#173).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-27 16:15:35 +08:00 · 2026-06-03 03:36:32 +00:00
parent 87cca9a603
commit 753a54c625
13 changed files with 5181 additions and 5693 deletions
@@ -115,42 +115,9 @@ See [RESEARCH_JOURNAL.md](RESEARCH_JOURNAL.md) for session-by-session findings,
 including the 2026-05-23 grader-bug discovery that invalidated all prior `gt=0`
 measurements and the move from Qwen3.5-2B to Qwen3-4B (reference substrate).

-## Current findings (preliminary, n=1 seed)
+## Results and write-up

-> Stale as of 2026-06-02. The numbers below are the late-May erase/basis-width
-> result at the old default mix=0.5. Default mix is now locked to 0.125, the
-> primary arm is route2, and the live comparison is per-arm deploy hack/solve
-> (knob-off, n=64, T=0.7). n=3 no-floor route2 + matched vanilla refs are
-> landing (pueue jobs 68-79); this section gets rewritten on those numbers.
-> Latest results live in `RESEARCH_JOURNAL.md`.
-
-These are headline results from the fast preset (20 steps, mix=0.5, seed=41).
-Full provenance and per-step log audits are in `RESEARCH_JOURNAL.md`.
-
-**What appears to work (seed 41):** a stronger extracted basis drops last-5
-student hack rate from 77.5% (`v_hack_full`) to 47.5% (`v_hack_21pairs`),
-frozen V, at matched ground-truth pass rate near 20%. CAVEAT (corrected
-2026-05-29 from the safetensors shapes, see docs/results.md Q8): the two bases
-differ on three axes at once — pairs used (10 vs 16), directions kept (k=5 vs
-k=12), and extract tau (0.25 vs 0.0) — so this is NOT cleanly "more pairs".
-A one-knob k-sweep is needed to attribute the gain. Vanilla-baseline
-head-to-head and seed=42/43 replicates are queued.
-
-**What turns out to matter for the design (entries f, i):** the extracted
-v_hack basis goes stale fast during training. The per-step cosine of the
-live teacher gradient against v_hack decays from about 0.27 at step 0 to
-about 0.07 by step 10. Re-extracting v_hack every 2 optimizer steps
-(`--vhack-refresh-every=2`) keeps the second-half-of-training cosine about
-1.43x higher than the frozen baseline. But at the 21-pair width, the
-refresh effect on last-5 hack_s is small (47.5% frozen vs 45.0% refresh-2,
-about 2.5pp). Basis width does most of the work; refresh helps marginally.
-
-## Hypotheses (preregistered)
-
-See [spec.md](spec.md). Headline: H1 — gradient projection in SVD basis against
-a v_hack extracted from ~60-80 contrastive pairs reduces reward hack rate by
->=30pp absolute vs vanilla GRPO at matched LeetCode pass rate (±10pp).
-
-Status at 2026-05-29: 30pp absolute drop confirmed within the projected arm
-at n=1 seed (12-pair to 21-pair, entry h). Vanilla-baseline head-to-head and
-n>=2 seed replication queued.
+The paper draft is the source of truth for current numbers, figures, and the
+preregistered hypotheses: [docs/writeup/main.tex](docs/writeup/main.tex)
+(keynote table + figure, ablations, generalisation). Session-by-session
+findings and per-step log audits live in [RESEARCH_JOURNAL.md](RESEARCH_JOURNAL.md).