docs(writeup): NeurIPS-workshop paper skeleton + tectonic compile recipe

Minimal LaTeX skeleton: outline + evidence tables (route2 n=3 deploy numbers
filled with provenance, vanilla pending jobs 74/84) + figures + verified refs
+ appendix (4-mode traces, 6/6/6/6 partition counts, pseudocode). Build
artifacts and figs symlinks gitignored. `just paper` compiles via tectonic;
`just paper-qc` dumps text + greps for unresolved refs / TODOs.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
wassname
2026-06-02 06:59:15 +00:00
parent 17e4f2e2ff
commit 923de6dbe6
6 changed files with 819 additions and 4 deletions
+5 -4
@@ -29,7 +29,7 @@ direction from 2 of the 4 loopholes, measure suppression on the other 2.
C1 (primary, existence -> systematic). Routing the GRPO gradient against a
weak-detector hack direction in the SVD-of-W basis lowers deploy hack rate vs
vanilla GRPO at matched-ish solve rate, replicated over n=3 seeds.
-- Evidence: jobs 68/69/70 (route2 no-floor s41/42/43) vs 79/74/72 (vanilla
+- Evidence: jobs 68/69/70 (route2 no-floor s41/42/43) vs 84/74/72 (vanilla
s41/42/43). Deploy = knob-off, n=64 prompts x group, T=0.7.
- Confidence today: suggestive at n=1; n=3 band landing. NOT yet 30pp (the
preregistered H1 bar); honest framing is "reduces hack at comparable solve",
@@ -90,11 +90,12 @@ deploy hack/solve + by_mode come from the JSON, per-step curves from the log/TSV
A1 -- Keynote figure. route2 vs vanilla deploy hack/solve over training, n=3
band. Prototype exists: out/figs/dyn_sub4*.png (`just dyn`). [/] blocked on the
-n=3 vanilla band (jobs 74 s42 + 79 s41; 72 s43 done; route2 68/69/70 done).
+n=3 vanilla band (jobs 74 s42 + 84 s41 [re-added from killed 79, p7 so it runs
+ahead of the A3 erase rows]; 72 s43 done; route2 68/69/70 done).
A2 -- Keynote table. Per-arm deploy hack + deploy solve, mean +/- SEM over 3
seeds, route2 no-floor vs vanilla, delta vs vanilla, paired test + alpha stated.
-[/] same blocker as A1 (74, 79).
+[/] same blocker as A1 (74, 84).
A3 -- Ablation table (what each component buys; the arms you named). One row per
arm at matched seed/preset, deploy hack + solve:
@@ -125,7 +126,7 @@ A7 -- Appendix ablation context. Cite results.md Q-rows already run: basis width
(Q8), refresh cadence (Q5), teacher mix (Q6), gate mode (Q3), solve-orthog (Q9),
pairset content/placebo (Q10). [x] data exists; just needs porting into the paper.
-Next action when 74+79 land: read each per_mode_deploy.json, `just dyn`,
+Next action when 74+84 land: read each per_mode_deploy.json, `just dyn`,
fill A1/A2, append a journal entry. Then queue A5 (the gap).
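
A minimal sketch of the A2 computation (per-arm mean +/- SEM over 3 seeds, delta
vs vanilla, paired test), in Python. The function names and the hack-rate values
are hypothetical placeholders, not outputs of jobs 68/69/70 or 72/74/84. The real
numbers would come from each per_mode_deploy.json, whose schema is not shown
here. With n=3 seed pairs the paired t-test has 2 degrees of freedom, so the
two-sided alpha=0.05 critical value is about 4.303.

```python
import math
import statistics

# Two-sided critical t value for alpha=0.05 with df = n_seeds - 1 = 2.
T_CRIT_DF2_ALPHA05 = 4.303


def mean_sem(values):
    """Return (mean, SEM), with SEM = sample std (ddof=1) / sqrt(n)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 seeds to estimate SEM")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def paired_delta(treated, control):
    """Return (mean paired difference, paired t statistic), seeds matched by index."""
    if len(treated) != len(control):
        raise ValueError("arms must have the same seeds")
    diffs = [t - c for t, c in zip(treated, control)]
    delta, sem = mean_sem(diffs)
    if sem == 0:
        raise ValueError("zero variance in paired differences; t is undefined")
    return delta, delta / sem


# PLACEHOLDER deploy hack rates for seeds s41/s42/s43 (made up, NOT results).
route2_hack = [0.20, 0.25, 0.30]
vanilla_hack = [0.50, 0.45, 0.55]

route2_mean, route2_sem = mean_sem(route2_hack)
vanilla_mean, vanilla_sem = mean_sem(vanilla_hack)
delta, t_stat = paired_delta(route2_hack, vanilla_hack)
significant = abs(t_stat) > T_CRIT_DF2_ALPHA05

print(f"route2  hack: {route2_mean:.3f} +/- {route2_sem:.3f}")
print(f"vanilla hack: {vanilla_mean:.3f} +/- {vanilla_sem:.3f}")
print(f"delta vs vanilla: {delta:+.3f}, paired t(df=2) = {t_stat:.2f}, "
      f"significant at alpha=0.05: {significant}")
```

Pairing by seed removes the shared seed-to-seed variance, which matters at n=3
where an unpaired test has very little power. The same two functions apply to
deploy solve by swapping in the solve-rate lists.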
## Red-team checklist before publishing (paper-writing evidence standards)