mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 17:30:41 +08:00
docs(writeup): NeurIPS-workshop paper skeleton + tectonic compile recipe
Minimal LaTeX skeleton: outline + evidence tables (route2 n=3 deploy numbers filled with provenance, vanilla pending jobs 74/84) + figures + verified refs + appendix (4-mode traces, 6/6/6/6 partition counts, pseudocode). Build artifacts and figs symlinks gitignored. `just paper` compiles via tectonic; `just paper-qc` dumps text + greps for unresolved refs / TODOs.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
@@ -29,7 +29,7 @@ direction from 2 of the 4 loopholes, measure suppression on the other 2.
 C1 (primary, existence -> systematic). Routing the GRPO gradient against a
 weak-detector hack direction in the SVD-of-W basis lowers deploy hack rate vs
 vanilla GRPO at matched-ish solve rate, replicated over n=3 seeds.
-- Evidence: jobs 68/69/70 (route2 no-floor s41/42/43) vs 79/74/72 (vanilla
+- Evidence: jobs 68/69/70 (route2 no-floor s41/42/43) vs 84/74/72 (vanilla
 s41/42/43). Deploy = knob-off, n=64 prompts x group, T=0.7.
 - Confidence today: suggestive at n=1; n=3 band landing. NOT yet 30pp (the
 preregistered H1 bar); honest framing is "reduces hack at comparable solve",
@@ -90,11 +90,12 @@ deploy hack/solve + by_mode come from the JSON, per-step curves from the log/TSV

 A1 -- Keynote figure. route2 vs vanilla deploy hack/solve over training, n=3
 band. Prototype exists: out/figs/dyn_sub4*.png (`just dyn`). [/] blocked on the
-n=3 vanilla band (jobs 74 s42 + 79 s41; 72 s43 done; route2 68/69/70 done).
+n=3 vanilla band (jobs 74 s42 + 84 s41 [re-added from killed 79, p7 so it runs
+ahead of the A3 erase rows]; 72 s43 done; route2 68/69/70 done).

 A2 -- Keynote table. Per-arm deploy hack + deploy solve, mean +/- SEM over 3
 seeds, route2 no-floor vs vanilla, delta vs vanilla, paired test + alpha stated.
-[/] same blocker as A1 (74, 79).
+[/] same blocker as A1 (74, 84).

 A3 -- Ablation table (what each component buys; the arms you named). One row per
 arm at matched seed/preset, deploy hack + solve:
@@ -125,7 +126,7 @@ A7 -- Appendix ablation context. Cite results.md Q-rows already run: basis width
 (Q8), refresh cadence (Q5), teacher mix (Q6), gate mode (Q3), solve-orthog (Q9),
 pairset content/placebo (Q10). [x] data exists; just needs porting into the paper.

-Next action when 74+79 land: read each per_mode_deploy.json, `just dyn`,
+Next action when 74+84 land: read each per_mode_deploy.json, `just dyn`,
 fill A1/A2, append a journal entry. Then queue A5 (the gap).

 ## Red-team checklist before publishing (paper-writing evidence standards)
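
The A2 row in the diff above asks for mean +/- SEM over 3 seeds, a delta vs vanilla, and a paired test with alpha stated. A minimal sketch of that arithmetic follows, using only the Python standard library. The function names and the per-seed numbers are hypothetical placeholders, not results from jobs 68/69/70 or 84/74/72, and this is not necessarily how the repository computes the table. Seeds are assumed to be paired in the same order in both lists (s41, s42, s43).

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean), using the sample std (n-1)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least 2 seeds to compute an SEM")
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(n)


def paired_t(treatment, control):
    """Return (mean delta, t statistic, degrees of freedom) for seed-matched arms."""
    if len(treatment) != len(control):
        raise ValueError("arms must have one value per matched seed")
    deltas = [t - c for t, c in zip(treatment, control)]
    mean_d, sem_d = mean_sem(deltas)
    if sem_d == 0:
        raise ValueError("all per-seed deltas are identical; t is undefined")
    return mean_d, mean_d / sem_d, len(deltas) - 1


# Two-sided critical value of Student's t at alpha = 0.05 with df = 2 (n = 3 seeds).
T_CRIT_ALPHA05_DF2 = 4.303

# HYPOTHETICAL placeholder hack rates per seed, for illustration only.
route2_hack = [0.20, 0.25, 0.30]
vanilla_hack = [0.40, 0.50, 0.45]

r_mean, r_sem = mean_sem(route2_hack)
v_mean, v_sem = mean_sem(vanilla_hack)
delta, t_stat, df = paired_t(route2_hack, vanilla_hack)

print(f"route2  hack: {r_mean:.3f} +/- {r_sem:.3f}")
print(f"vanilla hack: {v_mean:.3f} +/- {v_sem:.3f}")
print(f"delta = {delta:+.3f}, paired t = {t_stat:.2f}, df = {df}")
print("significant at alpha=0.05:", abs(t_stat) > T_CRIT_ALPHA05_DF2)
```

With only 3 seeds the paired test has 2 degrees of freedom, so the critical value is large and the SEM itself is a noisy estimate; reporting the per-seed deltas alongside the summary keeps that visible.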