diff --git a/AGENTS.md b/AGENTS.md
index dba2795..f661ba7 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -82,22 +82,14 @@ Inherit global rules from `~/.claude/CLAUDE.md`.
 ## Files
-START HERE to understand the setup (read before reasoning about the method):
-- [docs/human_journal.md](docs/human_journal.md) -- the user's own words: what the method is,
-  the routing math (absorption ramp between clean-cos and hack-cos bounds), and the LIVE open
-  question -- "is it the direction, the routing itself, or does the SVD/PiSSA adapter add a
-  prior that makes absorption work?" Random-direction controls MATCHING the real direction is a
-  KNOWN, embraced result, not a bug to explain away.
-- [docs/writeup/main.tex](docs/writeup/main.tex) -- the actual thesis and claims C1-C4. The
-  contribution is NOT "we found the hack direction and erased it." It is: SGTM-style
-  post-backward gradient routing in the SVD-of-W basis, gated by an extracted hack *vector*
-  (not per-example data labels), with the routed mass parked in a deletable adapter. C3 already
-  establishes the gate is largely non-directional; the direction's measurable role is solve
-  preservation + held-out-mode generalisation (C2, the load-bearing no-cheat check).
+For the setup, read these:
+- [docs/human_journal.md](docs/human_journal.md) -- the user's notes on the method. The novel
+  part is routing by an extracted vector rather than per-example labels. The SVD adapter is a
+  detail, not the novel experiment. Whether the direction, the routing, or the SVD adapter
+  drives the suppression is an open question (random directions match in the controls).
+- [docs/writeup/main.tex](docs/writeup/main.tex) -- the writeup: thesis and claims C1-C4.
 - [docs/papers/grad_routing/paper_gradient_routing.md](docs/papers/grad_routing/paper_gradient_routing.md)
-  -- Cloud et al. Expand-Route-Ablate. "Absorption" is the EFFECT of routing (routing a limited
-  signal localises the broader capability into the routed region), not a mechanism you invoke.
-  Routing runs the whole train; ablate once at the end. There is no warmup-then-off schedule.
+  -- Cloud et al. Expand-Route-Ablate, the gradient-routing prior.
 - Read [docs/brainstorm/extracted_prefs.md](docs/brainstorm/extracted_prefs.md) for design rationale.
 - New sweep arms get recipes in [justfile](justfile) with `# H:` hypothesis comments.