wassname
2026-02-22 21:52:57 +08:00
parent 2e3c8165ce
commit 7dbe3faf19
10 changed files with 5556 additions and 110 deletions
+14 -11
@@ -2,19 +2,22 @@ TASK write a new file, from the old part.
## Status: DONE
### Task 1: adapters_as_hypotheses.md
- [x] Preamble with pragmatic interpretability framing
- [x] 30 entries (1-30) with pseudocode, hypothesis, evidence, grade
- [x] 33 entries with pseudocode, hypothesis, evidence, grade
- [x] All papers saved to docs/ (full size, no truncation)
- [x] URLs from gist_content.md included
- [x] Sub-agent review completed, fixes applied:
  - Fixed RandLoRA pseudocode (sum of scaled random bases, not single triple product)
  - Fixed authorship (AntiPaSTO is Clark, not Bini/Girrbach/Akata)
  - Fixed SSVD grade (** not **!) and evidence ("matches" not "outperforms")
  - Fixed OFT pseudocode (W @ R^T convention per paper)
  - Fixed AntiPaSTO Cayley convention to show explicit /2
  - Added AntiPaSTO grade caveat (<=4B models, seed variance)
  - Split Bone/Trainable Tokens into separate entries
  - Fixed "Clark et al." -> "Clark"
- [x] Sub-agent review completed, fixes applied
### Task 2: adapters_vargdown.argdown (NEW)
- [x] Compiled evidence into vargdown (verified argdown) format
- [x] 6 thematic argument groups: SVD basis, orthogonal, decoupling, gain control, rank, functional architecture
- [x] Main thesis: [Natural Manifold] -- SVD basis + orthogonal constraints define natural intervention manifold
- [x] ~20 observations with exact blockquotes from docs/ evidence files
- [x] ~10 assumptions for papers without frozen evidence
- [x] 3 contrary arguments (gain control, rank secondary, linearity)
- [x] Pseudocode companion: adapters_pseudocode.md (20 methods in pseudopy format)
- [x] Sub-agent review: fixed 5 critical (wrong evidence links, paraphrased quotes), 7 minor (orphans, credence calibration)
- [x] All credences calibrated: reason first, no overconfidence on preprints
First write also preamble explaining why we are interested, and this view, about a pragmatic search for effective views on internals (see https://www.lesswrong.com/posts/StENzDcD3kpfGJssR/a-pragmatic-vision-for-interpretability, and