wip

2026-06-27 16:44:10 +08:00 · 2026-02-22 21:52:57 +08:00
parent 2e3c8165ce
commit 7dbe3faf19
10 changed files with 5556 additions and 110 deletions
@@ -2,19 +2,22 @@ TASK write a new file, from the old part.

 ## Status: DONE

+### Task 1: adapters_as_hypotheses.md
 - [x] Preamble with pragmatic interpretability framing
- [x] 30 entries (1-30) with pseudocode, hypothesis, evidence, grade
+- [x] 33 entries with pseudocode, hypothesis, evidence, grade
 - [x] All papers saved to docs/ (full size, no truncation)
- [x] URLs from gist_content.md included
- [x] Sub-agent review completed, fixes applied:
-  - Fixed RandLoRA pseudocode (sum of scaled random bases, not single triple product)
-  - Fixed authorship (AntiPaSTO is Clark, not Bini/Girrbach/Akata)
-  - Fixed SSVD grade (** not **!) and evidence ("matches" not "outperforms")
-  - Fixed OFT pseudocode (W @ R^T convention per paper)
-  - Fixed AntiPaSTO Cayley convention to show explicit /2
-  - Added AntiPaSTO grade caveat (<=4B models, seed variance)
-  - Split Bone/Trainable Tokens into separate entries
-  - Fixed "Clark et al." -> "Clark"
+- [x] Sub-agent review completed, fixes applied
+
+### Task 2: adapters_vargdown.argdown (NEW)
+- [x] Compiled evidence into vargdown (verified argdown) format
+- [x] 6 thematic argument groups: SVD basis, orthogonal, decoupling, gain control, rank, functional architecture
+- [x] Main thesis: [Natural Manifold] -- SVD basis + orthogonal constraints define natural intervention manifold
+- [x] ~20 observations with exact blockquotes from docs/ evidence files
+- [x] ~10 assumptions for papers without frozen evidence
+- [x] 3 contrary arguments (gain control, rank secondary, linearity)
+- [x] Pseudocode companion: adapters_pseudocode.md (20 methods in pseudopy format)
+- [x] Sub-agent review: fixed 5 critical (wrong evidence links, paraphrased quotes), 7 minor (orphans, credence calibration)
+- [x] All credences calibrated: reason first, no overconfidence on preprints

 First write also preamble explaining why we are interested, and this view, about a pragmatic search for effective views on internals (see https://www.lesswrong.com/posts/StENzDcD3kpfGJssR/a-pragmatic-vision-for-interpretability, and