evil_MoE/scripts/attic
wassname a3a3f09824 retract 'null_city contaminated' framing -> in/out-of-subspace + cosine-is-correlational
Haar's ~0 cos is concentration of measure (out-of-subspace), not a cleaner
placebo. Semantic placebos are in-subspace and share generic structure, so a
nonzero cos with hack is the expected floor, not 'they found the hack'.
null_city's high-cos modules are plausibly low-rank-module artifacts. Cosine
is correlational; the ablation run is the causal test.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-05 09:21:41 +00:00
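
The concentration-of-measure point in the commit message above can be illustrated with a toy sketch. It is not the repo's extraction or ablation code: the dimension, the `shared_weight` knob, and every function name below are hypothetical, and a single shared Gaussian component stands in for the "generic structure" that in-subspace directions have in common. Under those assumptions, independent random directions in high dimension have cosine near 0 (on the order of 1/sqrt(dim)), while directions that share a component show a clearly nonzero cosine even though neither was built from the other.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def gaussian_vector(rng, dim):
    # An isotropic Gaussian draw points in a uniformly random direction,
    # which is the toy stand-in here for a Haar-random direction.
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def mean_abs_cos(dim, shared_weight, trials, seed=0):
    """Average |cos| between two vectors that each mix a shared component
    (scaled by shared_weight) with independent noise.

    shared_weight = 0.0 -> fully independent directions (out-of-subspace toy)
    shared_weight > 0.0 -> directions with common structure (in-subspace toy)
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        shared = gaussian_vector(rng, dim)
        a = [shared_weight * s + x for s, x in zip(shared, gaussian_vector(rng, dim))]
        b = [shared_weight * s + x for s, x in zip(shared, gaussian_vector(rng, dim))]
        total += abs(cosine(a, b))
    return total / trials


DIM = 1024  # hypothetical dimension, chosen only for illustration
haar_like = mean_abs_cos(DIM, shared_weight=0.0, trials=50)
shared_floor = mean_abs_cos(DIM, shared_weight=1.0, trials=50)

print(f"independent directions: mean |cos| = {haar_like:.3f}")
print(f"shared-structure pair:  mean |cos| = {shared_floor:.3f}")
```

In this toy, the near-zero cosine of the independent pair says only that random directions in high dimension are almost orthogonal, and the nonzero cosine of the shared-structure pair says only that the two vectors overlap. Neither number says anything causal, which is why the commit defers to the ablation run.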

attic

Parked, not deleted. Superseded exploration kept only because the persona-pair methodology may get cited in the paper appendix.

  • make_pairsets.py, make_dataset_pairsets.py — persona contrastive-pair authoring (tasks #123-126, done). The live extraction path is pairs.PAIRS (hand pairs) or pairs_from_pool (pool-derived). No justfile recipe builds these anymore.