mirror of
https://github.com/wassname/weight-steering.git
synced 2026-06-27 20:54:46 +08:00
3c9fb8d1f5
Addresses three concerns from docs/review/v6_hypothesis_review.md:

1. R_w split into oproj/downproj + Frobenius-balanced combined.
2. dW_left_basis_ceiling as the true weight oracle.
3. axis_kind tag (write/read/mixed/ceiling).

Single-seed result: chars_clusters and attn_min_taskdiff are top-5 by both R_act and R_w_combined. Write-family bases (write/mlp_write/global_write) all have R_w_combined ~ 1.0 (random null), so natural weight-side bases fail the weight-axis test.

Multi-seed deferred to v7b.