refactor: route2 quarantine = scale-matched delta_S_hack, rip out 33M LoRA

The distinct-basis A_q/B_q LoRA (~33M params at rank-16) gave the
quarantine a ~100x capacity edge over delta_S, so routing-everything-there
was the low-resistance path: qE pinned ~0.97 (energy into the thrown-away
knob) while the deployed delta_S learned nothing (job 54). The cause was
capacity imbalance, not the routing gate (calibrated-tau already separated
hack/clean, hkgap>0).

Consolidate to one adapter type: the quarantine is now delta_S_hack, the
second diagonal in the same frozen SVD basis, shape [r], capacity-matched
to delta_S, zeroed at deploy. route2's calibrated-tau gate parks the
flagged rollouts' grad into delta_S_hack.grad (like proj.py's route parks
its subspace projection); delta_S keeps the unflagged. Both diagonals
train at one shared lr.

Removed: A_q/B_q params, v_act + extract_v_act, the act-mask arm (a shared
diagonal can't be per-token gated), route2_mask / route2_quarantine_rank /
route2_quar_lr_scale knobs, the separate quar optimizer group. Arm name
routing2_{act,grad} -> routing2.

v_grad refresh extracts from delta_S (main) with the quarantine ablated.

SGTM check: their gradient routing uses a hard detach on capacity-matched
reserved dims, no soft/tanh/sigmoid gate -- balance is the fix, not gating.

Smoked clean: tau/hkgap/qE render, ||delta_S_hack||>0 assert passes,
exit 0.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-27 17:30:41 +08:00 · 2026-06-01 02:52:02 +00:00
parent 6eb894f44d
commit 8158adb543
5 changed files with 198 additions and 385 deletions
@@ -84,9 +84,8 @@ over the first few steps. Sanity: by a few steps μ_clean<~0.1, μ_hack>~0.2-ish

 ## Verify / queue / follow

- `just smoke-route2 --route2-mask=grad` (or the smoke recipe that hits grad
-  path): confirm tau/hkgap columns render, routing fires (flagged frac < ~0.5,
-  not pinned at 0.5), exit 0.
+- `just smoke-route2`: confirm tau/hkgap/qE columns render, routing fires
+  (flagged frac < ~0.5, not pinned at 0.5), exit 0.
 - Queue (substrate, 60 steps, seed 41), label:
  why: does per-step calibrated-τ vector routing (route2-grad) stop over-routing
  (flagged<<0.5) and suppress held-out deploy-hack vs vanilla at matched solve;
@@ -98,3 +97,56 @@ over the first few steps. Sanity: by a few steps μ_clean<~0.1, μ_hack>~0.2-ish
 Detector flag (hack_E) only CALIBRATES μ_hack (known A) + anchors teacher hacks.
 The routing GATE is cos>τ applied to ALL rollouts incl. unknown B. gt_pass never
 touches routing. Compliant.
+
+## Status: IMPLEMENTED + RUNNING (job 54)
+
+Implemented in `6eb894f`, smoked (tau/hkgap render, exit 0). Running as job 54
+(calibrated-τ + grad_clip 0.5, route2-grad, 60 steps, seed 41). job 52
+(caltau-alone) killed and folded in; clip 0.5 is a no-op while gn<0.5 so 54 is
+the strict superset.
+
+### Live observation (steps 0-2, 2026-06-01)
+
+The calibration works as a discriminator but does NOT fix over-routing-by-energy:
+- `hkgap` positive and rising (0.00 -> 0.03 -> 0.08): the v_grad direction DOES
+  separate the hack cloud from the clean cloud. The vector is alive.
+- `tau` tracking up with it (0.00 -> 0.02 -> 0.04): the threshold rides the drift
+  as designed.
+- BUT `qE` 0.73 -> 0.97 -> 0.97: ~97% of gradient ENERGY lands in the deleted
+  quarantine by step 1, and `gt_s` 3 -> 7 -> 0 (solving collapsed).
+
+Diagnosis: the failure was never the routing FRACTION (which τ fixes); it is the
+always-summed 33M A_q/B_q quarantine MAGNITUDE. Even a correctly-thresholded route
+sends the routed gradient into a knob whose per-param grads dwarf delta_S's, so
+the energy ratio pins near 1 and the deployed adapter learns nothing. This is the
+SYNTHESIS "next lever" prediction: if qE stays high while hkgap>0, the culprit is
+quarantine magnitude, not the gate.
+
+## DESIGN CHANGE (2026-06-01): one adapter, scale-matched quarantine
+
+Acted on the magnitude diagnosis by removing the distinct-basis LoRA entirely.
+The quarantine is now delta_S_hack -- the SECOND diagonal in the same frozen SVD
+basis, shape [r] per module, identical capacity to delta_S. route2's calibrated-τ
+gate parks the flagged rollouts' delta_S-grad contribution into delta_S_hack.grad
+(via step_grad_hack in _route2_grad_filter), exactly as proj.py's `route` parks
+its subspace-projected component; delta_S keeps the unflagged. Both diagonals
+train at one shared lr; delta_S_hack is zeroed at deploy.
+
+Rationale (user): a 33M LoRA vs a ~2k-param delta_S per module means "dump
+everything in the quarantine" is the low-resistance path -- a capacity edge, not
+honest absorption. Capacity-balanced diagonals remove that bias. SGTM's own
+quarantine is capacity-matched (a split of the same layer, equal dims), and uses
+a hard detach -- no soft/tanh/sigmoid gate -- confirming the fix is balance, not
+gating.
+
+Removed: A_q/B_q params, v_act buffer + extract_v_act, the act-mask arm (a shared
+diagonal can't be per-token gated), route2_mask / route2_quarantine_rank /
+route2_quar_lr_scale knobs, the separate quar optimizer group. arm name
+"routing2_grad"/"routing2_act" -> "routing2".
+
+v_grad refresh extracts from the MAIN knob (delta_S.grad) with the quarantine
+ablated -- the deployed-model gradient is what we route, and both diagonals share
+the basis so the direction is directly usable on delta_S's live gradient.
+
+Smoked clean (tiny-random): tau/hkgap/qE render, ||delta_S_hack||=0.0074>0 assert
+passes, deploy-ablation fires, exit 0. Queued on the substrate (seed 41, 60 steps).
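
Illustrative sketch (not part of the diff above, and not the repository's code): the routing described in the design note can be pictured as a hard, per-rollout split of the delta_S gradient between two equally sized diagonals. The names `route_grads` and `quarantine_energy` are hypothetical, the per-rollout gradients are assumed to be already flattened to shape [r], and qE is assumed here to be the share of squared gradient norm that lands in the quarantine; the actual `_route2_grad_filter` may differ on all of these points.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def route_grads(per_rollout_grads, v_grad, tau):
    """Hard-route each rollout's gradient (hypothetical helper).

    A rollout whose gradient has cos(g, v_grad) > tau is flagged and its
    whole contribution is parked in the quarantine gradient; everything
    else stays with the main (deployed) diagonal. No soft gate is used.
    """
    r = len(v_grad)
    main_grad = [0.0] * r   # stands in for delta_S.grad
    hack_grad = [0.0] * r   # stands in for delta_S_hack.grad
    flagged = []
    for g in per_rollout_grads:
        is_flagged = cosine(g, v_grad) > tau
        flagged.append(is_flagged)
        target = hack_grad if is_flagged else main_grad
        for i, g_i in enumerate(g):
            target[i] += g_i
    return main_grad, hack_grad, flagged


def quarantine_energy(main_grad, hack_grad):
    """Assumed qE: fraction of squared gradient norm in the quarantine."""
    e_main = sum(x * x for x in main_grad)
    e_hack = sum(x * x for x in hack_grad)
    total = e_main + e_hack
    return e_hack / total if total > 0.0 else 0.0


if __name__ == "__main__":
    grads = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # toy per-rollout grads, r = 2
    main, hack, flags = route_grads(grads, v_grad=[1.0, 0.0], tau=0.5)
    print(flags, main, hack, quarantine_energy(main, hack))
```

Because both accumulators have the same shape [r], neither side has a capacity edge in this sketch; the energy split is determined only by which rollouts clear the threshold.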