journal: routeV margin band (p75/p75) verified routing 28.7% on real 4B; dir6 restarted on it

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
wassname
2026-06-07 14:15:12 +00:00
parent d9ea20baa4
commit eeee7db65c
+36
@@ -3166,3 +3166,39 @@ split padded+concatenated with no shape error.
and read `slv_dep`: the direct test of whether the exploration floor lifts deploy-solve
vs the no-floor job 60. Keep the orthogonal `hkgap`-decay question (frozen vs
`--vhack-refresh-every=2`) as a separate run so the two levers stay attributable.
# 2026-06-07
## routeV band: widest -> p75/p75 margin (route the confident tail); dir6 sweep restarted on it
**Change (commit d9ea20b).** `route_band_edges` switched from the widest edge
(`lower=min clean, upper=max hack`) to a precision margin band
(`lower=p75 clean, upper=p75 hack`). The wide band routed even neutral rollouts
(~0.4 of a cos=0 gradient); that over-routing is what costs deploy-solve. The margin
band routes only the live tail above the clean cluster and lets absorption cover the
unrouted middle (gradient_routing.md Fig 5-right: retain cost is proportional to
routed mass; SGTM Fig 5b: ~40% undiscovered tolerated, leak<0.02). p75 rather than
min/max because with only 10 pairs the extremes are single-sample noise; p25(clean)
was rejected because it would route clean rollouts, the expensive false positive.
The band is an absolute cos threshold, so a clean batch routes ~nothing, which avoids
the per-batch-quantile pathology.
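
A minimal sketch of the two edge rules, for readers without the repo open. This is not the code in commit d9ea20b: `band_edges_sketch` and `routed_fraction_sketch` are hypothetical names, and the linear ramp between the edges is an assumption (it is one reading of "~0.4 of a cos=0 gradient"), not something this entry confirms.

```python
import numpy as np


def band_edges_sketch(cos_clean, cos_hack, mode="margin"):
    """Return (lower, upper) cos thresholds from calibration-pair cosines.

    Hypothetical stand-in for `route_band_edges`; only the edge rules
    (min/max vs p75/p75) are taken from the journal entry.
    """
    cos_clean = np.asarray(cos_clean, dtype=float)
    cos_hack = np.asarray(cos_hack, dtype=float)
    if mode == "wide":
        # widest edge: lower = min clean, upper = max hack
        return float(cos_clean.min()), float(cos_hack.max())
    if mode == "margin":
        # margin band: lower = p75 clean, upper = p75 hack
        return (float(np.percentile(cos_clean, 75)),
                float(np.percentile(cos_hack, 75)))
    raise ValueError(f"unknown mode: {mode}")


def routed_fraction_sketch(cos, lower, upper):
    """ASSUMED gate: linear ramp, 0 at/below lower, 1 at/above upper."""
    if upper <= lower:
        raise ValueError("band is empty or inverted")
    cos = np.asarray(cos, dtype=float)
    return np.clip((cos - lower) / (upper - lower), 0.0, 1.0)


# Synthetic cosines, for illustration only (not measured values).
clean = [-0.2, -0.1, 0.0, 0.1]
hack = [0.1, 0.2, 0.3, 0.4]
wide = band_edges_sketch(clean, hack, mode="wide")
margin = band_edges_sketch(clean, hack, mode="margin")
f_wide = float(routed_fraction_sketch(0.0, *wide))      # neutral rollout, wide band
f_margin = float(routed_fraction_sketch(0.0, *margin))  # neutral rollout, margin band
```

Under the assumed ramp, a neutral (cos=0) rollout is partly routed by the wide band and not routed at all by the margin band, because the margin band's lower edge sits above zero. The thresholds are fixed across batches, so an all-clean batch falls below `lower` and routes nothing.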
**Risk checked: does the off-distribution pair band sit above live and route ~nothing?**
No. On real Qwen3-4B the band built at `lower=+0.037, upper=+0.256` (vs live median
cos ~-0.06 from the wide run). Job 9 (per-token) routed frac f = +0.287 at step 1
(step 0 = 0.000, first-step gate-cache artifact). ~29% routed, comparable to wide's
~23%, NOT collapsed. The low p75(clean) edge (+0.037, vs max(clean)~+0.3) is what
avoids the under-route; choosing p75 over max was load-bearing.
(evidence: logs/20260607T134234_fast_routingV_seed43_dir6_routeV_pertoken_s43.log)
**Caveat for reading dir6.** The whole dir6 directionality sweep (jobs 8-15) was
restarted on this margin band, so its deploy numbers are NOT comparable to earlier
wide-band routeV runs. Per-token jobs (9,11,15) show `nan` in the streaming
keep/resid/rout gauge (a pre-existing `_zone_stats`-on-empty-live fragility that
poisons the mean); per-rollout jobs read clean. frout is in the debug log. The
directionality conclusion (real vs random v_grad) is band-robust either way.
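
For illustration, one way an empty live set can poison a streaming gauge, and a guard against it. The actual `_zone_stats` code is not shown in this entry, so the mechanism here is an assumption; `zone_fraction` and `StreamingMean` are hypothetical names.

```python
import math

import numpy as np


def zone_fraction(mask):
    """Fraction of True entries; NaN when there are no live entries.

    Returning NaN for the empty case is an assumed convention.
    """
    mask = np.asarray(mask, dtype=bool)
    return float(mask.mean()) if mask.size else math.nan


class StreamingMean:
    """Running mean that skips NaN updates instead of absorbing them."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, x):
        # A single NaN added to `total` would make every later value NaN.
        if not math.isnan(x):
            self.total += x
            self.count += 1

    @property
    def value(self):
        return self.total / self.count if self.count else math.nan


gauge = StreamingMean()
gauge.update(0.2)
gauge.update(zone_fraction([]))  # empty live set: skipped, not averaged in
gauge.update(0.4)
```

Skipping empty steps keeps the gauge readable, at the cost of hiding how often the live set was empty; counting the skipped steps separately would preserve that signal.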
**Next (for the user, not done).** The principled shift-robust gate is a live-cos
rolling quantile: route the top-q of live `cos(g,v_grad)`, with the threshold tracked
across steps so batch composition can vary naturally. This decouples the threshold
from how wide the off-distribution pairs are. Deferred: it is a bigger change and not
safe to deploy unattended across a running sweep.
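
A rough sketch of what such a rolling-quantile gate could look like. Nothing here is implemented in the repo: the class name, the fixed-size window, and the defaults for `q` and `window` are all assumptions chosen for illustration.

```python
from collections import deque

import numpy as np


class RollingQuantileGate:
    """Route the top-q of live cosines against a threshold tracked across steps.

    Hypothetical design: the threshold is the (1 - q) quantile of a sliding
    window of recent live cos values, not of the current batch alone.
    """

    def __init__(self, q=0.25, window=4096):
        self.q = q
        self.buf = deque(maxlen=window)

    def __call__(self, cos):
        cos = np.asarray(cos, dtype=float).ravel()
        self.buf.extend(cos.tolist())
        history = np.fromiter(self.buf, dtype=float)
        thresh = float(np.quantile(history, 1.0 - self.q))
        return cos > thresh, thresh


gate = RollingQuantileGate(q=0.25, window=1000)
mixed_mask, _ = gate(np.linspace(-1.0, 1.0, 101))  # mixed batch fills the window
clean_mask, clean_thresh = gate(np.full(50, -0.5))  # later all-clean batch
```

Because the threshold comes from the window rather than from each batch, the all-clean batch in the usage lines routes nothing, whereas a per-batch quantile would still route its top quarter. How long the window should be relative to the drift in live cos is an open choice this sketch does not settle.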