This commit is contained in:
wassname
2026-06-12 04:46:01 +00:00
parent af420ec855
commit 41d225a5ec
8 changed files with 357 additions and 188 deletions
+2 -2
View File
@@ -47,8 +47,8 @@ no oracle or ground-truth label from a training rollout is used during training.
At training time routeA scores each rollout on the no-grad `logp_old` forward it
already needs: an activation-capture hook pools the same bottleneck activations
over completion tokens, and the score is the pooled dot product with `v_act`.
Thresholds come from a rolling buffer of recent scores, z-normalized and split by
two-threshold Otsu into `{keep, absorb, rout}`; until the buffer reaches
Thresholds are the symmetric `route_tail_q` quantiles of a run-spanning score
buffer, splitting rollouts into `{keep, absorb, rout}`; until the buffer reaches
`route_warmup` scores the gate pins absorb. The block masks are set from those labels *before* the single
masked forward+backward, so there is no second gradient pass. A rollout scoring
at or above the upper threshold updates the quarantine block while its deployed