mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 16:15:35 +08:00
route2: fail loud if real v_grad band collapses (extraction broken)
Fresh-eyes review flagged that nothing asserted upper>lower for the REAL v_grad: a broken extraction (hack pairs aligning no more than clean) would silently degenerate into the random-control sign gate via the max(.,1e-6) floor. Assert mean band width > 0 on non-Haar runs; the Haar control is still allowed to collapse.

No correctness change to the gate math (review found conservation, per-rollout recovery, cosine masking, closure capture all OK).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
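A minimal sketch of the failure mode and the fail-loud check described in the message, not the repository's code. The names `ramp_gate` and `check_band`, the per-layer list layout, and the exact linear ramp are assumptions made for illustration; only the `max(., 1e-6)` floor and the "mean band width > 0 unless Haar control" rule come from the message itself.

```python
from statistics import fmean


def ramp_gate(cos, lower, upper, floor=1e-6):
    """Hypothetical banded ramp: 0 at or below `lower`, 1 at or above `upper`."""
    # The floor keeps the division finite, but when upper <= lower it turns
    # the ramp into a near-step function at `lower` (a sign gate if lower == 0).
    width = max(upper - lower, floor)
    return min(max((cos - lower) / width, 0.0), 1.0)


def check_band(lower, upper, is_haar_control):
    """Return the mean band width; raise if a real (non-Haar) band collapsed."""
    mean_width = fmean([u - l for l, u in zip(lower, upper)])
    if not is_haar_control and mean_width <= 0:
        # Raise explicitly so the check survives `python -O`.
        raise AssertionError(
            f"v_grad band collapsed (mean width {mean_width:+.3f}); "
            "extraction is likely broken"
        )
    return mean_width
```

With a healthy band the gate ramps smoothly (`ramp_gate(0.5, 0.0, 1.0)` is `0.5`). With a collapsed band (`lower == upper == 0.0`) any small positive cosine already saturates to `1.0`, which is the silent degeneration the check is meant to catch.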
@@ -182,6 +182,13 @@ as real). Defence (a) is mandatory; (b) only if (a) shows a mass gap.
## Implementation plan (src/vgrout/train.py)

STATUS 2026-06-06 (commit 485839d): route rewrite DONE and smoke-verified. `route_band_edges`
builds the band at extract + on refresh; `_route2_grad_filter` is the banded ramp gate;
`build_route2_anchors`, the EMA `tau` state, `--gate-anchor-teacher-only`, and
`scripts/verify_gate_anchor.py` are gone. Smoke: band width +0.289 real vs -0.014 Haar-random;
`||delta_S_hack||>0`, R3 span assert green, resid~0. DEFERRED: the held-out-pair separation
gauge (needs a second forward over the `n_val` pairs; diagnostic only, not load-bearing).

Rollback tag `pre-routing-refactor`. erase already works; the code below is the route rewrite.

1. **DELETE `build_route2_anchors`** (~line 337) and its call site. No anchors from teacher