Files
evil_MoE/docs
wassname a83953131e spec: drop live-detector validation; per-rollout granularity (paper-backed) + cheap label-free diagnostics
Validation removed: running the weak detector over student rollouts at train
time is the no-cheat violation, and a live validation is complex/non-causal.
Causal proof stays downstream (deploy perf + real-vs-random). Train-time only
LOGs label-free gauges: hkgap=upper-lower, leave-one-pair-out separation (the
'does the threshold generalize to a second pair' test), live cos_b percentiles
vs [lower,upper] (calibration read with no labels), route_frac mass at 0/1,
resid=cos(g_keep,vec).

Granularity decided = per-rollout: train.py already sums per-token gate grads
to [G,r] and recovers g_b=cg/dS per rollout; band just swaps the cos_b>tau line
for the ramp. Backed by the papers: Gradient Routing (Cloud 2024) masks
per-token for LLMs / per-episode for RL; SGTM (2025) per-example, label-noise-
robust. Both route by a DATA-LABEL mask; we route by gradient ALIGNMENT to an
extracted direction -- that's the novelty. Borrow their 'absorption' as the
mechanism justifying A->B generalization.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-06 02:23:58 +00:00
..
2026-05-23 14:19:41 +08:00
2026-06-02 02:06:43 +00:00
wip
2026-05-30 04:33:33 +00:00
2026-05-23 11:26:39 +08:00
2026-05-29 06:29:20 +00:00
2026-05-23 11:26:39 +08:00