Files
evil_MoE/docs
wassname af420ec855 feat: generation-matched logπ_old baseline + global-quantile gate + frac=0 method
Fixes the frac=0 PPO-clip blow-up: logπ_old is now the behavior policy computed
in each rollout's own sampling mode, so ρ is a true importance ratio. The old
always-ablated baseline gave full-sampled route rows ρ=full/ablated, which the
one-sided clip can't bound for A<0 (the loss-5e5 divergence). ρ=1 only where the
mask's forward mode matches sampling mode; ρ logged per zone (keep/absorb/rout).
Note (Fable review): frac=0.5 reintroduces the blow-up on deploy-sampled
absorb/route rows by construction -- frac=0 is the clean point.

Gate: two-threshold Otsu -> symmetric global-quantile tails (route_tail_q=0.1)
over a run-spanning act buffer (8192 > 4800 default rollouts so the early clean
era anchors the low tail; buffer stores acts, re-scored vs current v_act so a
refresh needs no flush). Removes the per-window z-norm gate-collapse on a
saturated all-hack window.

gen_deploy_frac knob: frac=0 puts the quarantine ON during sampling so it
elicits the hack and absorption can localize it. queue-decision now passes
--gen-deploy-frac=0 explicitly on all four arms (base default stays 1.0 = the
job-34 config where ablation RAISED hack 0.71->0.86).

Docs: AGENTS.md gen/forward/backward + why-frac=0 sections; RESEARCH_JOURNAL
2026-06-12; diag_deploy_ablations.py (quar-only vs deploy localization probe).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-12 03:22:48 +00:00
..
2026-05-23 14:19:41 +08:00
2026-06-02 02:06:43 +00:00
2026-05-29 06:29:20 +00:00
2026-05-23 11:26:39 +08:00