9 Commits

Author SHA1 Message Date
wassname 04a98b321e feat: Evil MoE — learned soft router + pin loss on an ablatable hack expert
Fork of vGROUT. Replaces routeA's fixed v_act quantile gate with a learned
per-rollout soft router (HackRouter, seeded from v_act) on the ablatable hack
expert: GRPO flows into the router through the soft weight w (it concentrates
hack-like rollouts in the hack expert), and a continuous pin loss on the
hand-authored pairs anchors the axis. No load balancing; routing is per rollout.

lora2r gains a soft-weight forward path (_lora2r_w: w=0 keep, w=1 rout, deployed
grad scaled by 1-w). train_moe.py is the on-policy GRPO loop; verify_moe_router.py
gates the routing invariants. `just smoke` is green. README/AGENTS rewritten for
the fork; original proposal kept as docs/spec/original_evil_moe_spec.md.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-14 11:25:14 +08:00
wassname cca7150ea0 tidy 2026-06-14 11:05:54 +08:00
wassname af420ec855 feat: generation-matched logπ_old baseline + global-quantile gate + frac=0 method
Fixes the frac=0 PPO-clip blow-up: logπ_old is now the behavior policy computed
in each rollout's own sampling mode, so ρ is a true importance ratio. The old
always-ablated baseline gave full-sampled route rows ρ=full/ablated, which the
one-sided clip can't bound for A<0 (the loss-5e5 divergence). ρ=1 only where the
mask's forward mode matches sampling mode; ρ logged per zone (keep/absorb/rout).
Note (Fable review): frac=0.5 reintroduces the blow-up on deploy-sampled
absorb/route rows by construction -- frac=0 is the clean point.

Gate: two-threshold Otsu -> symmetric global-quantile tails (route_tail_q=0.1)
over a run-spanning act buffer (8192 > 4800 default rollouts so the early clean
era anchors the low tail; buffer stores acts, re-scored vs current v_act so a
refresh needs no flush). Removes the per-window z-norm gate-collapse on a
saturated all-hack window.

gen_deploy_frac knob: frac=0 puts the quarantine ON during sampling so it
elicits the hack and absorption can localize it. queue-decision now passes
--gen-deploy-frac=0 explicitly on all four arms (base default stays 1.0 = the
job-34 config where ablation RAISED hack 0.71->0.86).

Docs: AGENTS.md gen/forward/backward + why-frac=0 sections; RESEARCH_JOURNAL
2026-06-12; diag_deploy_ablations.py (quar-only vs deploy localization probe).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-12 03:22:48 +00:00
wassname 270c4f5a27 misc 2026-06-11 11:07:28 +00:00
wassname c390007eb9 human journal 2026-06-09 17:28:15 +08:00
wassname 376dccdd7f writeup: add main.qmd (Quarto draft) + nips-template.tex; update human journal
main.qmd mirrors main.tex structure with markdown prose, callout TODOs,
and Quarto cross-refs. Renders via nips-template.tex which wraps
nips15submit_e.sty so quarto render --to pdf produces NeurIPS-formatted
output. Human journal prose incorporated into abstract + intro + routing
section.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-08 07:00:54 +08:00
wassname bcf09dd742 docs 2026-06-06 12:27:26 +00:00
wassname 0fa250b193 handoff: pre-routing-refactor snapshot + diagnosis
route2 directionality exposed the vector is not load-bearing: hack_anchor
force-routes teacher+detector by label (bypassing v_grad), tau calibrated from a
live detector, so random==real because labels carried it. Redesign: teacher-off@30,
drop force-route, calibrate tau from the A-pairs (no live detector), maybe use the
pairset directly vs a rank-1 vector. Decisive test = A5 real(126) vs random(135).
Queue snapshot + design notes in docs/REFACTOR_HANDOFF.md.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-05 23:58:35 +00:00
wassname 7248d469a7 init 2026-05-23 10:22:54 +08:00