evil_MoE

mirror of https://github.com/wassname/evil_MoE.git synced 2026-06-27 16:15:35 +08:00

Author	SHA1	Message	Date
wassname	04a98b321e	feat: Evil MoE - learned soft router + pin loss on an ablatable hack expert. Fork of vGROUT. Replaces routeA's fixed v_act quantile gate with a learned per-rollout soft router (HackRouter, seeded from v_act) on the ablatable hack expert: GRPO flows into the router through the soft weight w (it concentrates hack-like rollouts in the hack expert), and a continuous pin loss on the hand-authored pairs anchors the axis. No load balancing; routing is per rollout. lora2r gains a soft-weight forward path (_lora2r_w: w=0 keep, w=1 route, deployed grad scaled by 1-w). train_moe.py is the on-policy GRPO loop; verify_moe_router.py gates the routing invariants. `just smoke` is green. README/AGENTS rewritten for the fork; original proposal kept as docs/spec/original_evil_moe_spec.md. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-14 11:25:14 +08:00
wassname	cca7150ea0	tidy	2026-06-14 11:05:54 +08:00
wassname	270c4f5a27	misc	2026-06-11 11:07:28 +00:00
wassname	9fd2b6b89b	test: add mixed-batch per-rollout routing gate to verify_lora2r_routing (T8). 2a-2c only tested UNIFORM masks. 2d puts rollout 0 clean (0,0) and rollout 1 hack (1,1) in ONE forward and asserts the mixed deployed grad == rollout-0-alone-clean and the mixed quarantine grad == rollout-1-alone-hack -- the load-bearing per-rollout mask vectorization ([G,1,1] reshape) with no cross-rollout bleed. Green on tiny-random. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-10 11:24:49 +00:00
wassname	5c97975185	refactor: collapse to lora2r-only (none/routeV/absorb); delete erase/antipasto/lora_frozen_b paths. train.py rewritten straight-line for the single rank-2r Gaussian-init LoRA adapter and three arms (intervention none|routeV|absorb). Removes the erase grad-surgery, act_vote/online_stats gates, beta/KL reference path, per-source split harvest, the v_hack injection block, and all per-mechanism E/C/D/A-B tallies. Folds in: - T2 Gaussian init (lora2r.py): A0~N(0,1/d_in), B0~N(0,1/2r), net delta 0 at init. - T3 width-pooled gate labels: single (num/den) fraction across modules, skip zero-width modules, raise if none separate (was per-module equal-weight blowup). - T5 absorb arm: masks pinned (1,0) -> both blocks train, no gate. - T6 self-contained ckpt: A/B/A0/B0 in one file (no _hack file, no SVD cache), adapter:"lora2r" in saved cfg. - T8 m3: step_flagged logs the hack share (d.mean), not m.mean. Gates green: verify_lora2r_routing (4 invariants) + smoke none/routeV/absorb end-to-end on tiny-random Qwen3 (logs in /tmp/claude-1000/smoke_*.log). Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-10 10:58:22 +00:00
wassname	6094568c56	feat: lora2r adapter (rank-2r PiSSA-init LoRA) + SGTM three-way hard routing. Structural-separation arm to disentangle directionality from shrinkage. A rank-2r PiSSA-init LoRA with A and B both trainable, partitioned into a deployed block [:r] and a quarantine block [r:] (spectrum-matched via alternated SVD axes). Unlike the same-basis PiSSA routeV (where deploy-ablation only removes a magnitude slice of one shared update = shrinkage null), each block has its own input-side A rows and output-side B columns, so deploy-ablation removes a different FUNCTION. Routing = SGTM-style three-way hard per-rollout masks from the cosine of the deployed block's gate-pass gradient to the pair-extracted v_grad: clean (m=0,d=0) trains deployed only; hack (m=1,d=1) detaches deployed output so only the quarantine updates (SGTM grad-retain trick); mid (m=1,d=0) trains both (absorption). Gate is no-cheat: cos to the hand-authored-pair direction, never an oracle label of a live rollout. verify_lora2r_routing.py gates identity-at-init, the three-way block-grad routing, per-rollout c-probe recovery, and ablation teeth; wired into smoke-lora2r. Additive: PiSSA / lora_frozen_b paths untouched. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-10 09:25:58 +00:00
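
The three-way per-rollout masks described in the commits above (clean, mid, hack) can be read as per-block gradient multipliers. The following is a minimal sketch under that reading, not the repository's code: `BlockScales`, `hard_scales`, `soft_deployed_scale`, and `route_batch` are hypothetical names, gradients are stood in for by plain floats, and the mapping "deployed scale = 1 - d, quarantine scale = m" is inferred from the commit text rather than taken from `lora2r.py`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockScales:
    deployed: float    # multiplier on the gradient reaching the deployed block [:r]
    quarantine: float  # multiplier on the gradient reaching the quarantine block [r:]


def hard_scales(m: int, d: int) -> BlockScales:
    """Map a hard (m, d) mask to per-block gradient multipliers.

    Assumed reading of the commit message:
      clean (0, 0): deployed only
      mid   (1, 0): both blocks (absorption; also the pinned absorb arm)
      hack  (1, 1): quarantine only (deployed output detached)
    """
    if (m, d) not in {(0, 0), (1, 0), (1, 1)}:
        raise ValueError(f"unsupported mask {(m, d)}")
    return BlockScales(deployed=1.0 - d, quarantine=float(m))


def soft_deployed_scale(w: float) -> float:
    """Soft-weight path: w=0 keeps, w=1 routes; deployed grad is scaled by 1-w."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return 1.0 - w


def route_batch(masks, dep_grads, quar_grads):
    """Sum per-rollout block gradients after applying each rollout's own mask.

    Each rollout is scaled independently, so one rollout's mask never
    affects another rollout's contribution (no cross-rollout bleed).
    """
    dep_total = 0.0
    quar_total = 0.0
    for (m, d), g_dep, g_quar in zip(masks, dep_grads, quar_grads):
        s = hard_scales(m, d)
        dep_total += s.deployed * g_dep
        quar_total += s.quarantine * g_quar
    return dep_total, quar_total


# Mixed batch: rollout 0 is clean (0,0), rollout 1 is hack (1,1).
mixed = route_batch([(0, 0), (1, 1)], dep_grads=[0.5, 2.0], quar_grads=[0.25, 3.0])
```

In this toy setting the mixed batch's deployed total equals what rollout 0 would contribute alone as clean, and its quarantine total equals what rollout 1 would contribute alone as hack, which mirrors the property the mixed-batch gate in `verify_lora2r_routing` is described as asserting.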

6 Commits