feat: Evil MoE — learned soft router + pin loss on an ablatable hack expert · 04a98b321e - evil_MoE

mirror of https://github.com/wassname/evil_MoE.git synced 2026-06-27 17:48:43 +08:00

feat: Evil MoE — learned soft router + pin loss on an ablatable hack expert

Fork of vGROUT. Replaces routeA's fixed v_act quantile gate with a learned
per-rollout soft router (HackRouter, seeded from v_act) on the ablatable hack
expert: GRPO flows into the router through the soft weight w (it concentrates
hack-like rollouts in the hack expert), and a continuous pin loss on the
hand-authored pairs anchors the axis. No load balancing; routing is per rollout.
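
A minimal sketch of what a per-rollout soft router with a pin loss could look like, in plain Python. This is not the repository's HackRouter: the sigmoid gate, the `scale` and `bias` parameters, the pooled activation `h`, and the binary cross-entropy form of the pin loss are all assumptions made for illustration. Only the ideas of seeding from `v_act` and pinning on hand-authored pairs come from the message above.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def router_weight(h, v_act, scale=1.0, bias=0.0):
    """Soft routing weight w in (0, 1) for a single rollout.

    h      -- pooled activation of the rollout (hypothetical input)
    v_act  -- direction the router is seeded from
    scale, bias -- hypothetical learnable parameters of the gate
    """
    score = sum(a * b for a, b in zip(h, v_act))
    return sigmoid(scale * score + bias)


def pin_loss(pairs, v_act, scale=1.0, bias=0.0):
    """Assumed pin loss: binary cross-entropy over (hack, clean) pairs.

    Pushes w toward 1 on the hack side and toward 0 on the clean side,
    so the routing axis stays tied to the hand-authored pairs.
    """
    eps = 1e-12  # guards log(0)
    total = 0.0
    for h_hack, h_clean in pairs:
        w_hack = router_weight(h_hack, v_act, scale, bias)
        w_clean = router_weight(h_clean, v_act, scale, bias)
        total += -math.log(w_hack + eps) - math.log(1.0 - w_clean + eps)
    return total / len(pairs)
```

Because w is a smooth function of the rollout's activation, any loss that depends on w can pass gradient back into `scale`, `bias`, and the routing direction, which is the property a fixed quantile gate lacks.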

lora2r gains a soft-weight forward path (_lora2r_w: w=0 keep, w=1 route, deployed
grad scaled by 1-w). train_moe.py is the on-policy GRPO loop; verify_moe_router.py
gates the routing invariants. `just smoke` is green. README/AGENTS rewritten for
the fork; original proposal kept as docs/spec/original_evil_moe_spec.md.
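
To make the soft-weight path concrete, here is a scalar toy with hand-written gradients. It is a sketch under stated assumptions, not the `_lora2r_w` implementation: the names `a_dep` and `a_hack`, the choice to multiply the hack-expert output by w in the forward pass, and the hack gradient scaling are hypothetical. Only the (1 - w) scaling of the deployed gradient and the w=0 / w=1 endpoints are taken from the message.

```python
def soft_forward(x, a_dep, a_hack, w):
    """Toy scalar forward: deployed delta plus w-weighted hack-expert delta."""
    return x * a_dep + w * x * a_hack


def soft_backward(x, a_dep, a_hack, w, grad_out):
    """Hand-written gradients for the toy forward above.

    g_dep  -- gradient for the deployed weight, scaled by (1 - w)
    g_hack -- gradient for the hack-expert weight (assumed to scale with w)
    g_w    -- gradient for the routing weight itself; this is the path by
              which a policy-gradient signal could reach a learned router
    """
    g_dep = (1.0 - w) * grad_out * x  # unscaled value would be grad_out * x
    g_hack = grad_out * w * x
    g_w = grad_out * x * a_hack
    return g_dep, g_hack, g_w
```

In this toy, w=0 sends the whole update to the deployed weight and none to the hack expert, w=1 does the reverse, and intermediate values split the update between the two.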

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>


wassname

2026-06-14 11:25:14 +08:00

parent cca7150ea0

commit 04a98b321e

18 changed files with 8874 additions and 501 deletions

data/pairs/hack_pairs.md

+1184


File diff suppressed because it is too large.