Files
evil_MoE/docs/writeup
wassname 03693e4f30 name the method vGROUT (vector gradient routing)
- title: drop the "Quarantine ... Representation?" metaphor for
  "vGROUT: Vector Gradient Routing against Reward Hacking"
- Method: add a two-phase definition (make v_hack; then erase=discard the
  component / route=redirect the gated gradient into a deletable adapter,
  deleted at deploy). Honest framing: route preserves (not discards); follows
  Shilov et al.'s post-backward deletable-block routing in the gradient-routing
  family, gated by an extracted direction not a per-example data label
- strip literal "SGTM" from the body (confusing acronym); cite renders as
  author-year. README + pyproject describe vGROUT (package name unchanged)
2026-06-05 14:51:48 +08:00
..