mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 17:30:41 +08:00
4a002e942f
Huang related-work bullet now states the actual differences

They use SVD of the clean update trajectory + warmup, we use contrastive pair-gradients in delta_S coords; they project onto trusted, we project out hack; we quarantine+delete at deploy, they only constrain training.

Renamed docs/papers/grad_routing/paper_deng_* -> paper_huang_* (untracked note; correct attribution is Huang et al. 2026).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>