mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 16:30:30 +08:00
9fd2b6b89b
2a-2c only tested UNIFORM masks. 2d puts rollout 0 clean (0,0) and rollout 1 hack (1,1) in ONE forward pass and asserts two things: the mixed deployed grad == the rollout-0-alone clean grad, and the mixed quarantine grad == the rollout-1-alone hack grad. This exercises the load-bearing per-rollout mask vectorization (the [G,1,1] reshape) and confirms there is no cross-rollout bleed. Green on tiny-random.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
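
A minimal NumPy sketch of the kind of check 2d describes, not the repo's actual test. The two-branch linear toy model, the function name `routed_grads`, the single 0/1 mask per rollout, and the sum-reduced squared-error loss are all assumptions made for illustration. The only parts taken from the message above are the [G,1,1] reshape and the mixed-vs-alone gradient comparison.

```python
import numpy as np

def routed_grads(x, target, w_dep, w_quar, hack_mask):
    """Analytic grads for a hypothetical two-branch linear toy model.

    x: [G, T, D] inputs, target: [G, T, K], hack_mask: length-G 0/1 flags.
    Clean rollouts (mask 0) use w_dep; hack rollouts (mask 1) use w_quar.
    """
    # Per-rollout mask reshaped to [G,1,1] so it broadcasts over T and K.
    m = np.asarray(hack_mask, dtype=x.dtype).reshape(-1, 1, 1)
    out = (1.0 - m) * (x @ w_dep) + m * (x @ w_quar)
    resid = out - target  # dL/d(out) for L = 0.5 * sum(resid ** 2)
    g_dep = np.einsum("gtd,gtk->dk", x, (1.0 - m) * resid)
    g_quar = np.einsum("gtd,gtk->dk", x, m * resid)
    return g_dep, g_quar

rng = np.random.default_rng(0)
G, T, D, K = 2, 3, 4, 5
x = rng.normal(size=(G, T, D))
target = rng.normal(size=(G, T, K))
w_dep = rng.normal(size=(D, K))
w_quar = rng.normal(size=(D, K))

# Mixed batch: rollout 0 clean, rollout 1 hack, one forward pass.
mixed_dep, mixed_quar = routed_grads(x, target, w_dep, w_quar, [0, 1])
# Each rollout run alone with its own uniform mask.
alone_dep, _ = routed_grads(x[:1], target[:1], w_dep, w_quar, [0])
_, alone_quar = routed_grads(x[1:], target[1:], w_dep, w_quar, [1])

no_bleed = bool(np.allclose(mixed_dep, alone_dep)
                and np.allclose(mixed_quar, alone_quar))
```

The sketch uses a sum-reduced loss so that mixed and alone gradients are directly comparable. With a mean over the batch they would differ by a constant factor, and the comparison would need rescaling.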