evil_MoE/src/projected_grpo
wassname 1fb49a3325 log: reprint step-table header every 50 rows; related-work: Piggyback learned-mask critique
Header reprint fixes the variable-width misread trap (20+ unlabeled columns,
with gn adjacent to lr). Also records the anticipated Piggyback 'why not learn
the routing mask' critique (answer: no-cheat withholds the per-rollout label
that a learned mask needs) and LoRA rank-deficiency as mild support for the
low-rank hack subspace.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-03 04:46:12 +00:00
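
The header-reprint change described in the commit message can be illustrated with a minimal sketch. This is not the repository's logging code: `format_row`, `write_step_table`, `demo`, the column names, and the column widths are all hypothetical, and only the 50-row interval comes from the commit subject.

```python
import io

HEADER_EVERY = 50  # interval taken from the commit subject; everything else is assumed


def format_row(values, widths):
    """Right-align each value in its column width and join with single spaces."""
    return " ".join(f"{v:>{w}}" for v, w in zip(values, widths))


def write_step_table(rows, columns, widths, out, header_every=HEADER_EVERY):
    """Write rows to `out`, printing the header before row 0, 50, 100, ..."""
    header = format_row(columns, widths)
    for i, row in enumerate(rows):
        if i % header_every == 0:
            out.write(header + "\n")
        out.write(format_row(row, widths) + "\n")


def demo():
    # Hypothetical columns; the real table reportedly has 20+ of them.
    columns = ["step", "loss", "gn", "lr"]
    widths = [6, 10, 10, 10]
    rows = [(s, f"{1.0 / (s + 1):.4f}", f"{0.5:.3f}", f"{1e-4:.1e}") for s in range(120)]
    buf = io.StringIO()
    write_step_table(rows, columns, widths, buf)
    return buf.getvalue().splitlines()
```

With the header repeated at a fixed interval, a reader scrolling through a long log is never more than 50 rows away from the column labels, which is what keeps neighbouring numeric columns such as gn and lr from being confused.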