mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 16:45:42 +08:00
329066e99b
Vanilla deploy-hack keeps climbing after teacher cut at step 40 (0.36->0.58, job 87), at/above teacher-on (job 97). Closest-match jobs differ in LR; FIXME to swap in lr-matched job 124 (queued low-prio). CSV is the committed data artifact; fig regen by plot_teacher_ablation.py. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
18 KiB
18 KiB