mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-07-02 12:13:29 +08:00
ad048e59c6
Old GT_S=6/HACK_S=8 were the pre-sprd/N layout; current table is gt_s=4 hack_s=6, so newer logs were silently mis-read and old distill logs crashed _frac on a non-fraction token.

Now locate the train.py streaming header (first token 'step' + 'ref_eq' present) and map columns by name.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
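
A minimal sketch of the header-based lookup described in this commit message, not the actual code from the repository. It assumes the log table is whitespace-separated, that the header row is the first line whose first token is 'step' and which contains 'ref_eq', and that the gt_s and hack_s cells look like "3/8". The names find_header, frac and read_rates are hypothetical.

```python
from typing import Dict, List, Optional


def find_header(line: str) -> Optional[Dict[str, int]]:
    """Return {column name: index} if this line looks like the streaming header."""
    tokens = line.split()
    # Assumed header signature: first token 'step' and a 'ref_eq' column present.
    if tokens and tokens[0] == "step" and "ref_eq" in tokens:
        return {name: i for i, name in enumerate(tokens)}
    return None


def frac(token: str) -> Optional[float]:
    """Parse 'num/den' into a float; return None instead of raising on other tokens."""
    num, sep, den = token.partition("/")
    if not sep:
        return None
    try:
        d = float(den)
        return float(num) / d if d else None
    except ValueError:
        return None


def read_rates(lines: List[str], names=("gt_s", "hack_s")) -> List[Dict[str, Optional[float]]]:
    """Collect the named columns from every data row that follows the header."""
    cols: Optional[Dict[str, int]] = None
    rows: List[Dict[str, Optional[float]]] = []
    for line in lines:
        if cols is None:
            cols = find_header(line)  # keep scanning until the header is found
            continue
        tokens = line.split()
        # Assumed data-row shape: integer step first, at least as wide as the header.
        if not tokens or not tokens[0].isdigit() or len(tokens) < len(cols):
            continue
        rows.append({n: frac(tokens[cols[n]]) for n in names if n in cols})
    return rows
```

Because columns are resolved by name, the same reader handles both the older and the current layouts without hard-coded indices, and a non-fraction cell yields None rather than an exception.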