mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 18:04:59 +08:00
5257ff010e
The train row fell back to per-step hack_s (noisy n=28 train batch) for arms without a knob-on eval, so vanilla's train/deploy rows looked like different estimators.

Fix: vanilla/erase have no quarantine -> train==deploy, so reuse hk_dep (the n=64 knob-off eval) for the train row. route2 still uses hk_on (knob-on eval). Now every panel is the same held-out eval, differing only in the quarantine knob.

Regen source: train_vs_deploy_60.csv (route2 nofloor_rf2 + vanilla sweep, seed 41, 60 steps).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
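
The selection rule described in this commit can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name `train_row_hack_rate`, the `ARMS_WITHOUT_QUARANTINE` set, and the numeric values in the usage lines are hypothetical; only the arm names and the `hk_dep` / `hk_on` identifiers come from the message above.

```python
from typing import Optional

# Assumption: arms named in the commit message as having no quarantine knob.
ARMS_WITHOUT_QUARANTINE = {"vanilla", "erase"}


def train_row_hack_rate(arm: str, hk_dep: float, hk_on: Optional[float] = None) -> float:
    """Pick the held-out estimate shown in the train row for one arm.

    hk_dep: hack rate from the knob-off held-out eval.
    hk_on:  hack rate from the knob-on held-out eval, if the arm has one.
    """
    if arm in ARMS_WITHOUT_QUARANTINE:
        # No quarantine means train == deploy, so reuse the knob-off eval
        # instead of falling back to the noisy per-step train batch.
        return hk_dep
    if hk_on is None:
        # Arms with a quarantine knob need their own knob-on eval.
        raise ValueError(f"arm {arm!r} has a quarantine knob but no knob-on eval")
    return hk_on


# Illustrative values only, not results from the repository.
vanilla_train = train_row_hack_rate("vanilla", hk_dep=0.25)
route2_train = train_row_hack_rate("route2", hk_dep=0.25, hk_on=0.5)
```

Raising on a missing knob-on eval, rather than silently falling back, is a design choice of this sketch: it keeps every panel on the same held-out estimator, which is the point of the fix.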