For arms without a knob-on eval, the train row fell back to per-step hack_s
(a noisy n=28 train batch), so vanilla's train and deploy rows looked like
different estimators. Fix: vanilla and erase have no quarantine, so
train == deploy; reuse hk_dep (the n=64 knob-off eval) for their train row.
route2 still uses hk_on (the knob-on eval). Every panel now shows the same
held-out eval, differing only in
the quarantine knob. Regen source: train_vs_deploy_60.csv (route2 nofloor_rf2
+ vanilla sweep, seed 41, 60 steps).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>