evil_MoE

wassname/evil_MoE

mirror of https://github.com/wassname/evil_MoE.git synced 2026-07-01 07:44:17 +08:00

Commit Graph

Author	SHA1	Message	Date
wassname	5257ff010e	plot_dynamics: train-vs-deploy 2x2 uses matched n=64 eval on both rows The train row fell back to per-step hack_s (noisy n=28 train batch) for arms without a knob-on eval, so vanilla's train/deploy rows looked like different estimators. Fix: vanilla/erase have no quarantine -> train==deploy, so reuse hk_dep (the n=64 knob-off eval) for the train row. route2 still uses hk_on (knob-on eval). Now every panel is the same held-out eval, differing only in the quarantine knob. Regen source: train_vs_deploy_60.csv (route2 nofloor_rf2 + vanilla sweep, seed 41, 60 steps). Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-05 02:33:10 +00:00
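
The selection rule described in the commit message can be sketched as follows. This is a minimal illustration, not the repository's actual `plot_dynamics` code: the function names, the `QUARANTINE_ARMS` set, and the returned dictionary layout are hypothetical. Only the column names `hk_dep` and `hk_on` and the arm names `vanilla`, `erase`, and `route2` come from the message.

```python
# Hypothetical sketch of the train/deploy column choice described above.
# Assumption: route2 is the only arm with a quarantine knob.
QUARANTINE_ARMS = {"route2"}


def train_row_column(arm: str) -> str:
    """Pick the held-out eval column plotted on the train row."""
    if arm in QUARANTINE_ARMS:
        # Quarantine arms have a separate knob-on eval.
        return "hk_on"
    # No quarantine means train == deploy, so reuse the knob-off eval
    # instead of falling back to the noisy per-step hack_s.
    return "hk_dep"


def deploy_row_column(arm: str) -> str:
    """The deploy row always uses the knob-off eval."""
    return "hk_dep"


def panel_columns(arms):
    """Map each arm to the eval columns used for its two rows."""
    return {
        arm: {"train": train_row_column(arm), "deploy": deploy_row_column(arm)}
        for arm in arms
    }


columns = panel_columns(["vanilla", "erase", "route2"])
```

Under these assumptions, `vanilla` and `erase` plot `hk_dep` on both rows, while `route2` differs between rows only in the quarantine knob.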

1 commit