evil_MoE/out/plots/floor_ceiling.csv
wassname 7d08ad2acd viz: floor-to-ceiling method comparison (csv + figure)
Two-stage script: first build out/plots/floor_ceiling.csv (one row per arm/anchor,
with SOURCE and STATUS columns flagging every provisional/missing cell), then
draw the keynote figure. Prints TODO/FIXME data gaps before plotting.

Panel A: normalized floor->ceiling bars, headline deploy (knob-off, test n=119).
Panel B: the knob effect -- arrow knob-ON -> knob-OFF on the SAME held-out val
split (eval_curve.jsonl), isolating the quarantine from the train/test
memorization gap. Fixes the earlier conflation where the train->deploy arrow
mixed knob-on/off with train-problems/test-problems.

Data gaps flagged in csv: solve ceiling is provisional (paper value 0.223, FIXME
job 24); prog_wide arm is contaminated (TODO job 28 prog_wide_clean).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-09 09:45:37 +00:00

1.2 KiB

label,kind,hack_deploy,solve_deploy,hack_on,hack_off,solve_on,solve_off,source,status
routeV per-token,method,0.042,0.1429,0.6312,0.025,0.0688,0.0688,20260607T134234_fast_routingV_seed43_dir6_routeV_pertoken_s43/[deploy_test.json + eval_curve.jsonl],ok
routeV authored,method,0.0756,0.1176,0.6687,0.0187,0.0563,0.0437,20260608T134141_fast_routingV_seed43_dir8_routeV_authored_perroll_s43/[deploy_test.json + eval_curve.jsonl],ok
routeV prog_wide,method,0.1008,0.1261,0.6937,0.0125,0.0688,0.0563,20260607T195125_fast_routingV_seed43_dir6_routeV_s43/[deploy_test.json + eval_curve.jsonl],TODO: contaminated pairs -> job 28 prog_wide_clean
routeV random-V,method,0.1008,0.1092,0.7,0.0437,0.075,0.0688,20260608T020623_fast_routingV_seed43_dir6_routeV_random_s43/[deploy_test.json + eval_curve.jsonl],ok (directionality control)
vanilla GRPO,method,0.6134,0.1008,0.5938,0.5938,0.075,0.075,20260608T224659_fast_vanilla_seed43_dir8_vanilla_s43/[deploy_test.json + eval_curve.jsonl],ok (defines hack-worst anchor)
base (floor),anchor_floor,0.0,0.1261,,,,,*_dir8_baseline_s43/deploy_test.json,ok (base model; steps=0)
ceiling,anchor_ceiling,0.0,0.223,,,,,"Ariahw et al. 2025 (paper), NOT our run",FIXME: PROVISIONAL paper 0.223 -- awaiting job 24 (no-loophole ceiling)
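
As a rough illustration of how the two panels could be derived from these rows, here is a minimal sketch. The plotting script itself is not shown on this page, so the linear floor-to-ceiling rescaling and the names normalize, solve_norm, and knob_drop are assumptions made for this example, not the script's actual code. The ceiling value used is the provisional paper figure (0.223) flagged FIXME above.

```python
import csv
import io

# A few rows copied from floor_ceiling.csv (columns trimmed for brevity).
CSV_TEXT = """label,kind,hack_deploy,solve_deploy,hack_on,hack_off
routeV per-token,method,0.042,0.1429,0.6312,0.025
vanilla GRPO,method,0.6134,0.1008,0.5938,0.5938
base (floor),anchor_floor,0.0,0.1261,,
ceiling,anchor_ceiling,0.0,0.223,,
"""


def normalize(value, floor, ceiling):
    """Assumed linear rescaling: floor maps to 0, ceiling maps to 1."""
    return (value - floor) / (ceiling - floor)


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

# Anchor rows supply the floor and the (provisional) ceiling for solve rate.
anchors = {r["kind"]: r for r in rows if r["kind"].startswith("anchor_")}
floor = float(anchors["anchor_floor"]["solve_deploy"])
ceiling = float(anchors["anchor_ceiling"]["solve_deploy"])

summary = {}
for r in rows:
    if r["kind"] != "method":
        continue
    # Panel A idea: where deploy solve rate sits between floor and ceiling.
    solve_norm = normalize(float(r["solve_deploy"]), floor, ceiling)
    # Panel B idea: hack-rate change from knob-ON to knob-OFF on the same val split.
    knob_drop = float(r["hack_on"]) - float(r["hack_off"])
    summary[r["label"]] = (solve_norm, knob_drop)
    print(f"{r['label']:<18} solve_norm={solve_norm:+.3f} knob_drop={knob_drop:+.3f}")
```

A negative solve_norm means the arm solves fewer test problems than the base model. Any normalized value will shift once job 24 replaces the provisional ceiling.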