mirror of
https://github.com/wassname/evil_MoE.git
synced 2026-06-27 16:45:42 +08:00
viz: floor-to-ceiling method comparison (csv + figure)
Two-stage script: build out/plots/floor_ceiling.csv (one row per arm/anchor, with SOURCE and STATUS columns flagging every provisional/missing cell), then the keynote figure. Prints TODO/FIXME data gaps before plotting.

Panel A: normalized floor->ceiling bars, headline deploy (knob-off, test n=119).

Panel B: the knob effect -- arrow knob-ON -> knob-OFF on the SAME held-out val split (eval_curve.jsonl), isolating the quarantine from the train/test memorization gap. Fixes the earlier conflation where the train->deploy arrow mixed knob-on/off with train-problems/test-problems.

Data gaps flagged in csv: solve ceiling provisional=paper 0.223 (FIXME job 24), prog_wide arm contaminated (TODO job 28 prog_wide_clean).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
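The first stage described in the commit message (build an inspectable table, flag gaps, normalize between floor and ceiling) could look roughly like the sketch below. This is not the code in scripts/plot_floor_ceiling.py: the column names, the `normalize`, `data_gaps`, and `to_csv` helpers, and every score except the provisional 0.223 ceiling are hypothetical and only illustrate the idea.

```python
import csv
import io

# Illustrative rows only. Apart from the provisional paper ceiling (0.223),
# the scores and arm names are made up; the real CSV schema may differ.
ROWS = [
    {"arm": "floor", "score": 0.05, "source": "example", "status": "ok"},
    {"arm": "method_a", "score": 0.14, "source": "example", "status": "ok"},
    {"arm": "ceiling", "score": 0.223, "source": "paper", "status": "FIXME provisional (job 24)"},
    {"arm": "prog_wide", "score": None, "source": "", "status": "TODO rerun clean (job 28)"},
]


def normalize(score, floor, ceiling):
    """Map a raw score onto the floor->ceiling scale (floor = 0, ceiling = 1)."""
    if ceiling == floor:
        raise ValueError("floor and ceiling must differ")
    return (score - floor) / (ceiling - floor)


def data_gaps(rows):
    """Return every row whose status marks it as provisional or missing."""
    return [row for row in rows if row["status"] != "ok"]


def to_csv(rows):
    """Serialize the rows to CSV text so the table can be inspected by hand."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["arm", "score", "source", "status"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Report gaps first, mirroring the "print TODO/FIXME before plotting" step.
    for row in data_gaps(ROWS):
        print(f"{row['arm']}: {row['status']}")
    print(to_csv(ROWS))
```

Keeping the status next to each value means a provisional anchor such as the ceiling stays visible in the table, because the normalized bars depend on it.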
@@ -281,6 +281,12 @@ plot GLOB='logs/*_sub4_*.log' STEM='out/figs/substrate':
 plot-deploy GLOB='out/runs/*sub4*/per_mode_deploy.json' OUT='out/figs/deploy_overlay.png':
     uv run python scripts/plot_deploy_overlay.py {{ GLOB }} --out {{ OUT }}
 
+# Keynote floor->ceiling method comparison. Builds out/plots/floor_ceiling.csv
+# (inspectable, with SOURCE + STATUS/TODO columns) then the figure. Prints any
+# provisional/missing cells (ceiling = job 24, prog_wide clean = job 28).
+plot-floor-ceiling:
+    uv run python -m scripts.plot_floor_ceiling
+
 # Regenerate both dynamics plots from the cell logs (default: all cells; pass a
 # narrower glob like 'logs/*_cell_*_s41.log' for the seed-41-only checkpoint).
 regen-dynamics GLOB='logs/*_cell_*.log':