"""Teacher-ablation appendix figure: does cutting the teacher at step 40 stop the vanilla student from hacking? Reads data/teacher_ablation.csv, writes figs/teacher_ablation.{png,pdf}. Claim under test: once the student produces its own hacks, the cached teacher is no longer load-bearing -- removing it at step 40 does not bend the deploy-hack trajectory down. The post-cut segment of the off@40 curve keeps rising, so the teacher is a seeder, not the driver. Caveat baked into the legend: the off@40 run (job 87) used the default fast LR (3e-3) while the teacher-on reference (job 97) used the gentler 1e-3 that survives 200 steps without the over-optimization collapse. The within-run post-cut rise is the confound-free part of the evidence; the matched-LR pair is job 124 (queued). FIXME: jobs 87/97 are the closest match but differ in LR. When job 124 (gentle vanilla teacher-off@40) lands, replace the off@40 rows in teacher_ablation.csv with job 124's trajectory (single-variable vs job 97) and drop the LR caveat. """ from pathlib import Path import polars as pl import matplotlib.pyplot as plt HERE = Path(__file__).parent df = pl.read_csv(HERE.parent / "data" / "teacher_ablation.csv") fig, ax = plt.subplots(figsize=(5.0, 3.2)) styles = { "off@40": dict(color="#c1272d", marker="o", label="teacher off @ step 40 (job 87, lr 3e-3)"), "on": dict(color="#444444", marker="s", label="teacher on throughout (job 97, lr 1e-3)"), } for sched, sty in styles.items(): d = df.filter(pl.col("teacher_schedule") == sched).sort("step") ax.plot(d["step"], d["deploy_hack"], lw=1.6, ms=4, **sty) # teacher-cut marker for the off@40 arm ax.axvline(40, color="#c1272d", ls=":", lw=1.0) ax.annotate("teacher removed", xy=(40, 0.04), xytext=(52, 0.04), color="#c1272d", fontsize=8, va="center") ax.set_xlabel("GRPO step") ax.set_ylabel("deploy hack rate (n=64, T=0.7)") ax.set_ylim(-0.02, 0.65) ax.set_xlim(-3, 203) ax.legend(frameon=False, fontsize=8, loc="upper left") ax.spines[["top", "right"]].set_visible(False) fig.tight_layout() for ext in ("png", "pdf"): fig.savefig(HERE / f"teacher_ablation.{ext}", dpi=150, bbox_inches="tight") print("wrote", HERE / "teacher_ablation.png")