probe_traj: side-by-side vanilla-vs-projected trajectory analyzer

Reads step files from both warmup-gen tags, prints a per-step table
split into warmup-replay and student-gen phases, and computes the H1
delta on the gen-phase hack rate.
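
The gen-phase comparison described above can be sketched roughly as follows. This is a minimal illustration, not the code in `projected_grpo.probe_traj`: the row fields `step` and `hack_rate`, the `warmup_steps` cutoff, and the sign convention (projected minus vanilla) are all assumptions, since the commit message shows neither the step-file format nor the exact definition of the H1 delta.

```python
from statistics import mean


def gen_phase_delta(vanilla, projected, warmup_steps):
    """Mean gen-phase hack rate of the projected arm minus the vanilla arm.

    Each row is a dict with hypothetical keys "step" and "hack_rate".
    Steps below `warmup_steps` count as warmup-replay and are excluded.
    """
    def gen_mean(rows):
        # Keep only student-gen steps, i.e. those at or after the cutoff.
        return mean(r["hack_rate"] for r in rows if r["step"] >= warmup_steps)

    return gen_mean(projected) - gen_mean(vanilla)


# Toy data, invented for illustration only (not results from any run).
vanilla = [
    {"step": 0, "hack_rate": 0.5},
    {"step": 1, "hack_rate": 0.5},
    {"step": 2, "hack_rate": 0.75},
    {"step": 3, "hack_rate": 0.25},
]
projected = [
    {"step": 0, "hack_rate": 0.5},
    {"step": 1, "hack_rate": 0.5},
    {"step": 2, "hack_rate": 0.25},
    {"step": 3, "hack_rate": 0.25},
]

delta = gen_phase_delta(vanilla, projected, warmup_steps=2)
print(f"gen-phase hack-rate delta (projected - vanilla): {delta:+.2f}")
```

Under this convention a negative delta would mean the projected arm hacks less often than vanilla during the student-gen phase.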
wassname
2026-05-25 12:26:03 +00:00
parent a1fdb45251
commit a26f71ef1a
2 changed files with 117 additions and 0 deletions
@@ -212,6 +212,10 @@ probe-projected-replay steps="20":
probe-uat:
    uv run python -m projected_grpo.probe_uat

# Trajectory comparator for the warmup-gen runs (vanilla vs projected).
probe-traj:
    uv run python -m projected_grpo.probe_traj
# Phase 2 pilot analyzer: reads out/train_pilot_*.safetensors, prints trajectories
# and per-arm aggregates, applies decision rules from spec2.md.
phase2-analyze pattern="_pilot_*":