mirror of
https://github.com/wassname/isokl_steering_calibration.git
synced 2026-06-27 17:16:09 +08:00
feat(spaghetti): per-trajectory KL/p95 normalization + shared y-axis
- Add --normalize-kl / --calib-tokens (default on, 20 tokens) to spaghetti_kl_alive.py: each trajectory is divided by its own p95(KL[:calib_tokens]). Calibration target collapses to y=1.
- Share y-axis across all alpha panels (global p99 ymax) for direct cross-alpha comparison.
- Add OLMo-2 1B w=4096 figure to figs/ and a README section documenting the result, including the unexplained 'random dead traces at low alpha' observation.
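
The normalization and shared-axis steps described in this commit message can be sketched roughly as follows. This is a minimal illustration, not the code in spaghetti_kl_alive.py: the function names `percentile`, `normalize_trajectory`, and `shared_ymax`, the `eps` guard, the linear-interpolation percentile, and the choice to pool all panels' values before taking p99 are assumptions made for the example.

```python
from typing import List, Sequence


def percentile(values: Sequence[float], q: float) -> float:
    """Linear-interpolation percentile, q in [0, 100] (hypothetical helper)."""
    if not values:
        raise ValueError("percentile of empty sequence")
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def normalize_trajectory(
    kl: Sequence[float], calib_tokens: int = 20, eps: float = 1e-12
) -> List[float]:
    """Divide one trajectory by its own p95 over the first calib_tokens tokens.

    After this, the p95 of the calibration window sits at y=1.
    The eps floor is an assumption to avoid dividing by zero.
    """
    scale = percentile(kl[:calib_tokens], 95.0)
    return [v / max(scale, eps) for v in kl]


def shared_ymax(trajectories: Sequence[Sequence[float]], q: float = 99.0) -> float:
    """One y-limit for every alpha panel: a percentile of all pooled values."""
    pooled = [v for traj in trajectories for v in traj]
    return percentile(pooled, q)
```

Because each trajectory is scaled by its own calibration-window p95, curves from different alpha panels land on a common scale, which is what makes a single shared y-limit meaningful.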