steer-heal-love/scripts
wassname 579e1f6671 metric = log(tinymfv profile p); cue-ball headline; training-table sig figs
After verifying guided.py: tinymfv `score` is already a debiased logprob
((lp_fwd+lp_rev)/2, BMA'd), not a "raw logit", and `p = softmax(score)`. My
two earlier inventions were both wrong:
- log(p) coupled Authority to the other 6 foundations via logsumexp.
- the diagonal (auth-blame on auth-vignettes) is pmass-on-correct-label =
  top1 competence, not the trait, and threw away the FP/FN structure.

Use the library-native readout: auth_nats = log(tinymfv profile p[F]) = log of
the mean p per foundation over ALL vignettes. For small p, log p ~= logit, so
this lands on steering-lite's loading-weighted Δlogit scale (base log(0.099)
=-2.3, real shift ~0.5-2 nats). foundation_nats now reads rep["profile"].
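
A minimal sketch of the readout described above, assuming each vignette yields
one debiased score per foundation. It does not call tinymfv; `softmax`,
`foundation_nats_sketch` and `logit` are hypothetical helpers written for
illustration, and the real `rep["profile"]` may be computed differently.

```python
import math

def softmax(scores):
    """Numerically stable softmax over one vignette's per-foundation scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def foundation_nats_sketch(score_rows, idx):
    """log of the mean p for foundation `idx` over ALL vignettes (hypothetical)."""
    probs = [softmax(row) for row in score_rows]
    mean_p = sum(p[idx] for p in probs) / len(probs)
    return math.log(mean_p)

def logit(p):
    """log(p / (1 - p)), to compare against log(p) at small p."""
    return math.log(p) - math.log1p(-p)

# At the base rate quoted above, log p and logit p differ by -log(1 - p),
# which is about 0.10 nats at p = 0.099.
base_p = 0.099
gap = logit(base_p) - math.log(base_p)
```

The gap grows with p, so the log p ~= logit approximation only holds while the
profile probability stays small.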

Also:
- run.py: BLUF `main metric:` line with cue ball (🟢/🟡/🔴 by coherence band).
- heal.py: training table at reduced precision (nll/kl/loss .2f, gnorm .1f);
  a per-step loss does not warrant 3 decimals.
- diag_stages: accept 1+ ckpts, label each row by its reg from metadata.
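
The heal.py precision change can be illustrated with a minimal sketch. The
function name, column order and widths are assumptions; only the `.2f` and
`.1f` format specs come from the note above.

```python
def format_training_row(step, nll, kl, loss, gnorm):
    """Format one training-table row: nll/kl/loss at .2f, gnorm at .1f."""
    return f"{step:>6d} {nll:>8.2f} {kl:>8.2f} {loss:>8.2f} {gnorm:>8.1f}"

row = format_training_row(12, 1.23456, 0.0049, 1.238, 3.14159)
```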

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-04 15:02:56 +08:00