mirror of
https://github.com/wassname/steer-heal-love.git
synced 2026-06-27 16:47:16 +08:00
4b8860d7cb
Add the by-question results infra per setup-repo conventions:
- results.tsv append at end of each finished run (config + final metrics + argv)
- scripts/results.py groups by arm (reg) into a markdown table; `just results`
- docs/results.md curated by-question snapshot (U2 regulariser comparison)
- docs/{spec,brainstorming,literature,evidence} structure
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
1.1 KiB
Results, organized by the question each run answers
Regenerate the tables with `just results` (groups `results.tsv` by arm). This file curates the answers; the append-only narrative lives in `RESEARCH_JOURNAL.md`.
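As a rough illustration of the grouping step, here is a minimal sketch of turning a TSV of runs into a per-arm markdown table. It is not the contents of `scripts/results.py`; the column names `reg`, `auth`, and `coherence`, the function name `results_table`, and the demo rows are all assumptions made for the example.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def results_table(tsv_text: str) -> str:
    """Group rows by the (assumed) `reg` column and render per-arm means as markdown."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    by_arm = defaultdict(list)
    for row in rows:
        by_arm[row["reg"]].append(row)

    lines = ["| reg | n | auth | coherence |", "|---|---|---|---|"]
    for arm in sorted(by_arm):
        runs = by_arm[arm]
        auth = mean(float(r["auth"]) for r in runs)
        coh = mean(float(r["coherence"]) for r in runs)
        lines.append(f"| {arm} | {len(runs)} | {auth:.3f} | {coh:.3f} |")
    return "\n".join(lines)


# Placeholder rows with made-up arm names and values; these are not results.
demo = (
    "reg\tauth\tcoherence\n"
    "arm_b\t0.40\t0.80\n"
    "arm_b\t0.50\t0.70\n"
    "arm_a\t0.45\t0.90\n"
)
table = results_table(demo)
print(table)
```

Sorting the arms keeps the table stable between regenerations, which makes diffs of the curated snapshot easier to read.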
How to read this
- `auth` is the tinymfv authority-axis mean for the steered student (higher = more of the trait); `coherence` is `p_ans_any` (fraction of eval items where the model commits to a valid answer). Both are absolute fractions; compare rows within a table by eye.
- A regulariser is compared to the `nll` control only at matched `auth` (the U2 crux: more coherence at equal trait shift).
- `auth_sd` is the across-seed spread; a blank means a single seed.
- Provenance for each table goes in an HTML comment so any row can be re-created.
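The matched-`auth` reading can be made concrete with a small sketch. It assumes each run is reduced to an `(auth, coherence)` pair and that "matched" means the nearest control run lies within a tolerance `tol`. The function names, the tolerance, and the use of the sample standard deviation for `auth_sd` are illustrative choices, not taken from the repo.

```python
from statistics import stdev


def auth_sd(auths):
    """Across-seed spread of auth; blank string when only one seed exists."""
    return f"{stdev(auths):.3f}" if len(auths) > 1 else ""


def matched_gain(arm, control, tol=0.05):
    """Mean coherence gain of `arm` over `control` at matched auth.

    `arm` and `control` are lists of (auth, coherence) pairs. An arm run only
    counts if its nearest control run is within `tol` in auth; returns None
    when no arm run has such a match.
    """
    gains = []
    for auth, coh in arm:
        nearest = min(control, key=lambda c: abs(c[0] - auth))
        if abs(nearest[0] - auth) <= tol:
            gains.append(coh - nearest[1])
    return sum(gains) / len(gains) if gains else None
```

Returning `None` rather than a number when nothing matches avoids comparing arms at different trait shifts, which is exactly the comparison the matched-`auth` rule is meant to exclude.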
Q (U2). Which regulariser heals incoherency best at matched trait shift?
Prior: `kl_rev` > `kl_fwd` ~ `wd` > `nll` (reverse KL is mode-seeking, so it suppresses the low-original-probability tokens that read as incoherent).
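The intuition behind this prior can be checked on a toy distribution. The sketch below assumes `kl_rev` means KL(student ‖ original) and `kl_fwd` means KL(original ‖ student); the source does not spell out the argument order, and the three-token distributions are invented for illustration.

```python
import math


def kl(a, b):
    """KL(a || b) = sum_i a_i * log(a_i / b_i), skipping zero-mass terms of a."""
    return sum(x * math.log(x / y) for x, y in zip(a, b) if x > 0)


# Toy next-token distributions over three tokens (illustrative values only).
original = [0.70, 0.29, 0.01]  # the last token is rare under the original model
student = [0.50, 0.20, 0.30]   # steering has moved mass onto that rare token

kl_rev = kl(student, original)  # assumed meaning of kl_rev
kl_fwd = kl(original, student)  # assumed meaning of kl_fwd

# Contribution of the rare token alone to each divergence.
rare_rev = student[2] * math.log(student[2] / original[2])
rare_fwd = original[2] * math.log(original[2] / student[2])

print(f"kl_rev={kl_rev:.3f} kl_fwd={kl_fwd:.3f}")
print(f"rare-token term: rev={rare_rev:.3f} fwd={rare_fwd:.3f}")
```

In this toy case the reverse direction is dominated by the term for the rare token, because that term is weighted by the student's probability. The forward direction weights the same token by its small original probability and so barely registers the shift.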
No runs yet. Table appears here once sweep-reg has produced rows.
Answer: pending.