steer-heal-love

mirror of https://github.com/wassname/steer-heal-love.git synced 2026-06-27 16:47:16 +08:00

Files


wassname 48814897ef results: rmse outlier-KL barrier holds coherence over the loop; README + log-incoherence plot

Headline (gemma-3-4b-it s42, care-over-authority): aggregating the kl_rev
barrier by rmse over token positions (not the mean) holds coherence flat at
0.997 across all 8 rounds, where the mean aggregate collapses to 0.62 by r7
(token loops). The mean dilutes the few incoherent positions until the
aggregate falls under the tau gate; rmse is outlier-sensitive and fires on
them. The cost is depth: the rmse run stays leashed to base and the trait
stays shallow. A matched control is still running.
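
A minimal sketch of why the aggregate matters, assuming the barrier is a
hinge on the aggregated per-position KL (penalty = max(0, agg - tau)). The
function names, the tau value, and the KL numbers are hypothetical and chosen
only for illustration; they are not taken from heal.py.

```python
import math


def aggregate_kl(kl_per_token, how="mean"):
    """Collapse per-position KL values into one scalar (hypothetical helper)."""
    n = len(kl_per_token)
    if how == "mean":
        return sum(kl_per_token) / n
    if how == "rmse":
        # Root-mean-square: squaring weights large values more heavily,
        # so a few outlier positions dominate the aggregate.
        return math.sqrt(sum(k * k for k in kl_per_token) / n)
    raise ValueError(f"unknown aggregate: {how}")


def barrier(kl_per_token, tau, how):
    """Assumed hinge form: zero until the aggregate exceeds the tau gate."""
    return max(0.0, aggregate_kl(kl_per_token, how) - tau)


# Illustrative sequence: 97 coherent positions, 3 badly incoherent ones.
kl = [0.01] * 97 + [5.0] * 3
tau = 0.5  # hypothetical gate value

mean_penalty = barrier(kl, tau, "mean")  # mean stays under tau: no penalty
rmse_penalty = barrier(kl, tau, "rmse")  # rmse exceeds tau: barrier fires

print(f"mean agg={aggregate_kl(kl, 'mean'):.4f} penalty={mean_penalty:.4f}")
print(f"rmse agg={aggregate_kl(kl, 'rmse'):.4f} penalty={rmse_penalty:.4f}")
```

In this toy case the three outlier positions are averaged away by the mean
but push the rmse aggregate over the gate, which is the behaviour the
headline describes.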

- plot.py: coherence panel -> log-incoherence (1-coh, log axis, down=coherent);
  map coherence axis matches; red steer kept on the over-pipeline panels only.
- heal.py: fix kl_agg=p95 crash (torch.quantile rejects bf16 -> .float()).
- run.py: persist per-round adapter gens (adapter_gen) for the outputs table.
- config.py: coh_floor early-stop knob.
- README: results table (mean vs rmse), trajectory figure, outputs-over-loop
  appendix (per-round completions as quotes); spec persona corrected to pos-neg.
- docs/reviews: kl_agg review, pool saturation test, care-lens plan.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

2026-06-07 07:53:00 +08:00

brainstorming

setup-repo gap-fill: results ledger + docs structure

2026-06-04 09:51:36 +08:00

evidence

setup-repo gap-fill: results ledger + docs structure

2026-06-04 09:51:36 +08:00

literature

setup-repo gap-fill: results ledger + docs structure

2026-06-04 09:51:36 +08:00

personas

metric fix: auth_nats = diagonal log(p) not raw forced-choice logit