gen_filter_walk: per round, cool a steering multiplier kappa and top up with
extra gen batches until min_train coherent survivors are banked, so the loop
cannot starve on data count (#90/#100 died at the min_train assert). Paired
#101 (walk-C ON) vs #100 (walk-C OFF, identical config): #101 reaches round 9
where #100 asserted at round 5.
Finding (journal h): walk-C removes the starve CRASH, but the real ceiling is
coherence collapse, not data count. The trait over-drives to auth -6.8 while
coh falls 0.99 -> 0.62, and by round 7 the kept completions degenerate into
token loops ("BUILDUTEutive...", "GLUTE GLUTE"). These loops are low-entropy,
so they slip under ppl_tau and rep_tau and train the next adapter on garbage.
The coherent deliverable is the round 1-2 adapter (auth -3.3 to -3.8 at coh
0.99-0.93).
config: lam 1.0->0.3, spectral_lam 0->0.01 (locked from #98/#99 ablation),
plus the walk-C knobs gen_pass_target, gen_kappa_decay, gen_kappa_min and
gen_max_batches.
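A minimal sketch of one walk-C round, assuming the knob semantics their names suggest: kappa is multiplied by gen_kappa_decay after every batch that leaves the bank short, floored at gen_kappa_min, and generation gives up after gen_max_batches. `generate_batch` and `is_coherent` are hypothetical stand-ins for the real generator and filter, and gen_pass_target is left out because its exact role is not stated here.

```python
from typing import Callable, List, Tuple


def walk_round(
    generate_batch: Callable[[float], List[str]],  # hypothetical: completions at a given kappa
    is_coherent: Callable[[str], bool],            # hypothetical: stands in for the ppl/rep filter
    kappa: float,
    min_train: int,
    gen_kappa_decay: float,
    gen_kappa_min: float,
    gen_max_batches: int,
) -> Tuple[List[str], float]:
    """Bank coherent survivors, cooling kappa whenever the bank is still short."""
    banked: List[str] = []
    for _ in range(gen_max_batches):
        banked.extend(c for c in generate_batch(kappa) if is_coherent(c))
        if len(banked) >= min_train:
            break
        # Assumed cooling rule: geometric decay with a floor.
        kappa = max(gen_kappa_min, kappa * gen_kappa_decay)
    return banked, kappa
```

The caller would still need its own min_train check after the loop, since gen_max_batches can run out before enough survivors are banked.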
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
5-model panel (deepseek-v4-pro, grok-4.3, gemini-3.5-flash, qwen3.6:35b).
Two confirmed bugs fixed; design risks recorded in spec.md.
run.py cue: coh_cost is a pure ratio, so a model that collapsed to ~0 mass on
Authority drove dAuth -> -inf and coh_cost -> 0, and a broken model scored
green (gemini). The cue now checks finiteness and an absolute coherence floor
(coh < 0.85 -> red) FIRST, requires coh >= 0.95 for green, and broadens
surgicality to |dAuth| > max(|dCare|, |dFair|) (a Fairness-ward dump was
passing the Care-only check).
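The ordering of the cue checks can be sketched as follows. The 0.85 floor, the 0.95 green bar and the surgicality rule come from the text above; the function name, the `max_coh_cost` threshold and the intermediate "yellow" label are assumptions made only for illustration.

```python
import math


def cue(coh, d_coh, d_auth, d_care, d_fair,
        coh_floor=0.85, coh_green=0.95, max_coh_cost=0.1):
    """Return 'red', 'yellow' or 'green'. max_coh_cost and 'yellow' are assumptions."""
    # 1. Finiteness and the absolute floor come first, before any ratio is formed.
    if not all(math.isfinite(v) for v in (coh, d_coh, d_auth, d_care, d_fair)):
        return "red"
    if coh < coh_floor:
        return "red"
    # 2. Ratio: coherence lost per nat of behaviour change.
    coh_cost = abs(d_coh) / abs(d_auth) if d_auth != 0 else math.inf
    # 3. Surgicality: Authority must move more than either Care or Fairness.
    surgical = abs(d_auth) > max(abs(d_care), abs(d_fair))
    if coh >= coh_green and surgical and coh_cost <= max_coh_cost:
        return "green"
    return "yellow"
```

With the floor checked first, a collapsed model (coh near 0, very large |dAuth|) is red even though its ratio alone would look cheap.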
heal.py: the BPE-boundary prefix assert was bypassed at the max_len/truncation
boundary (grok/gemini/qwen unanimous). Now assert the surviving overlap
min(n_prompt, L) unconditionally, and warn instead of silently skipping a kept
completion truncated to zero target tokens.
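A sketch of the heal.py check under simplifying assumptions: token ids are plain lists, truncation keeps the first max_len tokens, and the function name is hypothetical. It shows the overlap min(n_prompt, L) being asserted on every call, including when truncation cuts into the prompt.

```python
import warnings
from typing import List


def check_prompt_prefix(prompt_ids: List[int], full_ids: List[int], max_len: int) -> int:
    """Return the number of target tokens left after truncation to max_len."""
    kept = full_ids[:max_len]
    n_prompt, L = len(prompt_ids), len(kept)
    overlap = min(n_prompt, L)
    # Checked unconditionally, including when truncation cuts into the prompt.
    assert kept[:overlap] == prompt_ids[:overlap], "prompt is not a token prefix"
    n_target = L - overlap
    if n_target == 0:
        # Surface the degenerate case instead of silently dropping the example.
        warnings.warn("completion truncated to zero target tokens")
    return n_target
```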
Verified false positives (recorded so they aren't re-chased): qwen's
shape[0] "batch-dim" claim (.input_ids[0] already drops batch), the
profile['model'] column (it is the marginal mean-p), the KL reference
(c=0.0 + no baked = pristine round-0).
UAT: fast-dev-run exit 0; cue shows coh=0.00 -> red (floor closes the hole).
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
The trait metric was taking the diagonal of tinymfv's raw pre-softmax BMA
`score` logit (unnormalised), which gave base Authority ~-5 and absurd 8-nat
swings. Those were then compared to steering-lite's 0.5-2 nat reference, which
is a DIFFERENT metric (loading-weighted Delta-logit of binary p(is-wrong)).
Wrong scale, wrong comparison.
Fix: auth_nats = mean log p[authority] on authority-defiance vignettes (the
NORMALIZED choice logprob, the diagonal of the softmax `p`). Base ~log(0.099)
= -2.3, real shifts ~1-3 nats. DRY: evaluate_model now calls foundation_nats.
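A minimal sketch of the normalised metric, assuming one row of per-foundation logits per authority-defiance vignette. It does not call tinymfv or reproduce its BMA step; the function names and the row layout are assumptions.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def auth_nats(vignette_logits: Sequence[Sequence[float]], authority_idx: int) -> float:
    """Mean log p[authority] over authority vignettes (one logit row per vignette)."""
    logps = [log_softmax(row)[authority_idx] for row in vignette_logits]
    return sum(logps) / len(logps)
```

Adding a constant to every logit in a row leaves auth_nats unchanged, which is the property the raw `score` logit lacks.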
Also:
- diag_stages: steer at operating point c=0.5 (c=1 collapses coherence to
  0.05); add coh_cost = |dCoh|/|dAuth| (coherence lost per nat of behaviour)
  to answer "is the adapter a better pareto than raw steering?".
- diag_csweep: drop the bogus 0.5-2 steering-lite anchor; SocialNorms
  co-moving with Authority is expected (both binding foundations), not collapse.
- gitignore out/ and results.tsv (experiment outputs, stale schema).
- personas docs (steering-lite proper-pair rules), spec Plans B/C/D, journal.
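The coh_cost ratio from the diag_stages item can be sketched as below. The helper names are hypothetical, and each (d_coh, d_auth) pair is assumed to be measured against the same base model.

```python
import math


def coh_cost(d_coh: float, d_auth: float) -> float:
    """|dCoh| / |dAuth|: coherence lost per nat of behaviour change."""
    if d_auth == 0:
        return math.inf  # no behaviour change: treat as the worst case
    return abs(d_coh) / abs(d_auth)


def better_pareto(adapter, steering) -> bool:
    """True if the adapter pays less coherence per nat than raw steering.

    Each argument is a (d_coh, d_auth) pair.
    """
    return coh_cost(*adapter) < coh_cost(*steering)
```

Because this is a pure ratio, it only compares runs that stay coherent in absolute terms; it says nothing by itself about whether either model is usable.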
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Setup per setup-repo conventions: uv + justfile + fast-dev-run on
wassname/qwen3-5lyr-tiny-random, package under src/steer_heal (config +
pipeline skeleton). Stages fail fast with NotImplementedError pointing at
the docs/vendor module to port from.
Design in spec.md: distil a steering-lite mean-diff teacher vector (iso-KL
dosed) into a conditioned LoRA, heal incoherence with a KL-rev-to-original
barrier, fold each round via w2schar gated bake, and eval on tinymfv. Three
uncertainty gates (filter / heal / iterate), each with a UAT artifact.
Base model google/gemma-3-1b-it (RTX 3090, 24GB). Reference repos vendored
under docs/vendor (gitignored): steering-lite, isokl, tinymfv, w2schar-mini.
The lighter three are editable path deps; w2schar (py3.13 + flash-attn) is
reference-only, and we copy its adapter/bake/plot modules.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>