journal (i): state-of-the-problem -- loop ceiling is coherence collapse not starvation

Three loop arms (#100 starve-crash r5, #101 walk-C full-10r-but-collapse, #102 round-ramp partial) all lose coherence; the constraints only change how it dies. Reframes the two fix ideas (KL-to-base, coherence-budget) as one hinge relu(KL_base - tau) where tau IS the budget. Open risk: ref=base sees cumulative divergence so later rounds may unlearn earlier trait (the #19 stall); a tau that keeps coherent-trait but rejects token-loop garbage exists only if garbage is farther from base in KL than trait. Next: base-anchor tau bracket #103/#104. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-27 17:02:34 +08:00 · 2026-06-06 12:23:46 +08:00
parent 7120ee4217
commit 026de8fd74
1 changed files with 78 additions and 0 deletions
@@ -909,3 +909,81 @@ baseline (#26, task) -- does the round-1/2 distilled adapter beat just system-pr
 authority" at equal coherence? If not, the whole distill-then-heal loop needs a different justification
 (persistence without a prompt). (3) Consider a barrier_ref=base arm for the loop: it should cap the
 coherence bleed at the cost of trait, testing whether the ceiling is the prev-anchor's fault.
+
+
+## 2026-06-06 (i) -- where we are: the loop's ceiling is COHERENCE COLLAPSE, not starvation, and no prev-anchored constraint we have tried stops it
+
+**Introduction.** This is a state-of-the-problem entry, not a new result. Across three 10-round loop
+attempts the model loses coherence every time; the constraints we added only change HOW it dies. The
+question this entry frames: is the coherence collapse fixable by a constraint at all, and if so which?
+I expected the heal barrier to hold coherence over the loop (entry f said it does for a few rounds);
+instead coherence falls monotonically and the kept training data degenerates into token loops. See
+entries (f) the barrier-earns-its-place loop, (g) the reg ablation, (h) the walk-C run whose per-round
+trajectory this entry summarises.
+
+**Methods.** Commits 7db5a56 + b01faa6 (walk-C, produced #100/#101) and 7120ee4 (lam_round_pow, produced
+#102). gemma-3-4b-it, full preset, seed 42, kl_rev tau=0.5 spectral_lam=0.01 barrier_ref=prev, 10 rounds.
+The three arms differ only in the walk-C dose controller (on/off) and the lam schedule (flat vs
+round-ramped). pueue #100, #101, #102 feed the table.
+
+**Results.**
+
+| pueue | walk-C | lam(round)      | reached      | coh r2 | coh last     | auth_nats last | failure mode |
+|-------|--------|-----------------|--------------|--------|--------------|----------------|--------------|
+| #100  | off    | 0.3 flat        | r4, crash r5 | 0.925  | 0.902 (r4)   | -4.22 (r4)     | starve assert, kept 17 < 30 |
+| #101  | on     | 0.3 flat        | r9 (full)    | 0.925  | 0.623 (r9)   | -6.78 (r9)     | coherence collapse, token loops by r7 |
+| #102  | on     | 0.3*(1+round)^0.5 | r4 (killed) | 0.920  | 0.938 (r4)   | -4.71 (r4)     | partial, tracked #101, no coherence gain |
+
+Table 1. Three loop arms, all kl_rev tau=0.5 spectral_lam=0.01 barrier_ref=prev seed=42; they differ only
+in the walk-C column and the lam schedule. coh = p_ans_any on tinymfv (down = less coherent, base 0.996);
+auth_nats (down = more trait, base -2.354); "reached" = last completed round. #102 was killed at round 4
+to free the GPU for the base-anchor arm, so its last two columns are round-4 partials, not endpoints.
+
+Provenance:
+- #100: out/20260605T150649_gemma-3-4b-it_kl_rev_s42/; pueue 100 log ~/.local/share/pueue/task_logs/100.log.
+  Per-round coh r0-r4 = 0.993, 0.987, 0.925, 0.917, 0.902 (one "round N:" INFO line each). Crash =
+  AssertionError "only 17 kept completions; need >= 30" after the round-5 single 64-batch (kept 50,63,44,
+  39,37 then 17).
+- #101: out/20260605T191544_gemma-3-4b-it_kl_rev_s42/ (trajectory.png, events.jsonl); pueue 101 log. Full
+  per-round table in entry (h) Table 1. coh r5-r9 = 0.904, 0.867, 0.713, 0.618, 0.623; auth r9 = -6.781.
+- #102: out/20260606T071737_gemma-3-4b-it_kl_rev_s42/; pueue 102 log. coh r0-r4 = 0.993, 0.989, 0.920,
+  0.903, 0.938; auth r4 = -4.71; lam_eff logged per round = 0.300, 0.424, 0.520, 0.600, 0.671 (= 0.3 *
+  (1+round)^0.5). Killed at round 4 (pueue kill 102).
+- Kept-text degeneration (the failure mechanism), #101 events.jsonl gen records: round 0 alpha 0.5 kept =
+  "Okay, this is a huge ethical dilemma... a resounding no. I would refuse to lie..."; round 7 alpha 1.0
+  kept = "your your your into your of your..."; round 8 alpha 0.5 kept = "of course, their GREUEUTEGLUE
+  GLUTE GLUTE BUILDUTEutive..."; round 9 alpha 1.0 kept = "of those that their GLUTEUTEutive INGutive
+  bigger...". These passed the filter at ppl 2-17 and rep 0.04-0.27 (both under the gates).
+
+All three arms start coherent (coh 0.99 at round 0-1) and lose it: #100 dies at the data-count assert in
+round 5, #101 runs to round 9 but coh falls 0.993 to 0.623 and the kept completions are token loops from
+round 7, and #102's round-ramped barrier tracked #101's coherence through round 4 (0.920 vs 0.925 at
+round 2) with no gain. Trait keeps moving the whole way (auth -2.71 to -6.78 in #101), so the loss is
+coherence, not trait.
+
+**Discussion (speculative).** My read: there are two leaks and the prev-anchored barrier only touches one.
+Leak 1 is divergence freedom, the adapter is free to move away from coherent, and a barrier can clamp it.
+Leak 2 is data contamination, the SFT target itself degenerates because the gen step steers an
+increasingly broken baked adapter and the filter cannot catch the result (the loops are low-perplexity
+and low-repetition, so both gates miss them). The two are coupled: the data degenerates BECAUSE the
+adapter does, so fixing leak 1 properly might prevent leak 2 from ever arising. But "properly" requires
+anchoring the KL barrier to a COHERENT reference, and barrier_ref=prev anchors to the previous student,
+which is already drifting, so it only limits each round's NEW divergence and never pulls back toward
+coherence. This reframes the two candidate fixes (KL-to-base, and a constraint proportional to the
+coherence budget) as a single mechanism: a hinge relu(KL_base - tau) where tau IS the coherence budget in
+nats, off while under budget (trait free) and growing once overspent (one-sided, so it holds at the
+budget without reverting trait to buy coherence back). The open risk, which is the whole question: with
+ref=base the barrier sees the CUMULATIVE divergence of the baked stack from base, and only the current
+round has gradients, so if history already exceeds tau the current round can only satisfy the barrier by
+unlearning earlier trait (the entry-19 ref=base stall). A tau that keeps coherent-trait but rejects the
+loops exists only if the loops are farther from base in KL than coherent-trait is; plausible (loops are
+degenerate distributions) but not guaranteed. The alternative hypothesis I cannot yet rule out: trait and
+incoherence are at similar KL-distance from base, no tau separates them, and the coherent-trait ceiling at
+round ~2 (auth -3.8, coh 0.92) simply IS the limit for this trait, in which case the deliverable is the
+round-2 adapter and the comparison that matters is whether it beats prompting (task #26).
+
+**Next.** (1) base-anchor tau bracket: barrier_ref=base, spectral off, ramp off, lam 0.3, sweep tau to
+find the budget where coherence holds and trait still moves (queued, see TaskList #41). (2) If no tau
+holds both, accept the round-2 adapter as the deliverable and run the prompting baseline (task #26).
+(3) Whichever way, the filter needs a gate that catches low-ppl low-rep token loops, since it currently
+trains on them.