5 Commits

Author SHA1 Message Date
wassname 5ce8a00547 qlora+bs=4 batched heal, walk-C bisection, round-loosened barrier
- QLoRA (4-bit NF4) base frees ~6GB -> train_bs=4 + grad_accum=4
  (block/Linear-level hooks survive bnb Linear4bit: add to dequantized
  output, same pattern as peft randlora/bnb.py)
- walk-C: log-kappa bisection dose controller, ~5 probes of 8 gens to
  highest kappa with >=75% filter survival, then collect to n_keep
- filter: char-level n-gram rep (catches TTTT/!!!! loops), ppl over the
  tail 25% of completion (steering collapses mid-completion)
- lam_round_pow<0 loosens the KL-to-base barrier with round
  (lam_eff=lam/sqrt(1+N)): only the cumulative-vs-fixed-anchor barrier
  self-inflates with round; per-increment spectral_lam + weight_decay
  stay flat
- alphas capped at 1.0, gen_pass_target 0.75
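
A minimal sketch of the walk-C dose controller described above, not the repo's code. It assumes filter survival falls monotonically as kappa rises and that `kappa_lo` is a safe lower bracket. `survival` is a hypothetical callable that generates a small probe batch at a given kappa and returns the fraction passing the filter; `kappa_lo`, `kappa_hi` and `n_probes` are illustrative names.

```python
import math
from typing import Callable


def bisect_log_kappa(
    survival: Callable[[float], float],  # hypothetical: kappa -> fraction of probe gens kept
    kappa_lo: float,
    kappa_hi: float,
    target: float = 0.75,
    n_probes: int = 5,
) -> float:
    """Return the largest probed kappa whose survival rate met the target."""
    lo, hi = math.log(kappa_lo), math.log(kappa_hi)
    best = kappa_lo  # assumption: the lower bracket is known to be safe
    for _ in range(n_probes):
        mid = 0.5 * (lo + hi)  # midpoint in log space = geometric mean of the bracket
        kappa = math.exp(mid)
        if survival(kappa) >= target:
            best = kappa  # dose is tolerated, try a stronger one
            lo = mid
        else:
            hi = mid  # too many generations filtered out, back off
    return best
```

Bisecting in log space makes each probe shrink the bracket by a constant ratio, which suits a dose that spans orders of magnitude. With only 8 gens per probe the survival estimate is noisy, so the result is a rough dose rather than a sharp threshold.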

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-09 10:42:01 +08:00
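
The round-loosened barrier in 5ce8a00547 can be sketched as a power-law schedule. The commit only states the case lam_eff=lam/sqrt(1+N); the general form `lam * (1 + N) ** lam_round_pow` is an assumption that reproduces it at `lam_round_pow = -0.5`. The function name is hypothetical.

```python
def effective_lambda(lam: float, round_idx: int, lam_round_pow: float = -0.5) -> float:
    """Scale the KL-to-base barrier weight by (1 + round) ** lam_round_pow.

    Assumed form: a negative power loosens the barrier as rounds accumulate,
    and a power of 0 keeps it flat.
    """
    return lam * (1.0 + round_idx) ** lam_round_pow


if __name__ == "__main__":
    # Barrier weight over the first four rounds at the default power.
    for n in range(4):
        print(n, effective_lambda(1.0, n))
```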
wassname 933ce38b0b trajectory plot (steer/heal zigzag + trait-coherence pareto) + barrier-vs-nll gradient pressure log
- plot.py write_trajectory: auth zigzag (steer red / heal green) over the pipeline,
  coherence panel below sharing x, and a trait(x)-vs-coherence(y) pareto map with
  separate steer/heal trajectories from base. PNG via kaleido + interactive html.
  Fixed coherence axes to [0.83,1.01] so ~0.001 noise does not fill the panel.
- run.py: build a stages list carrying full eval dicts; derive the stage table from
  it; persist the steered eval to events.jsonl; render trajectory at end of run.
- heal.py: log g_bar/g_nll = ||grad barrier|| / ||grad sft|| at each logged step.
  >>1 = barrier over-tight (undoing trait); 0 = inert.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-04 17:21:10 +08:00
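
A sketch of the g_bar/g_nll pressure ratio logged by heal.py in 933ce38b0b, assuming it is the global L2 norm of the barrier gradient over that of the SFT gradient. Plain lists of floats stand in for per-parameter gradient tensors; the function names and the `eps` guard are hypothetical.

```python
import math
from typing import Iterable, Sequence


def grad_norm(grads: Iterable[Sequence[float]]) -> float:
    """Global L2 norm over all per-parameter gradients."""
    return math.sqrt(sum(g * g for tensor in grads for g in tensor))


def barrier_pressure(
    barrier_grads: Iterable[Sequence[float]],
    sft_grads: Iterable[Sequence[float]],
    eps: float = 1e-12,  # assumption: guards against a zero SFT gradient
) -> float:
    """Ratio ||grad barrier|| / ||grad sft||; >>1 is over-tight, 0 is inert."""
    return grad_norm(barrier_grads) / (grad_norm(sft_grads) + eps)
```

Getting the two gradients separately means differentiating each loss term on its own before summing them, so logging this only at logged steps keeps the extra cost small.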
wassname 4094a295b2 readme 2026-06-04 10:05:38 +08:00
wassname 940a3742c5 scaffold steer_heal: spec, repo infra, vendored deps
Setup per setup-repo conventions: uv + justfile + fast-dev-run on
wassname/qwen3-5lyr-tiny-random, package under src/steer_heal (config +
pipeline skeleton). Stages fail fast with NotImplementedError pointing at
the docs/vendor module to port from.

Design in spec.md: distil a steering-lite mean-diff teacher vector (iso-KL
dosed) into a conditioned LoRA, heal incoherency with a KL-rev-to-original
barrier, fold each round via w2schar gated bake, eval on tinymfv. Three
uncertainty gates (filter / heal / iterate) each with a UAT artifact.
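
A minimal sketch of a mean-diff teacher vector, assuming "mean-diff" means the mean activation on trait-positive prompts minus the mean on trait-negative prompts at one layer. This is not steering-lite's API; the function name and the list-of-lists activation format are hypothetical, and iso-KL dosing is not shown.

```python
from typing import Sequence


def mean_diff_vector(
    pos: Sequence[Sequence[float]],  # hypothetical: one activation row per positive prompt
    neg: Sequence[Sequence[float]],  # hypothetical: one activation row per negative prompt
) -> list[float]:
    """Difference of per-dimension means: mean(pos) - mean(neg)."""
    dim = len(pos[0])
    mean_pos = [sum(row[i] for row in pos) / len(pos) for i in range(dim)]
    mean_neg = [sum(row[i] for row in neg) / len(neg) for i in range(dim)]
    return [p - n for p, n in zip(mean_pos, mean_neg)]
```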

Base model google/gemma-3-1b-it (RTX 3090, 24GB). Reference repos vendored
under docs/vendor (gitignored): steering-lite, isokl, tinymfv, w2schar-mini.
The lighter three are editable path deps; w2schar (py3.13 + flash-attn) is
reference-only, we copy its adapter/bake/plot modules.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-04 09:49:31 +08:00
wassname b98535066a spec done 2026-06-04 09:42:27 +08:00