qlora+bs=4 batched heal, walk-C bisection, round-loosened barrier

- QLoRA (4-bit NF4) base frees ~6GB -> train_bs=4 + grad_accum=4
  (block/Linear-level hooks survive bnb Linear4bit: add to dequantized
  output, same pattern as peft randlora/bnb.py)
- walk-C: log-kappa bisection dose controller; ~5 probes of 8 gens each
  find the highest kappa with >=75% filter survival, then collect to
  n_keep
- filter: char-level n-gram rep (catches TTTT/!!!! loops), ppl over the
  tail 25% of completion (steering collapses mid-completion)
- lam_round_pow<0 loosens the KL-to-base barrier as rounds accumulate
  (lam_eff=lam/sqrt(1+N) at round N): only this barrier, measured
  cumulatively against the fixed anchor, self-inflates with round; the
  per-increment spectral_lam + weight_decay stay flat
- alphas capped at 1.0, gen_pass_target 0.75
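
Sketch of the walk-C bullet above, assuming filter survival is roughly
non-increasing in kappa. `bisect_log_kappa` and `survival_at` are
hypothetical names, not taken from the repo; `survival_at(kappa)` stands in
for "generate 8 completions at this dose and return the fraction that pass
the filter".

```python
import math
from typing import Callable


def bisect_log_kappa(
    survival_at: Callable[[float], float],  # hypothetical probe: kappa -> pass fraction
    kappa_lo: float,
    kappa_hi: float,
    target: float = 0.75,
    n_probes: int = 5,
) -> float:
    """Bisect in log(kappa) for the highest dose that still meets `target`.

    Returns the largest probed kappa that passed, or kappa_lo if none did.
    """
    lo, hi = math.log(kappa_lo), math.log(kappa_hi)
    for _ in range(n_probes):
        mid = 0.5 * (lo + hi)
        if survival_at(math.exp(mid)) >= target:
            lo = mid  # enough survivors: try a stronger dose
        else:
            hi = mid  # too many rejects: back off
    return math.exp(lo)
```

Bisecting in log space halves the ratio kappa_hi/kappa_lo on each probe, so
a handful of probes covers a range spanning orders of magnitude.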
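
Sketch of the filter bullet above. The function names, the n-gram size and
both thresholds are assumptions made for illustration; the commit only
states that repetition is measured on character n-grams and that perplexity
is taken over the last 25% of the completion.

```python
import math


def char_ngram_rep(text: str, n: int = 4) -> float:
    """Fraction of character n-grams that duplicate another one (0.0 = no repeats)."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)


def tail_ppl(token_logprobs: list[float], tail_frac: float = 0.25) -> float:
    """Perplexity over the final `tail_frac` of the completion's token log-probs."""
    if not token_logprobs:
        raise ValueError("need at least one token log-prob")
    k = max(1, math.ceil(len(token_logprobs) * tail_frac))
    tail = token_logprobs[-k:]
    return math.exp(-sum(tail) / len(tail))


def keep(text: str, token_logprobs: list[float],
         max_rep: float = 0.5, max_ppl: float = 50.0) -> bool:
    """Hypothetical thresholds: reject loops like TTTT/!!!! and late collapse."""
    return char_ngram_rep(text) <= max_rep and tail_ppl(token_logprobs) <= max_ppl
```

Scoring only the tail keeps a fluent opening from averaging away a collapse
that starts partway through the completion.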
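
Sketch of the round-loosened barrier weight. It assumes lam_round_pow acts
as an exponent on (1+N), so that -0.5 reproduces the lam/sqrt(1+N) quoted
above; the function name and signature are hypothetical.

```python
def lam_eff(lam: float, round_idx: int, lam_round_pow: float = -0.5) -> float:
    """Barrier weight at round N, assuming lam * (1 + N) ** lam_round_pow."""
    return lam * (1.0 + round_idx) ** lam_round_pow
```

Under this assumption lam_round_pow=0 recovers a flat barrier, and round 3
with the default exponent halves the weight.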

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
wassname
2026-06-09 10:42:01 +08:00
parent 18f9127fbf
commit 5ce8a00547
10 changed files with 398 additions and 131 deletions
@@ -27,6 +27,7 @@ dependencies = [
 "tiny-mfv",
 "srsly>=2.5.3",
 "kaleido>=1.3.0",
+"bitsandbytes>=0.49.2",
 ]
 [tool.uv.sources]