qlora+bs=4 batched heal, walk-C bisection, round-loosened barrier

mirror of https://github.com/wassname/steer-heal-love.git synced 2026-06-27 16:47:16 +08:00

- QLoRA (4-bit NF4) base frees ~6GB -> train_bs=4 + grad_accum=4
  (block/Linear-level hooks survive bnb Linear4bit: add to dequantized
  output, same pattern as peft randlora/bnb.py)
- walk-C: log-kappa bisection dose controller; ~5 probes of 8 gens each
  find the highest kappa with >=75% filter survival, then collect at
  that kappa until n_keep
- filter: char-level n-gram repetition (catches TTTT/!!!! loops) and
  ppl over the last 25% of the completion (steering collapses
  mid-completion, so the tail is where it shows)
- lam_round_pow<0 loosens the KL-to-base barrier as rounds accumulate
  (lam_eff=lam/sqrt(1+N) at round N): only this barrier is cumulative
  against a fixed anchor, so only it self-inflates with round; the
  per-increment spectral_lam and weight_decay stay flat
- alphas capped at 1.0, gen_pass_target 0.75
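
The walk-C bullet above can be sketched roughly as follows. This is a minimal illustration, not the code in this commit: `bisect_log_kappa`, `survival_at`, `kappa_lo` and `kappa_hi` are hypothetical names, and it assumes filter survival is roughly non-increasing in kappa and that the lower bound already passes. In practice `survival_at` would generate 8 completions at the given kappa and return the fraction that pass the filter; here a toy step function stands in for it.

```python
import math
from typing import Callable


def bisect_log_kappa(
    survival_at: Callable[[float], float],
    kappa_lo: float,
    kappa_hi: float,
    target: float = 0.75,
    n_probes: int = 5,
) -> float:
    """Return the largest probed kappa whose survival rate meets `target`.

    Bisects in log(kappa), so each probe halves the remaining ratio
    kappa_hi / kappa_lo rather than the remaining difference.
    Assumption: kappa_lo itself passes, so it is a safe fallback.
    """
    lo, hi = math.log(kappa_lo), math.log(kappa_hi)
    best = kappa_lo
    for _ in range(n_probes):
        mid = 0.5 * (lo + hi)
        kappa = math.exp(mid)
        if survival_at(kappa) >= target:
            best = kappa  # passes: remember it and search higher doses
            lo = mid
        else:
            hi = mid  # too strong: search lower doses
    return best


def toy_survival(kappa: float) -> float:
    """Stand-in for 'generate 8 completions and filter them'."""
    return 1.0 if kappa <= 2.0 else 0.0


best_kappa = bisect_log_kappa(toy_survival, kappa_lo=0.1, kappa_hi=10.0)
```

Bisecting in log space suits a dose that spans orders of magnitude: five probes shrink a 100x range to roughly a 1.15x range.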

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
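
The two filter signals named in the message (character-level n-gram repetition and perplexity over the last 25% of the completion) might look something like the sketch below. The function names, the n-gram size of 4 and the exact scoring formulas are assumptions made for illustration; the commit message does not specify them, and any pass/fail thresholds would need to be tuned separately.

```python
import math
from typing import Sequence


def char_ngram_rep(text: str, n: int = 4) -> float:
    """Fraction of character n-grams that duplicate another n-gram.

    0.0 means every n-gram is unique; values near 1.0 indicate loops
    such as 'TTTT...' or '!!!!...'.
    """
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    return 1.0 - len(set(grams)) / len(grams)


def tail_ppl(token_logprobs: Sequence[float], frac: float = 0.25) -> float:
    """Perplexity over the last `frac` of the completion's tokens.

    `token_logprobs` are natural-log probabilities, one per token.
    At least one token is always included in the tail.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    n_tail = max(1, math.ceil(len(token_logprobs) * frac))
    tail = token_logprobs[-n_tail:]
    return math.exp(-sum(tail) / len(tail))


loop_score = char_ngram_rep("T" * 12)
clean_score = char_ngram_rep("The quick brown fox jumps over the lazy dog.")
collapsed_tail = tail_ppl([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, -2.0, -2.0])
```

Scoring only the tail keeps a fluent opening from averaging away a collapse that starts partway through the completion.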

wassname

2026-06-09 10:42:01 +08:00

parent 18f9127fbf

commit 5ce8a00547

10 changed files with 398 additions and 131 deletions

									
pyproject.toml  +1
@@ -27,6 +27,7 @@ dependencies = [
    "tiny-mfv",
    "srsly>=2.5.3",
    "kaleido>=1.3.0",
    "bitsandbytes>=0.49.2",
]

[tool.uv.sources]