mirror of
https://github.com/wassname/steer-heal-love.git
synced 2026-06-27 16:47:16 +08:00
qlora+bs=4 batched heal, walk-C bisection, round-loosened barrier
- QLoRA (4-bit NF4) base frees ~6GB -> train_bs=4 + grad_accum=4
  (block/Linear-level hooks survive bnb Linear4bit: add to dequantized
  output, same pattern as peft randlora/bnb.py)
- walk-C: log-kappa bisection dose controller, ~5 probes of 8 gens to
  highest kappa with >=75% filter survival, then collect to n_keep
- filter: char-level n-gram rep (catches TTTT/!!!! loops), ppl over the
  tail 25% of completion (steering collapses mid-completion)
- lam_round_pow<0 loosens the KL-to-base barrier with round
  (lam_eff=lam/sqrt(1+N)): only the cumulative-vs-fixed-anchor barrier
  self-inflates with round; per-increment spectral_lam + weight_decay
  stay flat
- alphas capped at 1.0, gen_pass_target 0.75

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
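
A rough sketch of the two numeric pieces described above: the round-loosened barrier weight and the log-kappa bisection. It is not the repository's code. The function names, the reading of lam_round_pow as an exponent on (1+N) (so -0.5 reproduces lam/sqrt(1+N)), and the assumption that filter survival falls monotonically as kappa rises are all illustrative guesses.

```python
import math
from typing import Callable, Optional


def effective_lambda(lam: float, round_idx: int, lam_round_pow: float = -0.5) -> float:
    """Barrier weight for round N (hypothetical helper).

    Assumes lam_eff = lam * (1 + N) ** lam_round_pow, which equals
    lam / sqrt(1 + N) when lam_round_pow = -0.5.
    """
    return lam * (1.0 + round_idx) ** lam_round_pow


def bisect_log_kappa(
    survival_at: Callable[[float], float],
    kappa_lo: float,
    kappa_hi: float,
    target: float = 0.75,
    n_probes: int = 5,
) -> Optional[float]:
    """Search in log-kappa for the largest dose that still passes the filter.

    survival_at(kappa) is a hypothetical callback that generates a small
    batch at that dose and returns the fraction surviving the filter.
    Returns None if no probed kappa reaches the target.
    """
    lo, hi = math.log(kappa_lo), math.log(kappa_hi)
    best = None
    for _ in range(n_probes):
        mid = 0.5 * (lo + hi)  # midpoint in log space = geometric mean of kappas
        kappa = math.exp(mid)
        if survival_at(kappa) >= target:
            best, lo = kappa, mid  # passed: remember it and push the dose higher
        else:
            hi = mid  # failed: back the dose off
    return best
```

With a toy survival curve such as `lambda k: 1.0 if k <= 2.0 else 0.0` on the range [0.01, 100], five probes settle on a kappa just under 2. The 75% target and the probe count are the values quoted in the message above.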
@@ -27,6 +27,7 @@ dependencies = [
 "tiny-mfv",
 "srsly>=2.5.3",
 "kaleido>=1.3.0",
+"bitsandbytes>=0.49.2",
 ]

 [tool.uv.sources]