mirror of
https://github.com/wassname/steer-heal-love.git
synced 2026-06-27 16:47:16 +08:00
narrow steer band, assert >=20 train, training table, full gen dumps
Root cause found via diag_axis on 4B: raw mean-diff steered across the 7-layer band (0.4-0.6) at coeff=1 DESTROYS gemma-3-4b (coherence 1.00->0.02). That starved the filter to 2 kept completions, so the "adapter" was ~untrained (2 examples) = base behaviour, my Q1 "promising" read was not validated.

Fixes:
- separate steer_layers (narrow 0.45-0.55) for the vector from layer_range (broad 0.0-1.0) for the LoRA; they were wrongly coupled
- lower alpha sweep (0.25,0.5,1,2); n_prompts=16
- assert len(kept) >= min_train(20); TINY=2. Don't train on starved data.
- heal training table (loguru+tqdm per token-efficient-logging): step, nll, kl, loss, gnorm + SHOULD
- full untruncated steer + adapter generation dumps with prompt and coherence(p_ans_any) inline so we can judge coherence/trait ourselves

NOT yet run with fixes on 4B. Base 4B is Care=0.92 (already aligned) -> the prompting-baseline confound (Q7) is now the critical check.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
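
Illustrative note: the two central fixes in the message (a steering band decoupled from the LoRA layer range, and a hard floor on kept training examples) can be sketched roughly as below. This is not the repository's code. The helper names band_to_layers and require_min_train are hypothetical, and both the fraction-to-index rule (i / n_layers) and the 34-layer depth are assumptions, chosen only because they reproduce the 7-layer band quoted in the message.

```python
def band_to_layers(n_layers: int, lo: float, hi: float) -> list[int]:
    """Map a fractional depth band [lo, hi] to layer indices.

    Assumption: layer i sits at relative depth i / n_layers. The real
    project may use a different convention.
    """
    return [i for i in range(n_layers) if lo <= i / n_layers <= hi]


def require_min_train(kept: list, min_train: int = 20) -> list:
    """Refuse to train when the coherence filter kept too few completions."""
    if len(kept) < min_train:
        raise ValueError(
            f"only {len(kept)} kept completions, need >= {min_train}; "
            "not training on starved data"
        )
    return kept


N_LAYERS = 34  # assumed depth, for illustration only

steer_layers = band_to_layers(N_LAYERS, 0.45, 0.55)  # narrow: steering vector
lora_layers = band_to_layers(N_LAYERS, 0.0, 1.0)     # broad: LoRA adapter
old_band = band_to_layers(N_LAYERS, 0.4, 0.6)        # the band that broke coherence

print(len(old_band), steer_layers, len(lora_layers))
```

Under these assumptions the old band covers 7 layers and the narrow band only 3, while the LoRA still spans every layer. Calling require_min_train on the 2 completions that survived the original filter would raise before any training step, instead of silently producing a near-untrained adapter.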
@@ -16,8 +16,8 @@ sys.path.insert(0, "src")
 from steer_heal.config import RunConfig  # noqa: E402
 from steer_heal.steering import teacher_vec  # noqa: E402
 
-MODEL = "google/gemma-3-1b-it"
-cfg = RunConfig(model=MODEL, n_prompts=12)
+cfg = RunConfig(n_prompts=12)  # default model (gemma-3-4b-it)
+MODEL = cfg.model
 
 tok = AutoTokenizer.from_pretrained(MODEL)
 if tok.pad_token is None: