Revert ablate lora_c warm-start: variance-PC seed didn't help on SFT

Job 94 result (Qwen3.5-0.8B, GSM8K, 2500 steps, single seed):
  warm-start (top-k S-space output-variance PC):  test 55.6 / valid 64.0, init 33.2s
  random-init (prior default):                    test 56.0 / valid 68.0, init  2.2s

Equal-or-worse accuracy (within single-seed noise) for +31s of calibration init.
The optimal ablation direction is loss-defined, not variance-defined, so seeding
lora_c from the data-variance PC buys nothing here. Reverts fe562c2; ablate is
back to the cheap random-init default. cov_orient (CorDA re-orient) path kept.
The FIXME's actual proposal -- a *contrastive* dS seed -- stays open but needs
pos/neg pairs this SFT benchmark lacks (only relevant for labelled steering).

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
wassname
2026-06-17 20:18:41 +08:00
parent 458c3861e8
commit 09dcfe0d41
2 changed files with 40 additions and 59 deletions
+3 -3
@@ -604,9 +604,9 @@ def run(args: BenchmarkConfig) -> dict[str, Any]:
     # downstream task (IPM mode, per CorDA). eva needs only a few batches for its init;
     # corda/asvd/cov-orient estimate an input second moment, so we hand them many more
     # batches (PEFT calibrates on a few hundred sequences) for a well-conditioned basis.
-    # antipasto_ablate always calibrates now: group_init warm-starts lora_c from the
-    # S-space output variance (cov_orient adds the heavier CorDA re-orient on top).
-    needs_calib = args.variant in ("eva", "antipasto_corda", "antipasto_asvd", "antipasto_ablate")
+    needs_calib = args.variant in ("eva", "antipasto_corda", "antipasto_asvd") or (
+        args.variant == "antipasto_ablate" and args.antipasto_cov_orient
+    )
     init_meter = group_init_meter()  # wall-time + peak CPU RAM of group_init
     if needs_calib:
         n_batches = min(4, len(batches)) if args.variant == "eva" else min(64, len(batches))
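
The gating restored by this hunk can be read in isolation as a small predicate. The sketch below is illustrative only and is not the repository's code: `CalibArgs` is a hypothetical stand-in for the two `BenchmarkConfig` fields the hunk reads, `n_calib_batches` is a hypothetical helper, and returning 0 batches when no calibration is needed is an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass
class CalibArgs:
    """Hypothetical stand-in for the BenchmarkConfig fields used by the hunk."""
    variant: str
    antipasto_cov_orient: bool = False


# Variants that calibrate unconditionally, as listed in the hunk.
ALWAYS_CALIBRATED = ("eva", "antipasto_corda", "antipasto_asvd")


def needs_calib(args: CalibArgs) -> bool:
    # After the revert, antipasto_ablate calibrates only when cov_orient is on;
    # otherwise it falls back to the cheap random init.
    return args.variant in ALWAYS_CALIBRATED or (
        args.variant == "antipasto_ablate" and args.antipasto_cov_orient
    )


def n_calib_batches(args: CalibArgs, n_available: int) -> int:
    # Assumption for this sketch: 0 batches when calibration is skipped.
    if not needs_calib(args):
        return 0
    # eva needs only a few batches; the covariance-based inits get up to 64.
    cap = 4 if args.variant == "eva" else 64
    return min(cap, n_available)


if __name__ == "__main__":
    for a in (
        CalibArgs("antipasto_ablate"),
        CalibArgs("antipasto_ablate", antipasto_cov_orient=True),
        CalibArgs("eva"),
    ):
        print(a.variant, a.antipasto_cov_orient, n_calib_batches(a, 100))
```

Under these assumptions, plain `antipasto_ablate` skips calibration entirely, which is where the init-time saving reported in the commit message comes from.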