lora-lite

mirror of https://github.com/wassname/lora-lite.git synced 2026-06-27 16:30:44 +08:00

Files

T

wassname 9d027752ad variants: replace arrow's dense block with diagonal-plus-low-rank core

antipasto_arrow -> antipasto_dplr. The arrowhead's dense b x b block is the wrong
shape: b^2 params, mixes only the top-b, and sits on the S-scaled coords so its
perturbation is amplified by the largest singular values (block=128 collapsed to
45.7% at the gain's lr). Replace it with LoRA's lesson -- a low-rank core inside
the frozen basis, ADDED to the gain:

    DeltaW = U [diag(S_eff) + coeff * B A] Vh,   A:(k,r) B:(r,k), B=0 at init

The low-rank part mixes the whole top-r subspace for 2*r*k params (k=LoRA's rank),
and being additive (not * diag(S)) it is S-independent -- the amplification edge is
gone by construction. Diagonal gain unchanged; identity at init from B=0 and g=0.

Wired through benchmark (antipasto_lora_rank, run_id __k suffix), justfile, cost_report,
smoke (green, dplr attaches/trains/round-trips). Arrow code removed; its run results
stay on disk for comparison.

Co-Authored-By: Claudypoo <noreply@anthropic.com>

2026-06-15 20:13:15 +08:00

test_metamath_gsm8k_benchmark.py

tyro and benchmark

2026-04-27 06:23:30 +08:00

test_metamath_smoke.py

variants: replace arrow's dense block with diagonal-plus-low-rank core

2026-06-15 20:13:15 +08:00