lora-lite

mirror of https://github.com/wassname/lora-lite.git synced 2026-06-27 17:48:59 +08:00

copilot 185eb29c70 fix v2 review bugs + add EVA, AntiPaSTO

DeLoRA: per-input-channel wnorm buffer (not scalar Parameter), forward
matches peft (x*wnorm @ A.T then per-rank scale (lambda/r)/(An*Bn)).
Smoke: 89.7% loss drop (was 35.8%).
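
The DeLoRA forward described above can be sketched in NumPy as follows. This is a minimal illustration, not the repository's code: the function name `delora_delta`, the shapes (A is r x d_in, B is d_out x r), and the reading of An/Bn as the per-rank norms of A's rows and B's columns are all assumptions.

```python
import numpy as np

def delora_delta(x, A, B, wnorm, lam):
    """Adapter contribution only (hypothetical helper; the base output is added elsewhere)."""
    r = A.shape[0]
    An = np.linalg.norm(A, axis=1)      # (r,) norm of each row of A (assumed meaning of An)
    Bn = np.linalg.norm(B, axis=0)      # (r,) norm of each column of B (assumed meaning of Bn)
    scale = (lam / r) / (An * Bn)       # per-rank scale
    h = (x * wnorm) @ A.T               # weight each input channel, then project to rank r
    return (h * scale) @ B.T            # scale each rank, then project to d_out

rng = np.random.default_rng(0)
d_in, d_out, r, lam = 6, 4, 2, 0.5
x = rng.normal(size=(3, d_in))
A = rng.normal(size=(r, d_in))
B = rng.normal(size=(d_out, r))
W = rng.normal(size=(d_out, d_in))      # stand-in for a frozen base weight
wnorm = np.linalg.norm(W, axis=0)       # one value per input channel, not a scalar
y = delora_delta(x, A, B, wnorm, lam)
```

Because each rank-1 term is normalised to unit Frobenius norm before the lambda/r scaling, the update without the wnorm factor has Frobenius norm at most lambda.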

HRA: symmetric repeated-column init (PEFT-style) instead of zero gate.
Adjacent Householder pairs cancel exactly so R=I at t=0, and U receives
gradient from step 0 (no dead-grad). Even r required.
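
Why the repeated-column init yields R=I can be checked numerically. The sketch below is an assumption-laden illustration (the helper names `paired_init` and `householder_chain` are hypothetical); it relies only on the fact that a Householder reflection is its own inverse.

```python
import numpy as np

def paired_init(d, r, rng):
    """Random columns, each repeated twice: [u0, u0, u1, u1, ...]."""
    if r % 2 != 0:
        raise ValueError("r must be even so every reflection has a partner")
    half = rng.normal(size=(d, r // 2))
    return np.repeat(half, 2, axis=1)

def householder_chain(U):
    """Product of the reflections H(u) = I - 2 u u^T / (u^T u), one per column of U."""
    d, r = U.shape
    R = np.eye(d)
    for i in range(r):
        u = U[:, i:i + 1]                              # keep as a (d, 1) column
        R = R @ (np.eye(d) - 2.0 * (u @ u.T) / (u.T @ u))
    return R

rng = np.random.default_rng(0)
U = paired_init(d=5, r=4, rng=rng)
R0 = householder_chain(U)                              # identity up to rounding
```

Each adjacent pair multiplies to the identity, yet the columns of U are nonzero, which is consistent with U receiving gradient from the first step.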

IA3: split into two variants. ia3 stays output-side (k_proj/v_proj);
new ia3_ff is input-side (down_proj/fc2), matching peft is_feedforward.
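
The difference between the two variants comes down to where the learned vector multiplies. A minimal NumPy sketch, with hypothetical function names and a weight stored as d_out x d_in:

```python
import numpy as np

def ia3_output_side(x, W, l_out):
    """Scale the layer output; l_out has one entry per output feature."""
    return (x @ W.T) * l_out

def ia3_input_side(x, W, l_in):
    """Scale the layer input; l_in has one entry per input feature."""
    return (x * l_in) @ W.T

rng = np.random.default_rng(0)
d_in, d_out = 6, 4
x = rng.normal(size=(3, d_in))
W = rng.normal(size=(d_out, d_in))
base = x @ W.T

# Initialising the vector to ones leaves the layer unchanged in both variants.
y_out = ia3_output_side(x, W, np.ones(d_out))
y_in = ia3_input_side(x, W, np.ones(d_in))
```

The two vectors have different lengths (d_out versus d_in), which is one reason to treat them as separate variants.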

Config: dropout field removed (never honored by any variant).

PiSSA: adapter.save records base-weight fingerprint per target;
adapter.load recomputes init then verifies fingerprint -> fails loud
when reloaded onto a different base.
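
One way such a fingerprint check could work is sketched below. The hashing scheme and the names `fingerprint` and `verify_fingerprints` are assumptions for illustration, not the format adapter.save actually writes.

```python
import hashlib
import numpy as np

def fingerprint(weight):
    """SHA-256 over shape, dtype and raw bytes of a base weight (hypothetical scheme)."""
    w = np.ascontiguousarray(weight)
    h = hashlib.sha256()
    h.update(str((w.shape, w.dtype.str)).encode())
    h.update(w.tobytes())
    return h.hexdigest()

def verify_fingerprints(saved, base_weights):
    """Raise if any target's base weight differs from the one seen at save time."""
    for name, expected in saved.items():
        if fingerprint(base_weights[name]) != expected:
            raise ValueError(f"base weight mismatch for target {name!r}")

rng = np.random.default_rng(0)
base = {"q_proj": rng.normal(size=(4, 4)).astype(np.float32)}
saved = {name: fingerprint(w) for name, w in base.items()}

verify_fingerprints(saved, base)            # same base: passes silently
other = {"q_proj": base["q_proj"] + 1e-3}   # a different base would raise ValueError
```

The check matters for PiSSA in particular because its init is derived from the base weight, so the same adapter tensors mean something different on another base.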

EVA (new): data-driven init via group_init + calibration_data. Top-r
right singular vectors of pooled layer-input activations -> lora_A
(buffer, frozen); only lora_B trains. Stress-tests group_init API.
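
The EVA-style init can be illustrated with a plain SVD over pooled activations. This is a sketch under assumptions: the helper name `eva_init_A` is hypothetical, the activations are pooled into a single n_tokens x d_in matrix, and a one-shot SVD stands in for whatever streaming scheme a real implementation might use.

```python
import numpy as np

def eva_init_A(activations, r):
    """Top-r right singular vectors of the pooled layer inputs, shape (r, d_in)."""
    _, _, Vh = np.linalg.svd(activations, full_matrices=False)
    return Vh[:r]

rng = np.random.default_rng(0)
n_tokens, d_in, d_out, r = 64, 8, 5, 3
X = rng.normal(size=(n_tokens, d_in))   # stand-in for calibration activations

lora_A = eva_init_A(X, r)               # frozen after init
lora_B = np.zeros((d_out, r))           # the only trainable part, zero at t=0
delta_W = lora_B @ lora_A               # zero update, so the layer is unchanged at t=0
```

The rows of lora_A are orthonormal and span the input directions with the most variance in the calibration data.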

AntiPaSTO (new): SVD steering with frozen U,S,Vh,W_res and learnable
delta_s (per-singular-value bias) + rot_T (block-diagonal Cayley
rotation on V or U). Lite port of antipasto3 SVD adapter.
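
A rough NumPy sketch of this parametrisation follows. How delta_s and rot_T combine with the frozen factors is an assumption here (the rotation is applied on the V side only), and all helper names are hypothetical; the Cayley transform itself is standard.

```python
import numpy as np

def cayley(T):
    """Map an unconstrained square matrix to a rotation via its skew-symmetric part."""
    K = T - T.T
    I = np.eye(T.shape[0])
    return np.linalg.solve(I - K, I + K)   # (I - K)^-1 (I + K); identity when T = 0

def block_diag_rotation(rot_T):
    """rot_T has shape (n_blocks, b, b); returns an (n_blocks*b) square rotation."""
    n_blocks, b, _ = rot_T.shape
    R = np.zeros((n_blocks * b, n_blocks * b))
    for i, T in enumerate(rot_T):
        R[i * b:(i + 1) * b, i * b:(i + 1) * b] = cayley(T)
    return R

def steered_weight(U, S, Vh, W_res, delta_s, rot_T):
    """Assumed form: U diag(S + delta_s) (R Vh) + W_res."""
    return U @ np.diag(S + delta_s) @ (block_diag_rotation(rot_T) @ Vh) + W_res

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 6))
r = 4
U_full, S_full, Vh_full = np.linalg.svd(W, full_matrices=False)
U, S, Vh = U_full[:, :r], S_full[:r], Vh_full[:r]
W_res = W - U @ np.diag(S) @ Vh            # frozen residual outside the top-r subspace

W0 = steered_weight(U, S, Vh, W_res, np.zeros(r), np.zeros((2, 2, 2)))
R_rand = block_diag_rotation(rng.normal(size=(2, 2, 2)))
```

With delta_s and rot_T at zero the original weight is reproduced, and the block-diagonal structure keeps the rotation's parameter count linear in r for a fixed block size.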

ParamSpec: as_buffer field + make_tensor() for buffer registration.
adapter.attach honors as_buffer with register_buffer; detach cleans
both _parameters and _buffers.

Smoke covers all 8 variants: identity at t=0, save/load round-trip,
gradient-driven loss drop. EVA gets dedicated test for calibration
data path. ALL PASS including bnb 4/8-bit path.

2026-04-26 19:41:59 +08:00

smoke.py           fix v2 review bugs + add EVA, AntiPaSTO    2026-04-26 19:41:59 +08:00
test_lora_lite.py  feat(hra): add Householder Reflection Adaptation, hook-only/bnb-friendly + Qwen proof    2026-04-26 17:58:56 +08:00