Files
lora-lite/docs/audit/SUMMARY.md
T
wassname d0b4c52740 External review: per-variant audit + design notes
- Two acpx external reviews (codex + opencode):
  * docs/audit/variants_review.md: per-variant paper-vs-impl audit
  * docs/audit/design_review.md: peft EVA / baukit / antipasto3 vs lora-lite
  * docs/audit/SUMMARY.md: aggregate verdicts + 3 risks + 5 follow-ups
- docs/refs/: peft_eva.py, peft_eva_finetuning.py, baukit_nethook.py,
  antipasto3_svd_adapter.py for offline reference

Findings: LoRA clean; PiSSA/DoRA/IA3/HRA/DeLoRA have documented partial deviations.
Top risks: init/grad tradeoffs hidden by coarse tests; qwen probe lacks strict
identity tol; IA3 target placement untested.
2026-04-26 19:01:29 +08:00

43 lines
2.8 KiB
Markdown

# External-Review Summary
Two independent reviews via `acpx` external models. Full reviews:
- [docs/audit/variants_review.md](variants_review.md) — per-variant paper-faithfulness audit
- [docs/audit/design_review.md](design_review.md) — peft EVA / baukit / antipasto3 vs lora-lite design
## Per-variant verdict
| variant | match | bugs found | confidence |
|---|---|---|---|
| lora | Y | none material | High |
| pissa | Partial | bf16/Qwen init err 0.31; deviation `alpha==r` only in inline comment; residual not in saved adapter | Medium |
| dora | Y | possible denominator-gradient mismatch with paper's "cost-saving" variant | High |
| ia3 | Partial | targets q/v not paper's k/v/ffn-down; deviation documented but not tested | Medium |
| hra | Partial | gate=0 init -> dU/dx=0 first step (lora_U dead); not orthogonal when gate != 1 | Medium-Low |
| delora | Partial | no Eq.9 frozen-copy init; lambda0=0 -> A/B dead grad; lambda0=0.1 breaks identity | Medium |
## Three biggest risks (reviewer's words)
1. **Initialization vs gradient-flow tradeoffs are hidden by coarse tests.** HRA's `lora_U` and DeLoRA's `A/B` can be initially dead while `grad_nonzero=True` still passes (because *some* lora_* param has nonzero grad).
2. **Qwen probe pass criteria do not enforce paper identity.** PiSSA shows `id_err=0.31`, DeLoRA `id_err=0.72`, but log says PASS.
3. **Target semantics under-tested.** IA3's documented k/v/ffn deviation is never exercised by a positive test.
## Design recommendations
| ref | verdict | impact |
|---|---|---|
| peft EVA | PARTIAL — add `calibrate(model, dataloader, cfg)` (~50 lines) | +50 lines, additive |
| baukit nethook | SKIP — current 5-line hook registration is simpler | 0 |
| antipasto3 SVD | ADOPT concept (learnable delta_s) — no code change now | 0 |
## Recommended follow-up tasks (need user approval before implementing)
A. **Per-param gradient probe**: extend smoke to assert grad on *each* lora_* param at step 0. Catches HRA/DeLoRA init-dead-param bug.
B. **Per-variant identity tolerance in qwen probe**: PiSSA/DeLoRA need a stricter check (or relative tol against `||y_base||`) instead of "passes if id_err < some constant".
C. **IA3 paper-faithful test row**: add one Qwen probe configuration with `target_names=k_proj|v_proj|down_proj` to exercise the documented IA3 placement.
D. **PiSSA equivalence test against `peft.PiSSA`**: same seed + alpha=r, compare `B@A` reconstruction. Adds `peft` to test extras only.
E. **EVA variant**: implement minimal `calibrate()` per design review (~50 lines). Optional, but provides our first data-driven init variant for the user's stated interest.