Mirror of https://github.com/wassname/lora-lite.git, synced 2026-06-27 16:15:50 +08:00
feat(hra): add Householder Reflection Adaptation, hook-only/bnb-friendly + Qwen proof
@@ -76,6 +76,7 @@ Activation-aware variants implement `group_init(model, targets, cfg, calibration
 |---|---|---|
 | IA3 | Done. Output gate `y * g`, identity at `g=1`. | Qwen proof in latest probe. |
 | DoRA | Done for fp layers. Reads dense `weight` to compute `||V||_c`; quantized layers fail fast. | Qwen proof in latest probe. |
+| HRA | Done. Output-side Householder with identity gate; hook-only -> works on bnb. | Qwen proof in latest probe. |
 | SSVD / PiSSA-family | Fits weight-SVD init path. | reconstruction/identity invariant plus train proof. |
-| HRA / OFT / ROAD | Interesting, but weight-transform semantics need clearer hook-only formulation. | pseudocode first, then rotation/non-dead-code invariant. |
+| OFT / ROAD | Block-diagonal rotations; weight-transform semantics need clearer hook-only formulation. | pseudocode first, then rotation/non-dead-code invariant. |
 | S-steer / AntiPaSTO | Should use `group_init` and activation evidence. | calibration consumed, hooks removed, load works without calibration. |
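The new HRA row describes an output-side Householder reflection with an identity gate. A minimal sketch of that idea follows, in plain Python with lists standing in for tensors. The function name `householder_gate`, the scalar gate `g`, and the choice of `g = 0` as the identity setting are assumptions made for illustration; the actual parameterization in `src/lora_lite/variants/hra.py` is not shown in this diff and may differ.

```python
import math


def householder_gate(y, u, g):
    """Return y - g * 2 * u * (u . y) / (u . u).

    Hypothetical helper, not the repository's API.
    g = 0 leaves y untouched (identity); g = 1 is a full Householder
    reflection of y across the hyperplane orthogonal to u.
    """
    uu = sum(ui * ui for ui in u)
    if uu == 0.0:
        # Degenerate direction: there is nothing to reflect across.
        return list(y)
    coef = 2.0 * g * sum(ui * yi for ui, yi in zip(u, y)) / uu
    return [yi - coef * ui for yi, ui in zip(y, u)]


def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))


y = [3.0, 4.0]  # stand-in for one layer output vector
u = [1.0, 0.0]  # stand-in for a learned reflection direction

identity = householder_gate(y, u, g=0.0)   # gate closed: output unchanged
reflected = householder_gate(y, u, g=1.0)  # gate open: first coordinate flips
```

Because the transform only touches the layer output `y`, it never needs the dense weight, which is consistent with the row's claim that a hook-only formulation works on bnb-quantized layers. Note that in this sketch only `g = 0` and `g = 1` preserve the norm of `y`; intermediate gate values are not orthogonal.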
@@ -38,6 +38,7 @@ The core bet is that adapter variants should own the relationship between `(x, l
 | DeLoRA | done | `src/lora_lite/variants/delora.py` |
 | IA3 | done | `src/lora_lite/variants/ia3.py` |
 | DoRA | done, fp-only | `src/lora_lite/variants/dora.py` |
+| HRA | done | `src/lora_lite/variants/hra.py` (output-side Householder, hook-only -> bnb-compatible) |
 | Smoke tests | done | `tests/smoke.py` |
 | bnb minimal forward smoke | done | `Linear8bitLt` and `Linear4bit` pass on CUDA with `just bnb-smoke` |
@@ -116,6 +117,7 @@ Follow-up tasks 80 (lora/pissa/delora/ia3 at 16 steps) and 81 (dora at 16 steps)
 | delora | 2 | 20482 | 0.3281 | 0.3125 | 5.261 | 4.823 | 8.322 | 0.06303 | 15.1 | 0 | `outputs/qwen_train_probe/delora_adapter.pt` |
 | ia3 | 2 | 3072 | 0 | 0.375 | 5.25 | 4.473 | 14.79 | 0.463 | 5.926 | 0 | `outputs/qwen_train_probe/ia3_adapter.pt` |
 | dora | 2 | 23552 | 0 | 0.3203 | 5.25 | 2.439 | 53.54 | 1.776 | 7.44 | 0 | `outputs/qwen_train_probe/dora_adapter.pt` |
+| hra | 2 | 12290 | 0 | 0.3438 | 5.25 | 4.07 | 22.47 | 0.05225 | 4.735 | 0 | `outputs/qwen_train_probe/hra_adapter.pt` |

 Failure-mode interpretation: