lora-lite

mirror of https://github.com/wassname/lora-lite.git synced 2026-06-27 15:15:55 +08:00

Author	SHA1	Message	Date
wassname	28d04f1e1d	gitignore: match loraxs_ review scratch; track curated loraxs_review.md Broaden raw/err patterns to raw/err so prefixed scratch (loraxs_raw.jsonl, loraxs_err.txt) is ignored. Add the GPT-5.5 review of the lora_xs variant as the curated artifact. Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-19 06:04:25 +08:00
wassname	21cc9a84ee	gitignore: external-review scratch (.pi, raw jsonl, err txt) + papers/md Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>	2026-06-17 20:29:30 +08:00
wassname	e624cd244f	feat: near_zero/near_one init for trainable params (breaks bf16 dead-grad symmetry) Trainable params that were init'd at exact 0 or 1 now use near_zero (N(0,1e-4)) or near_one (1 + N(0,1e-4)) to break bf16 symmetry without meaningfully breaking identity-at-t=0. Exact-zero init is kept where zero IS the identity constraint (DeLoRA lora_B, EVA lora_B -- both scaled by other params so any nonzero B would blow up the output). AntiPaSTO: delta_s and rot_T now near_zero. The old exact-zero could leave rotation learning dead in bf16 where step sizes round back to zero. IA3: lora_g now near_one instead of exact ones. Avoids the bf16 spacing issue around 1.0 where eps_bf16 ~ 7.8e-3 and lr=1e-3 updates were rounding away. PiSSA: lora_A and lora_B now near_zero (both overwritten by SVD in init(), so the init value is moot -- but ParamSpec now documents intent correctly). HRA: lora_U now near_zero (overwritten by symmetric init in init()). ParamSpec: added 'near_zero' and 'near_one' init modes. Default changed from 'zeros' to 'near_zero'. Tests relaxed identity tolerances accordingly.	2026-04-27 15:55:05 +08:00
wassname	7eeaeed206	Verify all variants on bnb 4bit/8bit; HRA paper-faithful rewrite - Test all 6 variants against bnb.Linear8bitLt + Linear4bit in smoke - bnb-friendly (LoRA, IA3, HRA, DeLoRA): identity err <= 2.4e-4 - bnb-incompatible (PiSSA, DoRA): fail-loud TypeError as expected - HRA: rewrite to paper-faithful input-side reflections (h <- (I-2vv^T)h), fixing previous broken output-side formulation - IA3: bypass dtype upcast for bnb (params stay fp16/quantized) - DeLoRA: explicit type check rejecting non-nn.Linear (incl. bnb) - adapter: special-case bnb param assignment via .data - Re-verified Qwen0.6B HRA probe: drop=20.7%, id_err=0, reload=0	2026-04-26 18:08:06 +08:00
wassname	f2d9021511	ci: add publishable check workflow	2026-04-26 17:09:47 +08:00
wassname	69bf5f4e44	test: prove adapter training paths	2026-04-26 17:00:39 +08:00
wassname	de97724b65	init	2026-04-26 14:10:18 +08:00
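
Commit 7eeaeed206 describes the HRA rewrite as input-side reflections, h <- (I-2vv^T)h. The following is a minimal pure-Python sketch of that formula only, not the repository's implementation. The names `reflect` and `reflect_chain` are hypothetical, and normalising v inside the function is an assumption made so the formula holds for any nonzero v.

```python
import math


def reflect(h, v):
    """Apply the Householder reflection (I - 2 v v^T / (v^T v)) to vector h.

    Dividing by v^T v makes this equal to (I - 2 u u^T) h for the unit
    vector u = v / ||v||, so v does not need to be pre-normalised.
    """
    norm_sq = sum(x * x for x in v)
    if norm_sq == 0.0:
        raise ValueError("reflection vector must be nonzero")
    scale = 2.0 * sum(a * b for a, b in zip(v, h)) / norm_sq
    return [hi - scale * vi for hi, vi in zip(h, v)]


def reflect_chain(h, vs):
    """Apply several reflections to h in sequence (hypothetical helper)."""
    for v in vs:
        h = reflect(h, v)
    return h


h = [3.0, 4.0]
flipped = reflect(h, [1.0, 0.0])                   # flips the first coordinate
round_trip = reflect_chain(h, [[1.0, 2.0]] * 2)    # same reflection applied twice
```

A reflection preserves the norm of h, and applying the same reflection twice returns h, because (I-2vv^T)^2 = I for unit v. A chain made of repeated pairs therefore starts at the identity; whether the "symmetric init" mentioned in commit e624cd244f works this way is an assumption, not something the log confirms.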

7 Commits
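
Commit e624cd244f notes that around 1.0 the bf16 spacing is about 7.8e-3, so lr=1e-3 updates were rounding away. The sketch below illustrates that rounding effect with a standard-library emulation of bfloat16. It does not use the repository's code: `to_bf16` and `sgd_steps` are hypothetical helpers, the gradient of 1.0 is made up for illustration, and the update is assumed to be computed in higher precision and then stored in bf16.

```python
import struct


def to_bf16(x: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    bfloat16 keeps the top 16 bits of the float32 encoding, so rounding
    is done on the float32 bit pattern. NaN and infinity are not handled.
    """
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits += 0x7FFF + ((bits >> 16) & 1)  # round-to-nearest-even bias
    bits &= 0xFFFF0000                   # drop the low 16 mantissa bits
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def sgd_steps(w: float, lr: float, grad: float, n: int) -> float:
    """Run n plain SGD steps, storing the weight in bf16 after each step."""
    for _ in range(n):
        w = to_bf16(w - lr * grad)
    return w


# Starting at exactly 1.0, each 1e-3 step rounds back to 1.0.
stuck = sgd_steps(1.0, lr=1e-3, grad=1.0, n=100)

# Starting at 0.0, bf16 is much finer, so the same steps accumulate.
moved = sgd_steps(0.0, lr=1e-3, grad=1.0, n=100)

# The next representable bf16 value above 1.0 is 1 + 2**-7.
next_up = to_bf16(1.0 + 2.0 ** -7)
```

In this emulation `stuck` is still exactly 1.0 after 100 steps while `moved` has become negative. That matches the spacing argument in the commit message for values stored near 1.0.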