mirror of
https://github.com/wassname/grpo_proj2.git
synced 2026-06-27 18:59:39 +08:00
409d9c9425
Replace the AntiPaSTO SVD-diag adapter (δW = U·diag(δS)·Vh) with a parametrized LoRA: per Linear a frozen random A (r×d_in, semi-orthonormal) and a trained B (d_out×r), δW = (B+B_hack)·A.

δW stays LINEAR in the trained knob -- the one property the projection needs (a once-extracted V stays a fixed weight-space direction) -- but the whole per-module SVD subsystem is gone (no svd_cached, no SVD_CACHE, no bf16 hash/contiguity friction). B_hack is the SAME shape as B, so the route quarantine is capacity-matched by construction (old-repo route2 diverged from an oversized quarantine sink).

- antipasto: wrap() builds A (fp32 orthogonal_, geqrf has no bf16 -> cast after) + B/B_hack zero-init on the layer device; forward y + (B+B_hack)@(A@x).
- proj: project_one is dim-agnostic; project_all flattens B.grad (d_out·r) and reshapes back. cos_overlap flattens too.
- extract: V from the SVD of stacked B.grad pair-diffs (d_out·r).
- train: B/B_hack rename, lora_rank config, per-step aligned table + legend (replaces the sparse tqdm postfix), clean argv via preset/Config defaults.
- justfile: collapse smoke-vanilla/smoke-route/fast-vanilla/full-vanilla into smoke/fast/full *ARGS + a `sweep` recipe that fires vanilla|erase|route as pueue jobs.
- results.py: glob run_*.log (skip loguru verbose logs).

Smoke (GPU bf16, all three arms) green: cout~0 one_sided identity holds in the LoRA basis, |Bh|=0 for erase/none, route parks into B_hack.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
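
Illustrative note (not part of the commit): the parametrization described in the message can be sketched in a few lines. This is a NumPy stand-in, not the repository's PyTorch code. The names `make_frozen_A`, `LoRALinearSketch` and `project_out` are hypothetical, and the projection shown is a plain orthogonal projection, which may differ from what `project_one` actually does (the message mentions a one-sided variant). It only shows the three properties the message relies on: A has orthonormal rows, zero-initialised B/B_hack leave the layer unchanged, and δW is linear in B, so a direction V over the flattened B (d_out·r) stays a fixed weight-space direction.

```python
import numpy as np


def make_frozen_A(r, d_in, seed=0):
    """Random r x d_in matrix with orthonormal rows (A @ A.T == I_r)."""
    rng = np.random.default_rng(seed)
    # Reduced QR of a d_in x r Gaussian yields orthonormal columns;
    # transposing gives the semi-orthonormal r x d_in matrix.
    q, _ = np.linalg.qr(rng.standard_normal((d_in, r)))
    return q.T


class LoRALinearSketch:
    """Hypothetical stand-in for a wrapped Linear: y = W x + (B + B_hack) A x."""

    def __init__(self, W, r, seed=0):
        d_out, d_in = W.shape
        self.W = W                           # frozen base weight
        self.A = make_frozen_A(r, d_in, seed)  # frozen, never trained
        self.B = np.zeros((d_out, r))        # trained knob, zero-init
        self.B_hack = np.zeros((d_out, r))   # same shape as B (capacity-matched)

    def delta_W(self):
        # Linear in B and B_hack because A is fixed.
        return (self.B + self.B_hack) @ self.A

    def forward(self, x):
        # Apply A first so the intermediate is only r-dimensional.
        return self.W @ x + (self.B + self.B_hack) @ (self.A @ x)


def project_out(grad, V):
    """Remove the components of a B-shaped gradient along the rows of V.

    V has shape (k, d_out * r) with orthonormal rows (an assumption of this
    sketch). The gradient is flattened, projected, then reshaped back.
    """
    g = grad.reshape(-1)
    g = g - V.T @ (V @ g)
    return g.reshape(grad.shape)
```

Because A is frozen, scaling or adding updates to B scales or adds the corresponding δW exactly, which is why a V extracted once from stacked B.grad differences can be reused across steps without re-deriving a per-module basis.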