Broaden the raw*/err* ignore patterns to *raw*/*err* so prefixed scratch
files (loraxs_raw.jsonl, loraxs_err.txt) are ignored too. Add the GPT-5.5
review of the lora_xs variant as the curated artifact.
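A minimal sketch of why the old patterns missed the prefixed files, assuming these are gitignore-style globs applied to bare filenames. It uses Python's fnmatch as a stand-in for git's matcher (the two agree for simple names without slashes); OLD_PATTERNS, NEW_PATTERNS and is_ignored are illustrative names, not part of the repo.

```python
from fnmatch import fnmatchcase

# Patterns before and after this change (taken from the message above).
OLD_PATTERNS = ["raw*", "err*"]
NEW_PATTERNS = ["*raw*", "*err*"]


def is_ignored(name: str, patterns: list[str]) -> bool:
    """True if the bare filename matches any glob pattern (case-sensitive)."""
    return any(fnmatchcase(name, pattern) for pattern in patterns)


# Compare the two pattern sets on a few representative filenames.
for name in ["raw.jsonl", "loraxs_raw.jsonl", "loraxs_err.txt", "README.md"]:
    old = is_ignored(name, OLD_PATTERNS)
    new = is_ignored(name, NEW_PATTERNS)
    print(f"{name:18} old={old} new={new}")
```

The leading * is what lets a prefix such as loraxs_ through. The trade-off is that *err* also matches any name that merely contains "err" (interrupt.py, for example), so it may be worth running git check-ignore -v on tracked paths after a change like this.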
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
- antipasto_rot: add rotate_basis="both" (independent V+U Cayley rotations);
add run_id suffixes __rotU/__rotboth so ablation arms get their own output dirs
- justfile: thread rotate_basis through bench-variant
- corda/eva: padding-mask fix in calibration capture + bf16-tight residual
- README: fill PiSSA/DoRA/CorDA/ASVD/ablate/dplr/rot rows; record the
metric-axis ablation (C=I 56.0 > diag-C 55.6 > full-C 54.7) and the
rotation ablation (V 57.2 > U 56.5 > both 55.6) conclusions
- docs/reviews: external ref-checks + deepseek/gpt reviews of the cores
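For readers unfamiliar with the Cayley parameterization behind rotate_basis, here is a rough NumPy sketch. It is an assumption-laden illustration, not the antipasto_rot code: it uses a dense transform rather than the block form, it assumes the rotations act in the r-dimensional S-space, and rotated_weight, theta_u and theta_v are hypothetical names.

```python
import numpy as np


def cayley(theta: np.ndarray) -> np.ndarray:
    """Map an unconstrained square matrix to a rotation via the Cayley transform."""
    A = theta - theta.T                    # skew-symmetric: A.T == -A
    I = np.eye(A.shape[0])
    # (I + A)^{-1} (I - A); I + A is always invertible for skew-symmetric A.
    return np.linalg.solve(I + A, I - A)


def rotated_weight(U, S, Vh, theta_u, theta_v):
    """Hypothetical 'both' arm: rotate the U side and the V side independently."""
    R_u = cayley(theta_u)
    R_v = cayley(theta_v)
    return (U @ R_u) @ np.diag(S) @ (R_v @ Vh)


rng = np.random.default_rng(0)
W = rng.standard_normal((6, 4))
U, S, Vh = np.linalg.svd(W, full_matrices=False)   # frozen factors
r = S.shape[0]

# Zero parameters give identity rotations, so the original weight is recovered.
zeros = np.zeros((r, r))
W_identity = rotated_weight(U, S, Vh, zeros, zeros)

# Non-zero parameters rotate both bases but leave the singular values alone.
theta_u = 0.1 * rng.standard_normal((r, r))
theta_v = 0.1 * rng.standard_normal((r, r))
W_rot = rotated_weight(U, S, Vh, theta_u, theta_v)
```

In this sketch the V-only and U-only arms correspond to holding theta_u or theta_v at zero, respectively. The linear solve inside cayley has to run whenever the rotation is rebuilt.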
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Replace antipasto's rotation/Cayley with a bounded 1+ELU gain and split the
S-space idea into four interpretable PiSSA-style cores (frozen U/S/Vh, small
trainable core):
- antipasto: S_eff = S*(1+ELU(coeff*g)). Attenuation is exp-bounded (the gain
stays in (0,1] for coeff*g <= 0); amplification is linear (constant gradient,
no runaway). g=0 -> exact identity.
- antipasto_rot: keeps the block-Cayley rotation as a separate variant for
cost comparison (its per-forward solve accounts for the 72 ms vs 36 ms gap).
- antipasto_ablate: contractive (I - a c c^T) diag(S); the rank-one factor has
eigenvalues in [0,1], so it cannot blow up. Optional cov_orient (CorDA) basis.
- antipasto_corda: covariance-oriented oblique projector P = Vh C^{-1/2}, the
data-energy basis rather than the weight-gain basis. 1+ELU gain.
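The two bounded cores above can be sketched in plain Python. This is an illustration under assumptions, not the repo's implementation: coeff is treated as a plain scalar multiplier, c is unit-normalized inside the helper, a is assumed to lie in [0,1], and the function names are hypothetical.

```python
import math


def elu(x: float) -> float:
    """Standard ELU with alpha = 1."""
    return x if x > 0 else math.expm1(x)


def gain(g: float, coeff: float = 1.0) -> float:
    """1 + ELU(coeff*g): equals exp(x) for x <= 0 and 1 + x for x > 0."""
    return 1.0 + elu(coeff * g)


def s_eff(S, g, coeff=1.0):
    """Per-singular-value rescaling S * (1 + ELU(coeff*g))."""
    return [s * gain(gi, coeff) for s, gi in zip(S, g)]


def ablate_apply(c, a, x):
    """Apply (I - a c c^T) to x without forming the matrix.

    c is normalized to unit length here, so for a in [0, 1] the factor
    scales the c direction by (1 - a) and leaves orthogonal directions alone.
    """
    norm = math.sqrt(sum(ci * ci for ci in c))
    u = [ci / norm for ci in c]
    dot = sum(ui * xi for ui, xi in zip(u, x))
    return [xi - a * dot * ui for ui, xi in zip(u, x)]


S = [3.0, 2.0, 1.0]
unchanged = s_eff(S, [0.0, 0.0, 0.0])      # g = 0 reproduces S exactly
damped = s_eff(S, [-50.0, -1.0, 0.0])      # strongly negative g drives gain toward 0
boosted = s_eff(S, [2.0, 0.0, 0.0])        # positive g grows linearly: 3 * (1 + 2)

along_c = ablate_apply([3.0, 4.0], 0.5, [3.0, 4.0])     # shrunk by (1 - a)
orthogonal = ablate_apply([3.0, 4.0], 0.5, [-4.0, 3.0])  # left as-is
```

The gain never goes negative, so a singular value can be attenuated toward zero but not flipped, while the positive branch has slope 1 in coeff*g.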
Add scripts/_cost.py + scripts/cost_report.py: one-row-per-variant cost table
(trainable params, peak GPU mem, fwd/bwd ms, added MACs/tok, group_init ms).
Wire all four into the benchmark, smoke test, and __init__ exports.
External review (DeepSeek-v4-pro, docs/reviews/) verified the math; acted on
its one substantive point (corda g now initializes to zeros for an exact identity).
Co-Authored-By: Claudypoo <noreply@anthropic.com>