From 4e03f9c07f19ca648155771e3dbd3953e019ce88 Mon Sep 17 00:00:00 2001
From: wassname <1103714+wassname@users.noreply.github.com>
Date: Thu, 18 Jun 2026 20:01:59 +0800
Subject: [PATCH] lora_xs: fix docstring -- A=diag(Sr)Vhr has row norms Sr, not
 orthonormal

External review (GPT-5.5) flagged 'two near-orthonormal bases' as inaccurate:
only B = Ur is orthonormal; A = diag(Sr) Vhr folds in the singular values, so
its rows have norms Sr rather than 1.
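
The claim can be checked numerically. The snippet below is a minimal sketch, not
code from this repo or the reference: it uses NumPy on a random matrix, and the
names W, r, b_gram and a_row_norms, along with the shapes, are assumptions made
for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))  # hypothetical stand-in for a frozen weight
r = 8                              # hypothetical adapter rank

# Truncated SVD: keep the top-r singular triplets.
U, S, Vh = np.linalg.svd(W, full_matrices=False)
Ur, Sr, Vhr = U[:, :r], S[:r], Vh[:r, :]

B = Ur                 # raw left singular vectors
A = np.diag(Sr) @ Vhr  # singular values folded into A

b_gram = B.T @ B                         # identity if B has orthonormal columns
a_row_norms = np.linalg.norm(A, axis=1)  # one norm per row of A

print(np.allclose(b_gram, np.eye(r)))  # B is orthonormal
print(np.allclose(a_row_norms, Sr))    # rows of A have norms Sr
```

The rows of A stay mutually orthogonal; only their lengths differ from 1.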

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
---
 src/lora_lite/variants/lora_xs.py | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/lora_lite/variants/lora_xs.py b/src/lora_lite/variants/lora_xs.py
index f556d55..d8c9211 100644
--- a/src/lora_lite/variants/lora_xs.py
+++ b/src/lora_lite/variants/lora_xs.py
@@ -13,8 +13,10 @@ the full W, and R (init normal(0, 1e-5)) starts the adapter at ~identity. So the
 trainable tensor is r*r (e.g. r=32 -> 1024 params/layer), hence "extremely small".
 
 The reference folds all singular values into A and leaves B as the raw left singular
-vectors; R sits between two frozen, near-orthonormal bases. Their LLaMA math-tuning
-config sets lora_alpha = r (scale = 1.0) and lr ~ 4e-3 (scripts/run_math_tuning.sh).
+vectors. So R sits between B = Ur (orthonormal columns) and A = diag(Sr) Vhr (rows of
+Vhr scaled by the singular values, so row norms = Sr, not 1). The asymmetry comes from
+the reference, not a bug here. Their LLaMA math-tuning config sets lora_alpha = r
+(scale = 1.0) and lr ~ 4e-3 (scripts/run_math_tuning.sh).
 
 Refs:
   - paper repo: https://github.com/MohammadrezaBanaei/LoRA-XS