From d6b242818a4240697e95dd92e1a542cf1eb4bc57 Mon Sep 17 00:00:00 2001
From: wassname <1103714+wassname@users.noreply.github.com>
Date: Sun, 14 Jun 2026 19:20:35 +0800
Subject: [PATCH] justfile: lr=5e-3 for all antipasto_* cores in bench-variant

The small-param antipasto family (gain/block/ablate/corda) all need the
higher lr to clear the bf16 round-to-nearest floor, not just antipasto.
Glob the case.

Co-Authored-By: Claudypoo
---
 justfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/justfile b/justfile
index d516e0a..b337ac1 100644
--- a/justfile
+++ b/justfile
@@ -88,7 +88,7 @@ bench-variant model variant steps="5000":
         delora) lr=1e-3 ;;
         ia3) lr=5e-3; target='(k_proj|v_proj)$' ;;
         ia3_ff) lr=5e-3; target='(down_proj)$' ;;
-        antipasto) lr=5e-3 ;;  # small params need higher lr
+        antipasto*) lr=5e-3 ;;  # small params (gain/block) need higher lr; covers all antipasto_* cores
     esac
     exec uv run --extra benchmark python scripts/metamath_gsm8k_benchmark.py \
         --model '{{model}}' \
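
The effect of the glob can be checked in isolation. The sketch below uses a hypothetical stand-alone function, `pick_lr`, that mirrors only the case arms visible in this hunk. The default of `1e-4` and the variant names `antipasto_gain`, `antipasto_block`, and `lora` are assumptions for illustration and are not taken from the justfile.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the case arms visible in the hunk.
# The default lr (1e-4) is an assumption, not taken from the justfile.
pick_lr() {
    variant="$1"
    lr="1e-4"
    case "$variant" in
        delora)     lr=1e-3 ;;
        ia3)        lr=5e-3 ;;
        ia3_ff)     lr=5e-3 ;;
        antipasto*) lr=5e-3 ;;  # glob: matches antipasto and any name starting with it
    esac
    printf '%s\n' "$lr"
}

# Print the lr chosen for a few variant names (names are illustrative only).
for v in antipasto antipasto_gain antipasto_block delora lora; do
    printf '%-16s lr=%s\n' "$v" "$(pick_lr "$v")"
done
```

With the previous exact pattern `antipasto)`, a name such as `antipasto_gain` would match no arm and keep whatever default was set before the `case`. Note that `antipasto*` matches any name beginning with `antipasto`, not only names containing an underscore.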