From a1ef566bac5877d4c5333cdd12bf6434e7ce7abe Mon Sep 17 00:00:00 2001
From: wassname <1103714+wassname@users.noreply.github.com>
Date: Tue, 9 Jun 2026 04:55:58 +0000
Subject: [PATCH] main.tex: document setup differences vs paper in tab:anchors caption

200 steps/G=16/1536tok/n=10 (paper) vs 60 steps/G=8/512tok/n=1 (ours).
Framed as fast-preset directional surrogate within resource budget.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
---
 docs/writeup/main.tex | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/docs/writeup/main.tex b/docs/writeup/main.tex
index 6fbd870..407b5dc 100644
--- a/docs/writeup/main.tex
+++ b/docs/writeup/main.tex
@@ -279,10 +279,14 @@ hack \emph{generalises} off the demonstrated mode.
 \begin{table}[h]
 \centering
 \caption{Context anchors: floor, ceiling, and intervention results.
-  Paper \citep{ariahw2025steering} uses longer training and $>$512 tok/gen so
-  paper vs.\ ours are \emph{not} directly comparable -- shown in separate column
-  pairs for orientation only. Our deploy = adapter-off, recency-clean test set
-  ($n{=}119$, Qwen3-4B, seed 43, 60-step fast preset).
+  Paper and ours use the same model (Qwen3-4B) and environment but differ in
+  training scale: paper uses 200 steps, $G{=}16$, batch 256, max 1536 tokens/gen,
+  eval $n{=}10$ per problem on 4$\times$H200; ours uses 60 steps, $G{=}8$, batch
+  ${\approx}64$, max 512 tokens/gen, eval $n{=}1$ on a single 96\,GB GPU.
+  This fast preset was chosen to get directionally informative results within
+  our resource budget, not to replicate the paper's scale.
+  The two column pairs are therefore \emph{not} directly comparable;
+  paper numbers are reference orientation only.
 \TODO{fill no-loophole ours from job 24.}}
 \label{tab:anchors}
 \begin{tabular}{lcccc}