G=6 + logits_to_keep OOM fix, generalization constraint, handover rewrite

train.py: pass logits_to_keep=L_c+1 to model() at all three logp call
sites and in the ref-via-zero-delta helper, so HF Qwen3's lm_head runs
only on completion-side hidden states; this saves ~33% at the 4 GiB
step-17 OOM site. The full preset drops G=8 -> G=6 for a further ~25%
reduction in B at every activation site. Column names in the streamed
TSV row are shortened so header and values share the same 8-char tab
stop.
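
A rough sketch of the arithmetic behind the two savings above, not the
code in train.py. The prompt length, completion length, vocabulary size
and dtype are hypothetical values picked for illustration, and the
sketch assumes the batch dimension scales linearly with G. It also
checks why L_c+1 positions are enough: the logit at position t scores
token t+1, so the last L_c+1 positions, minus the final one, cover
every completion token.

```python
# Hypothetical sizes for illustration only; none of these come from the run.
L_P = 256        # prompt length (assumed)
L_C = 512        # completion length (assumed)
VOCAB = 151_936  # vocabulary size (assumed)
BYTES = 4        # float32 logits (assumed)


def logits_bytes(batch, seq_len, logits_to_keep=0):
    """Bytes in an lm_head output of shape (batch, kept, VOCAB).

    Mirrors the HF convention: 0 keeps every position, a positive int
    keeps only the last `logits_to_keep` positions.
    """
    kept = seq_len if logits_to_keep == 0 else min(logits_to_keep, seq_len)
    return batch * kept * VOCAB * BYTES


def completion_logit_positions(l_p, l_c):
    """Positions whose logits score the completion tokens.

    The logit at position t predicts token t + 1, so completion tokens
    l_p .. l_p + l_c - 1 are scored by positions l_p - 1 .. l_p + l_c - 2.
    """
    return list(range(l_p - 1, l_p + l_c - 1))


seq_len = L_P + L_C
full = logits_bytes(8, seq_len)                             # G=8, all positions
trimmed = logits_bytes(8, seq_len, logits_to_keep=L_C + 1)  # G=8, completion side
trimmed_g6 = logits_bytes(6, seq_len, logits_to_keep=L_C + 1)

# Positions that survive logits_to_keep=L_C+1 versus the ones logp needs.
kept_positions = list(range(seq_len - (L_C + 1), seq_len))
needed = completion_logit_positions(L_P, L_C)

print(f"logits_to_keep saving: {1 - trimmed / full:.1%}")
print(f"G=8 -> G=6 saving:     {1 - trimmed_g6 / trimmed:.1%}")
print("needed positions covered:", kept_positions[:-1] == needed)
```

With a prompt that is about a third of the sequence, the first saving
lands near the ~33% quoted above; the second is exactly 1 - 6/8.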

spec.md: documented the v_hack generalization constraint as load-bearing
methodology — pairs.py must NOT be tuned post-hoc to match RL-emergent
hacks, or the H1 generalization claim collapses.

handover.md: rewritten for current state (G=6, post-grader-fix, Qwen3-4B).
Documents the four probe gates, hyperparameters table, and methodological
constraints. justfile gains a SWEEPS comment block clarifying probe vs
queue-full ordering. .gitignore picks up .venv, *.log, /tmp/, cache dirs.

RESEARCH_JOURNAL.md: 2026-05-24 (b) entry covers the OOM diagnosis, fix,
pooled cross-run trend analysis (LR is fine, signal underpowered at n=17
but directionally consistent), and the generalization correction.
wassname
2026-05-24 05:03:04 +00:00
parent 973b9407b5
commit 87a2b48784
6 changed files with 471 additions and 185 deletions
+6
@@ -1,9 +1,12 @@
 .claude/
+.venv/
 /out/
 /data/
 /log/
 /logs/
 /svd_cache/
+/tmp/
+*.log
 # vendored upstream reference repos cloned for grep access (see RESEARCH_JOURNAL.md)
 /docs/vendor/
@@ -12,3 +15,6 @@
 *.egg-info/
 __pycache__/
 *.pyc
+.pytest_cache/
+.ruff_cache/
+.mypy_cache/