wassname
2026-05-23 14:19:41 +08:00
parent 75a3ec9dd9
commit 0e2c786d4a
7 changed files with 2592 additions and 64 deletions
@@ -4,6 +4,35 @@ Append-only. New entries at the top, date-stamped. Never edit old entries.
# 2026-05-30
## 96GB readiness review fixes
A fresh subagent review found a real silent-failure risk: `v_hack` is not just
model-specific, it is also SVD-basis-specific. The old extractor loaded the model
in fp32 while `train.py` loaded it in bf16, so module keys and ranks could match
while the SVD basis differed.
Fix: `extract_vhack_grad.py`, `verify_vhack_heldout.py`, and `train.py` now all
use bf16 by default; `v_hack` artifacts save `{model, dtype, v_hack}` metadata;
`train.py` refuses legacy artifacts and checks exact module keys and per-module
rank before first generation.
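The load-time guard described above can be sketched roughly as follows. This is a minimal illustration, not the actual `train.py` code: the function name `check_vhack_artifact`, the plain-dict artifact layout, and the use of list lengths to stand in for per-module rank are all assumptions. Only the `{model, dtype, v_hack}` keys come from this entry.

```python
REQUIRED_KEYS = {"model", "dtype", "v_hack"}


def check_vhack_artifact(artifact, model_name, dtype, expected_ranks):
    """Raise ValueError unless the artifact matches the current run.

    expected_ranks: hypothetical mapping of module name -> rank.
    artifact["v_hack"]: hypothetical mapping of module name -> per-rank values.
    """
    # Legacy artifacts carry no metadata, so they are refused outright.
    if not isinstance(artifact, dict) or not REQUIRED_KEYS.issubset(artifact):
        raise ValueError("legacy v_hack artifact: missing model/dtype metadata")

    if artifact["model"] != model_name:
        raise ValueError(f"model mismatch: {artifact['model']!r} != {model_name!r}")

    # A dtype mismatch can change the SVD basis even when keys and ranks agree.
    if artifact["dtype"] != dtype:
        raise ValueError(f"dtype mismatch: {artifact['dtype']!r} != {dtype!r}")

    v_hack = artifact["v_hack"]
    if set(v_hack) != set(expected_ranks):
        missing = sorted(set(expected_ranks) - set(v_hack))
        extra = sorted(set(v_hack) - set(expected_ranks))
        raise ValueError(f"module keys differ: missing={missing}, extra={extra}")

    for name, rank in expected_ranks.items():
        if len(v_hack[name]) != rank:
            raise ValueError(
                f"rank mismatch for {name}: {len(v_hack[name])} != {rank}"
            )
```

Running every check before the first generation means a stale artifact fails immediately instead of silently steering along the wrong basis.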
Also removed a bad smoke-test convenience: zero-spread reward batches no longer
get random advantages. Dr.GRPO now correctly gives zero advantage when all
rewards in a group match, so logs cannot look healthy while training on
reward-unrelated noise.
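As a small illustration of why the zero-spread case now carries no training signal: Dr.GRPO-style advantages centre each reward on its group mean without dividing by the group standard deviation, so identical rewards give zero advantage. The function name `group_advantages` is hypothetical, and this is a sketch rather than the repository's implementation.

```python
def group_advantages(rewards):
    """Mean-centred advantages for one group of rewards (no std normalisation)."""
    if not rewards:
        return []
    # Zero spread: return exact zeros rather than relying on float arithmetic,
    # and never substitute random values.
    if max(rewards) == min(rewards):
        return [0.0] * len(rewards)
    group_mean = sum(rewards) / len(rewards)
    return [r - group_mean for r in rewards]
```

With zero advantages the policy-gradient term vanishes, which is consistent with a zero loss being the expected reading on such a batch.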
Validated on the 24GB box:
- `just extract-vhack-smoke` via pueue task 73: bf16, 186 modules, 148,032
delta_S scalars, zero-norm=0.
- `just verify-vhack-smoke` via pueue task 74: `frac>0=0.952`, `mean=+0.355`,
`median=+0.363`, target pass.
- one-step canonical train probe via pueue task 75: loaded `out/v_hack_smoke.pt`
with key/rank match OK, completed without legacy artifact. Reward spread was
false and loss/cos/fired were zero, as expected after removing random advantages.
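
The summary statistics reported for the held-out check (`frac>0`, `mean`, `median`) could be computed along these lines. This is only a sketch: the name `summarize_scores`, the flat list of per-sample scores, and the `min_frac_positive` pass threshold are assumptions, since the entry does not state the actual pass criterion.

```python
from statistics import mean, median


def summarize_scores(scores, min_frac_positive):
    """Summarise per-sample scores; min_frac_positive is a hypothetical threshold."""
    if not scores:
        raise ValueError("no scores to summarise")
    frac_positive = sum(1 for s in scores if s > 0) / len(scores)
    return {
        "frac>0": frac_positive,
        "mean": mean(scores),
        "median": median(scores),
        "pass": frac_positive >= min_frac_positive,
    }
```

Reporting the median alongside the mean helps show whether a positive mean is driven by a few large values.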
For the 96GB machine, do not start `queue-full` blindly. First run one sequential
gate: `pueue add --immediate --follow -w "$PWD" -o 9 -l "why: gated full probe; resolve: extract+heldout pass, vanilla hacks, projected fires" -- just probe-full-seed 41`.
Only queue 3 seeds after the vanilla probe shows a nontrivial hack rate.
## Mechanism end-to-end verified on Qwen3.5-0.8B; H4 falsified at this scale
Closed the smoke loop: AntiPaSTO identity (bf16, max_abs_diff=0) -> v_hack