Retract the "routeV deadlocks at first generate()" finding from d96367c. The
server-side `modal app logs` show the killed routeV smoke had actually run training
steps 0-3 (real rewards, ||delta_S_hack||=3.23, coherent generations) and was inside
the 24-prompt FINAL EVAL when I stopped it -- a deadlocked-at-first-generate process
cannot produce step 1/2/3 results. The apparent freeze was the local
`modal run > log` capture: stdout redirected to a file is block-buffered, so the
subprocess output sat in the buffer while the run was healthy the whole time.
Fix: PYTHONUNBUFFERED=1 in _run_train env so the local stream is live, and monitor
via `modal app logs <app-id>` (server-side truth). Corrected the app.py comment and
replaced the README "known issue" with the buffering gotcha. routeV runs fine on
Modal -- the routeV sweep is viable, no torch-2.7 debug needed.
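
A minimal sketch of the buffering fix, assuming _run_train launches training as a
Python subprocess. `child_env` and `run_streaming` are hypothetical names for
illustration, not the repo's code:

```python
import os
import subprocess
import sys


def child_env(base=None):
    """Return a copy of the environment with Python stdout buffering disabled."""
    env = dict(os.environ if base is None else base)
    env["PYTHONUNBUFFERED"] = "1"  # child Python flushes stdout on every write
    return env


def run_streaming(cmd, env):
    """Run cmd and collect its stdout line by line as the child emits it."""
    lines = []
    with subprocess.Popen(cmd, env=env, stdout=subprocess.PIPE, text=True) as proc:
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
    # Popen.__exit__ waits for the child, so returncode is set here.
    return proc.returncode, lines


if __name__ == "__main__":
    code, out = run_streaming([sys.executable, "-c", "print('step 0')"], child_env())
    print(code, out)
```

Without the variable, a Python child whose stdout is a pipe or file only flushes
when its buffer fills or it exits, which from the outside looks like a hang.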
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Data fix: the read-only LeetCode jsonls (44MB, tracked in the rl-rewardhacking
submodule) now mount from the local checkout into the image (add_local_dir,
copy=False) instead of the Volume. A Volume mount/reload race raised
FileNotFoundError on them mid-sweep even though they were committed; versioning
the dataset with the code removes that failure mode. The Volume now carries only
mutable dirs. Verified:
both a vanilla warm and a routeV smoke load data fine on the new image.
Correction: 2873b37's message claimed "the smoke on pinned 5.10.2 clears the
deadlock point" -- it did NOT; the smoke hung. And transformers is not the cause:
on this exact 5.10.2 image, vanilla completes generate (warm, 6.8 min, exit 0)
while routeV deadlocks at its first rollout generate(). Same image, same attn,
same data -- the hang is routeV-specific (likely v_grad extraction's CUDA state
interacting with flash-attn's first generate on torch 2.7.1; the local box runs
routeV fine on 2.8).
Known-issue section + corrected app.py comment record this. Local box produces
the canonical routeV runs; Modal is proven for vanilla.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
The generate() hang came from floating transformers @ main (a later commit), not
the attn backend -- confirmed: v60 ran on an earlier main with flash, and the smoke
on pinned 5.10.2 clears the deadlock point. Revert the VGROUT_ATTN=sdpa override
(app.py) and the env knob (train.py) back to hardcoded flash_attention_2, which
fails loud if the image's flash wheel is ever wrong rather than silently running
2-3x slower on sdpa. Pin transformers to the released 5.10.2 (patch line of v60's
5.10.0.dev0); uv.lock keeps the exact commit for the local box.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Floating transformers @ main let a later commit hang generate() (the other agent's
deadlock). The local box runs 5.8.0.dev0; uv.lock pins the exact commit, the
image uses the released 5.8.0 wheel of the same line. Qwen3-4B needs no
main-only feature.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Adds env override VGROUT_ATTN (default flash_attention_2, so local behavior is
unchanged; app.py sets sdpa on Modal). Tested to isolate the Modal generate()
deadlock: it hangs at the first generate under BOTH flash_attention_2 and sdpa,
so the hang is NOT the attention backend -- it's in the generation loop. Prime
suspect: the cache-frozen image's transformers-main commit differs from the local
box's working 5.8.0.dev0. Diagnosis + fix path in task #212. Local n=3 runs
proceed meanwhile.
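
A minimal sketch of the override as described, not the actual train.py code.
`resolve_attn` is a hypothetical helper, and rejecting unknown values is an added
assumption:

```python
import os

# The two backends named in this message.
ALLOWED_ATTN = ("flash_attention_2", "sdpa")


def resolve_attn(environ=None):
    """Pick the attention backend from VGROUT_ATTN, defaulting to flash_attention_2."""
    environ = os.environ if environ is None else environ
    impl = environ.get("VGROUT_ATTN", "flash_attention_2")
    if impl not in ALLOWED_ATTN:
        # Assumption: fail loudly on typos instead of passing them downstream.
        raise ValueError(f"unsupported VGROUT_ATTN={impl!r}; expected one of {ALLOWED_ATTN}")
    return impl
```

With the variable unset the default applies, so local runs are unchanged; setting
VGROUT_ATTN=sdpa in the container environment switches the backend.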
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
launch.py imports `app` from app.py, which registers app.py's @local_entrypoint
`main`; launch.py defining its own `main` raised InvalidError(Duplicate local
entrypoint). So launch.py had never actually run -- the earlier vanilla verify
was via app.py directly. Invoke: modal run modal/launch.py::fanout [--only N].
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Old JOBS fired --intervention=route2 (dead flag after the routeV rename) on the
pre-v2 manifest -- half the containers would have errored on argv parse. Replace
with the n=3 keynote set generated from ARMS x SEEDS: vanilla, routeV real-V
per-rollout, routeV per-token, random-V(157), placebo(vampire). Tag stems match
the local pueue twins so Modal and local cross-replicate. id 1 = canary
(seed-42 vanilla). Also fix app.py::smoke (route2 -> routeV) and the subprocess
call to the modal binary (not on PATH; now resolved next to sys.executable). The
v2 eval rides in via the runtime-mounted src/.
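
A rough sketch of generating the manifest from ARMS x SEEDS so that id 1 is the
seed-42 vanilla canary. The arm stems, the tag format, and seeds 43 and 44 are
placeholders; only the arm names and seed 42 come from this message:

```python
from itertools import product

# Hypothetical tag stems for the five arms named above.
ARMS = ("vanilla", "routeV_rollout", "routeV_token", "randomV157", "placebo_vampire")
# Only seed 42 is stated in the message; 43 and 44 stand in for the other two.
SEEDS = (42, 43, 44)


def build_jobs(arms=ARMS, seeds=SEEDS):
    """Enumerate ARMS x SEEDS with 1-based ids; arms vary slowest, seeds fastest."""
    return [
        {"id": i, "arm": arm, "seed": seed, "tag": f"{arm}_s{seed}"}
        for i, (arm, seed) in enumerate(product(arms, seeds), start=1)
    ]


def select(jobs, only=None):
    """Mimic an --only N filter: keep a single job id, or all jobs when only is None."""
    return [job for job in jobs if only is None or job["id"] == only]


if __name__ == "__main__":
    for job in select(build_jobs(), only=1):
        print(job)
```

Deriving the list from the two tuples keeps ids stable across runs and avoids
stale flags that a hand-edited manifest can carry after a rename.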
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
Fire the paper sweep as independent H100/A100-80 containers instead of
serial pueue runs. One Volume caches model + svd + out/; train.py runs
unmodified (torch 2.7 + Dao flash-attn wheel, code mounted at runtime).
Verified: vanilla 60-step reproduces the local baseline. Skill at
~/.claude/skills/modal documents the patterns.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>