diff --git a/docs/AFK_CHECK.md b/docs/AFK_CHECK.md
new file mode 100644
index 0000000..0be3a26
--- /dev/null
+++ b/docs/AFK_CHECK.md
@@ -0,0 +1,49 @@
+# AFK hourly check — current protocol
+
+Paste this (or just point me at it) for the hourly check. Supersedes the old
+A1/A2-keynote + A5-harvest checklist, which closed 2026-06-04 (see below).
+
+## Standing checks (every hour)
+
+1. **GPU idle while queued?** `pueue status`. If idle with jobs Queued, investigate and
+   unblock.
+2. **New Failed/Killed?** (ignore old killed 78). Read `pueue log {ID} --full`, form
+   3 hypotheses (likely / subtle / I-was-wrong), weight them, fix root cause, requeue
+   with `why:`/`resolve:`. No blind retry.
+3. **Running job health** — discriminating review, not did-it-finish: reward not
+   collapsed, lp_s stable (~-0.4), no divergence tripwire, deploy-eval matches the
+   arm's expectation.
+
+## THE priority: route2 directionality mystery (#196)
+
+Is route2's deploy-hack suppression directional (H4: needs the hack direction) or
+mechanical (H2: alignment-agnostic quarantine-absorption)? The batch is staged
+interleaved (one of each family per tier):
+
+- **Haar** (114/118/122, `--route2-random-v-seed`): out-of-subspace null (cos~1/sqrt(d)
+  by concentration, NOT a cleaner placebo). Tests "must v_grad be in-subspace at all?"
+- **semantic placebo** (115/119/123 vampire, 119/120... bacon/blue): in-subspace
+  arbitrary directions. Tests "must it point at the hack specifically?" Maps
+  suppression-vs-alignment as a scatter.
+- **null_city n=3** (117/121 s42/s43): is the deploy-hack=0.000 placebo result robust
+  across seeds or an s41 fluke?
+- **erase directionality** (116 real-v, 120 placebo): erase projects with magnitude
+  ~cos(g,v), so direction MUST matter there if it matters anywhere.
+
+As each finishes: pull deploy hack/solve, and (for the scatter) each placebo's per-module
+|cos| with the hack dir. Verdict logic:
+- all suppress regardless of alignment, incl. Haar => **H2 mechanical**.
+- suppression tracks |cos|, or Haar fails to suppress => **H4 alignment**.
+
+Cosine is correlational; the ablation run is the causal test. Commit findings to the
+journal. Don't re-derive the no-cheat E-by-mode table unless an A5 run changes — it's
+confirmed (journal 2026-06-05 (h)) and gated by `verify_gate_anchor.py`.
+
+## Background paper artifacts (lower prio, already in-flight, DON'T re-do)
+
+- A1/A2 keynote (#173): CLOSED. tab:keynote is n=3 both arms with paired t-test.
+- A5 generalisation (#185): CLOSED; airtight no-cheat rerun queued (111-113).
+- A4 long-run (#184): matched-beta pair 100/101 queued.
+- #186 on-policy emergence: job 87 (running) / 105 (route2 toff40, queued).
+
+Commit progress. Don't stop to ask — autonomous judgement; if unsure, commit and continue.
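The verdict logic in the route2 section can be written down as a small triage function. This is a minimal sketch, not code from the repo: the `Run` record, the `verdict` helper, and both thresholds (`suppress_below`, `min_abs_r`) are hypothetical names and values chosen for illustration, and "tracks |cos|" is read here as a strong negative Pearson correlation between a placebo's |cos| and its deploy hack rate. Per the doc's own caveat, any such result is correlational and the ablation run stays the causal test.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Run:
    family: str         # hypothetical labels: "haar", "semantic", "null_city", "erase"
    abs_cos: float      # mean per-module |cos| with the hack direction
    deploy_hack: float  # deploy-time hack rate (lower means more suppression)


def pearson(xs, ys):
    """Pearson r, or None when it is undefined (fewer than 2 points or a constant input)."""
    n = len(xs)
    if n < 2:
        return None
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def verdict(runs, suppress_below=0.05, min_abs_r=0.5):
    """Triage only. Both thresholds are assumptions, not values from the protocol."""
    haar = [r for r in runs if r.family == "haar"]
    if not haar:
        return "inconclusive"  # the Haar null is needed for either verdict
    if all(r.deploy_hack < suppress_below for r in runs):
        return "H2 mechanical"  # everything suppresses, including Haar
    if any(r.deploy_hack >= suppress_below for r in haar):
        return "H4 alignment"   # Haar fails to suppress
    # Scatter over the placebos: does the hack rate fall as |cos| rises?
    placebos = [r for r in runs if r.family != "haar"]
    r = pearson([p.abs_cos for p in placebos], [p.deploy_hack for p in placebos])
    if r is not None and r <= -min_abs_r:
        return "H4 alignment"   # suppression tracks |cos|
    return "inconclusive"
```

A call such as `verdict([Run("haar", 0.02, 0.0), Run("semantic", 0.6, 0.0)])` falls into the first branch; mixed outcomes that do not show a clear trend come back as `"inconclusive"` rather than being forced into either hypothesis.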