From 3ae1e8376d20644adb0fb70aba9f1d531f0ba376 Mon Sep 17 00:00:00 2001 From: wassname <1103714+wassname@users.noreply.github.com> Date: Fri, 5 Jun 2026 05:01:18 +0000 Subject: [PATCH] =?UTF-8?q?journal:=20close=20(a)=20WATCH=20=E2=80=94=20pl?= =?UTF-8?q?acebo=20endpoint=20refutes=20route=20directionality=20(job=2086?= =?UTF-8?q?)?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com> --- RESEARCH_JOURNAL.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/RESEARCH_JOURNAL.md b/RESEARCH_JOURNAL.md index 1684d7d..d0954be 100644 --- a/RESEARCH_JOURNAL.md +++ b/RESEARCH_JOURNAL.md @@ -2,6 +2,23 @@ Append-only. New entries at the top, date-stamped. Never edit old entries. +## 2026-06-05 (f) — VERDICT closing the (a) WATCH: route's gate is NON-directional (placebo endpoint, job 86 step 60) + +Closes the 2026-06-05 (a) WATCH ("directionality claim at risk"). Job 86 finished all 60 steps; read its per_mode_deploy.json +(out/runs/20260604T231926_fast_routing2_seed41_route2_placebo_nullcity_s41/per_mode_deploy.json). + +| mode | placebo (null_city) deploy_hack | real-v full-4-mode route2 deploy_hack | vanilla deploy_hack | +|------|--------------------------------|---------------------------------------|---------------------| +| run_tests | 0.000 | 0.000 | 0.875 | +| file_marker | 0.000 | 0.000 | 0.469 | +| sentinel | 0.000 | 0.000 | 0.042 | +| aggregate | 0.000 (solve 0.531) | 0.000 (solve 0.625) | 0.359 (solve 0.422) | + +- [verdict] placebo per-mode == real-v per-mode (both pin every mode to 0.000 deploy hack), while vanilla is 0.359. This is the journal's "if placebo per-mode ~= real => directionality REFUTED, do NOT bury" branch. An arbitrary (hkgap<0, non-discriminative) direction suppresses exactly as well as the extracted v_hack. +- [mechanism] calibrated-tau cuts the cos cloud at its midpoint regardless of v, so the gate routes ~60-78% of grad energy into the deletable quarantine whatever direction it is built from; late-emergent hacks route by gradient magnitude/recency, not by alignment with v. Suppression is discarded-knob absorption, not hack-direction specificity. +- [paper status] already reflected honestly: main.tex tab:ablation placebo row 0.000/0.531 filled, surrounding text says "the placebo also reaches zero deploy hack, so route's gate is [non-directional]". Directional specificity now rests on the ERASE arm (erase subtracts proportional to cos(g,v), so real_v << placebo_v would show directionality) -- jobs 93/94 queued. Random-V is the second non-directionality check (row still TODO, jobs 94/106). +- [contribution reframe, unchanged from (a)] NOT "we found the hack direction" but "gradient routing into a deletable knob suppresses late-emergent hacks direction-agnostically". The A5 held-out generalisation (zero held-out labels) still stands as a no-cheat demonstration; its mechanism is the knob, not v_hack specificity. + ## 2026-06-05 (e) — A5 no-cheat leak FIXED at the gate (teacher-only anchor) + unit test; airtight rerun queued (job 111) Entry (d) found held-out hacked_E is not exactly 0 (<=1.1% detector false positives). Today, traced it to a real label leak and fixed it.