fix: falsification_test is agent-reported results, not a human procedure

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-27 16:46:17 +08:00 · 2026-04-17 05:59:44 +08:00
parent f29d3a45be
commit e86c19b11a
1 changed files with 2 additions and 2 deletions
@@ -339,7 +339,7 @@ After this, task enters pending sign-off state — only completable via /lgtm <i
 - **evidence**: Auditable proof — command output, table, file path, link
 - **failure_mode_1**: Most likely way this could be wrong despite evidence
 - **failure_mode_2**: Most perverse or sneaky failure -- one that looks like success superficially, corrupts silently, or only breaks under specific conditions (scale, time, edge case). E.g. feature active but wrong mechanism, works in tests but degrades in prod, correct output for wrong reason.
- **falsification_test**: Concrete check that distinguishes your hypothesis from failure modes. Can be a command, a procedure, a log snippet to look for, or an experiment observation. Include: what to run or observe, what you expect if claim is true, what you expect if a failure mode is real, and why this check can't accidentally pass under the failure. Think especially about: null hypothesis (feature isn't active at all), silent failures (error swallowed, fallback triggered silently), and env mismatch (passes in test, broken in prod).
+- **falsification_test**: What you ran and what you got -- presented so both you and the human can sanity-check it. State: what you ran (command, experiment, log inspection), the actual output or result, and why that result could not occur if a failure mode were real. Must be traceable: include file paths, log snippets, counts, or commit. Human should be able to verify without re-running anything.
 - **evidence_files** (optional): File paths human should inspect -- must exist
 - **remaining_uncertainty** (optional): What's NOT tested, known limitations, deferred edge cases`,
    parameters: Type.Object({
@@ -347,7 +347,7 @@ After this, task enters pending sign-off state — only completable via /lgtm <i
      evidence: Type.String({ description: "Auditable proof with full reproducibility: exact command run and its output, commit hash, config/seeds used, output file paths. Must be re-runnable by the human. 'I wrote X' is not evidence -- 'I ran X and got Y' is. Include counts, snippets, test output." }),
      failure_mode_1: Type.String({ description: "Most likely way this could be wrong despite evidence" }),
      failure_mode_2: Type.String({ description: "Most perverse or sneaky failure: looks like success superficially, corrupts silently, or only breaks at scale/time/edge case. E.g. correct output for wrong reason, feature active but wrong mechanism, passes tests but degrades in prod." }),
-      falsification_test: Type.String({ description: "Concrete check (command, procedure, log inspection, or experiment observation) that distinguishes your hypothesis from failure modes. State: what to run/observe, expected result if claim is true, expected result if a failure mode is real, why this can't accidentally pass under the failure. Cover: null hypothesis, silent fail, env mismatch." }),
+      falsification_test: Type.String({ description: "What you ran and what you got, presented so both you and the human can sanity-check it. State: what you ran (command/experiment/log check), the actual output or result, and why that result could not occur if a failure mode were real. Must be traceable: include file paths, log snippets, counts, or commit. The human should be able to verify without re-running anything." }),
      evidence_files: Type.Optional(Type.Array(Type.String(), { description: "File paths to inspect (must exist)" })),
      remaining_uncertainty: Type.Optional(Type.String({ description: "What's NOT tested, known limitations, edge cases deferred. Be honest about scope boundaries." })),
    }),