- /lgtm <id> and /lgtm * no longer hard-error when the agent skipped
lgtm_ask; the human is the final gate, so they get a confirm dialog
with explicit override copy instead of an error.
- /lgtm * now spans every open task (READY + ACTIVE + PENDING) and
shows a grouped preview before signing off.
- Each task row (widget + TaskList) is prefixed with a coloured
[READY]/[ACTIVE]/[PENDING]/[DONE] tag so signoff-readiness is
legible at a glance instead of decoded from emoji pipeline + colour.
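
A minimal sketch of how the tagged rows and the grouped /lgtm * preview could fit together. The Task shape, the group order, and the helper names formatRow and previewOpenTasks are assumptions for illustration, not the extension's actual code; colouring is left out.

```typescript
// Hypothetical task shape; the real Task type in the extension may differ.
type OpenStatus = "READY" | "ACTIVE" | "PENDING";
type Status = OpenStatus | "DONE";

interface Task {
  id: number;
  title: string;
  status: Status;
}

// Assumed order in which open groups appear in the /lgtm * preview.
const OPEN_ORDER: OpenStatus[] = ["READY", "ACTIVE", "PENDING"];

// Prefix a row with its status tag, e.g. "[READY] #1 Ship fix".
function formatRow(task: Task): string {
  return "[" + task.status + "] #" + task.id + " " + task.title;
}

// Build the grouped preview for every open task; DONE tasks are excluded.
function previewOpenTasks(tasks: Task[]): string[] {
  const lines: string[] = [];
  for (const status of OPEN_ORDER) {
    const group = tasks.filter((t) => t.status === status);
    if (group.length === 0) continue; // skip empty groups
    lines.push(status + " (" + group.length + ")");
    for (const t of group) lines.push("  " + formatRow(t));
  }
  return lines;
}
```

The preview would be shown in the confirm dialog before any task is signed off.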
Reviewer feedback: the LGTM extension's epistemic core is good, but the UX is
too ceremonial: every task is forced through lgtm_ask + /lgtm, even bookkeeping
like "monitor pueue 30". This change introduces a two-tier split:
- Tasks: agent-managed. TaskUpdate(status=completed) is now allowed when no lgtm
evidence is stored, so trivial subtasks leading up to verification can be closed
without ceremony.
- LGTMs: significant claims. lgtm_ask still triggers robot review; once evidence
is stored, completion is locked to /lgtm so the gate can't be bypassed.
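
The two-tier rule can be sketched as a single predicate. This is an illustration under assumptions: the GatedTask shape, the lgtmEvidence field name, and the canComplete helper are hypothetical, and the human confirm dialog is omitted.

```typescript
// Hypothetical shapes; field names are assumptions for illustration.
interface GatedTask {
  id: number;
  status: "pending" | "in_progress" | "pending_approval" | "completed";
  lgtmEvidence?: string; // assumed to be set once lgtm_ask has stored evidence
}

// Who is asking for completion: the agent (TaskUpdate) or the human (/lgtm).
type CompletionSource = "agent" | "human_lgtm";

// Decide whether a completion request may proceed.
function canComplete(task: GatedTask, source: CompletionSource): boolean {
  // The human sign-off via /lgtm is the final gate and is always accepted.
  if (source === "human_lgtm") return true;
  // Agents may complete plain tasks, but not ones that carry lgtm evidence.
  return task.lgtmEvidence === undefined;
}
```

Keeping the rule in one predicate means TaskUpdate and /lgtm cannot drift apart on what counts as a locked task.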
Other UX:
- TaskList output grouped: Active / Awaiting sign-off / Pending / Completed.
- New getDisplayStatus(task) derives awaiting_signoff from pending_approval.
- Widget header shows N awaiting sign-off count.
- /lgtm accepts multiple ids: /lgtm 1 2 3 (also #1, commas).
- lgtm_ask field descriptions now encourage one short sentence per field: keep
the thinking discipline, drop the verbosity.
- SYSTEM_REMINDER nudges progress updates and cleanup of completed/irrelevant
tasks, not just lgtm_ask.
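
A rough sketch of two of the items above: deriving the display status and parsing multiple /lgtm ids. Only pending_approval, awaiting_signoff, and the getDisplayStatus name come from these notes; the other status names, the parseLgtmIds helper, and the choice to skip malformed tokens and drop duplicates are assumptions.

```typescript
// Hypothetical stored and display statuses.
type StoredStatus = "pending" | "in_progress" | "pending_approval" | "completed";
type DisplayStatus = "active" | "awaiting_signoff" | "pending" | "completed";

// Derive the status shown to the user from the stored status.
function getDisplayStatus(task: { status: StoredStatus }): DisplayStatus {
  const s = task.status;
  switch (s) {
    case "pending_approval":
      return "awaiting_signoff";
    case "in_progress":
      return "active";
    default:
      return s; // "pending" and "completed" are displayed unchanged
  }
}

// Parse the argument string of /lgtm: "1 2 3", "#1 #2", and "1,2,3" all work.
function parseLgtmIds(args: string): number[] {
  const ids: number[] = [];
  for (const token of args.split(/[\s,]+/)) {
    const digits = token.charAt(0) === "#" ? token.slice(1) : token;
    if (!/^\d+$/.test(digits)) continue; // assumption: skip empty or malformed tokens
    const id = parseInt(digits, 10);
    if (ids.indexOf(id) === -1) ids.push(id); // assumption: ignore duplicates
  }
  return ids;
}
```

For example, parseLgtmIds("1, #2 3,3") yields [1, 2, 3].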
Also includes pending rubric extension on RobotReviewRecord.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- falsification_test: concrete runnable command + expected output if
claim true vs false + why test can't accidentally pass under failure
- failure_mode_2 now explicitly asks for subtle/silent/null-hypothesis
failure, not just "second most likely"
- nudges toward: null hypothesis, silent failures, environment mismatch
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add PI_TASKS_DEBUG=1 env flag to trace RPC communication to stderr
- TaskOutput/TaskStop now accept agent IDs (resolve via agentTaskMap)
- TaskGet filters completed blockers (consistent with TaskList)
- TaskGet shows non-empty metadata
- Soften TaskExecute description to not deter agents from using it
- TaskExecute success message guides agents to use TaskOutput
- Add promptGuidelines to prevent duplicate agent spawns
- Update changelog
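
A hedged sketch of the agent ID resolution and the debug flag described above. The agentTaskMap name and PI_TASKS_DEBUG=1 come from these notes; modelling the map as a plain object, the resolveTaskId and makeTracer helpers, and the "[pi-tasks]" prefix are assumptions for illustration.

```typescript
// Assumption: agentTaskMap modelled as a plain object from agent ID to task ID.
type AgentTaskMap = Record<string, string>;

// Accept either a task ID or an agent ID and return the matching task ID.
function resolveTaskId(
  idOrAgent: string,
  knownTaskIds: string[],
  agentTaskMap: AgentTaskMap
): string | undefined {
  if (knownTaskIds.indexOf(idOrAgent) !== -1) return idOrAgent; // already a task ID
  const isAgent = Object.prototype.hasOwnProperty.call(agentTaskMap, idOrAgent);
  return isAgent ? agentTaskMap[idOrAgent] : undefined;
}

// Build a tracer that writes only when PI_TASKS_DEBUG is exactly "1".
// The environment and the sink are passed in so the helper stays testable.
function makeTracer(
  env: Record<string, string | undefined>,
  write: (line: string) => void
): (message: string) => void {
  const enabled = env["PI_TASKS_DEBUG"] === "1";
  return (message: string): void => {
    if (enabled) write("[pi-tasks] " + message); // hypothetical prefix
  };
}
```

In Node, the tracer would be built from process.env and a sink that writes to process.stderr, so RPC traces stay out of stdout.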