mirror of
https://github.com/wassname/pi-lgtm.git
synced 2026-06-27 15:16:15 +08:00
3.9 KiB
3.9 KiB
Robot Review Lane
Goal
Add a separate review artifact for fresh-perspective subagent review without weakening the existing human /lgtm sign-off gate.
Scope
In: task schema, task tool UX, widget/task badges, README/tests, Pi-native robot-review harness hardening. Out: provider routing or third-party subagent package integration.
Requirements
- R1: Tasks can store a robot review separately from human sign-off evidence. Done means: a task can contain both
lgtm_askevidence and robot observations without conflict. - R2: Robot review is observational only. Done means: the tool schema and help text ask for observations, not recommendations or editorial.
- R3: UI exposes review lanes distinctly. Done means: task list/widget/details show tool/robot/human review badges.
- R4: Human
/lgtmremains the only completion path. Done means: no robot review path can complete tasks. - R5: Automatic Pi robot review must be operationally robust. Done means: the child reviewer has a bounded timeout, uses a deterministic Pi invocation, and reports infra failures clearly enough to diagnose without hanging the main tool call.
- R6: The subprocess harness is covered by focused tests. Done means: there are tests for invocation selection, timeout/abort behavior, and assistant-output parsing.
Tasks
- T1 (R1, R2, R3, R4): Add robot-review storage and tool.
- steps: update task typing helpers, register
robot_review_ask, thread robot review metadata through task views - verify:
npm test -- --runInBand - UAT: "when a task has lgtm evidence and robot observations, I observe both badges and
/lgtmstill controls completion"
- steps: update task typing helpers, register
- T2 (R3): Update README and examples.
- steps: document badges and robot review workflow
- verify:
rg -n "robot_review_ask|🤖|🛠" README.md - UAT: "when I read the README, I observe a distinct robot review lane"
- T3 (R5): Harden the Pi-native robot-review subprocess runner.
- steps: add timeout handling, replace fragile self-reinvocation logic with deterministic command resolution, improve failure messages
- verify:
npx vitest run test/robot-review-runner.test.ts - UAT: "when the child reviewer hangs or pi is not resolvable, lgtm_ask returns a bounded failure instead of hanging forever"
- T4 (R6): Add focused harness tests.
- steps: extract/mock subprocess runner boundaries and cover timeout, parse, and command resolution behavior
- verify:
npx vitest run test/robot-review-runner.test.ts test/robot-review.test.ts - UAT: "when I run the focused tests, I observe the subprocess path itself is covered"
Context
- Existing schema uses
pending_approvalas the human sign-off gate. - Current UI already appends
👀for pending human sign-off; extend rather than replace the completion rule.
Log
- The least disruptive model is additive metadata plus badge rendering, not replacing the task lifecycle.
- The repo's full Vitest suite already has drift unrelated to this feature, so focused verification is needed to isolate new behavior.
- A Pi-native reviewer stage matches the official subagent example better than ACP/external CLIs, but it makes harness reliability part of the approval path and therefore needs explicit timeout and invocation hardening.
- A deterministic
picommand plus an explicit timeout is simpler and more portable than trying to reconstruct the current host entrypoint fromprocess.argv.
TODO
- Optional future work: add an orchestrated cross-model reviewer via
external-reviewor ACP.
Errors
| Task | Error | Resolution |
|---|---|---|
| T1 | npm test failed with 17 pre-existing assertions unrelated to robot review |
Verified with npm run lint, npm run typecheck, and focused npx vitest run test/review-badges.test.ts instead |
| T3 | process.argv[1]-based self-reinvocation was fragile in extension-hosted contexts |
Replaced with PI_LGTM_PI_BIN override or plain pi, then added focused runner tests |