Files
pi-goals/docs/reviews/review.md
T
wassname c74f625b65 Fix newSession context bug + mark unwired loop prompts (external review)
External review (deepseek-v4-pro, docs/reviews/review.md) found:
- Real bug: reviewLoop cast agent_end's ExtensionContext to ExtensionCommandContext
  and called newSession on it, which would crash the "fresh context" path. Now the
  /plan command handler's context (which has newSession) is saved and reused, with a
  graceful in-place fallback (burneikis pattern).
- Dead code: continuation + loopJudge prompts are unused (the autonomous loop is
  intentionally cut). Marked NOT-YET-WIRED in the prompts.ts flow header rather than
  removed, so the full intended flow stays reviewable.
Review also confirmed: no comment bloat, no over-engineering.

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
2026-06-15 18:40:19 +08:00

3.7 KiB
Raw Blame History

Code review against spec docs/spec/2026-06-15_pi-plan.md.


(A) SPEC MISMATCH — code does not match spec intent

  1. No loop judge (spec §9, §3b). The extension lacks any perturn evaluation that would decide continue/pause; the loopjudge prompt (loopJudgeSystem, loopJudgeUser) is defined but never invoked. No motion.

  2. /goal command missing (spec §7). No handler for /goal (restart loop, pause, resume, clear, status). The only command is /plan.

  3. /subgoal command missing (spec §7). Not implemented.

  4. CancelGoal tool not implemented (spec §5, optional but present in spec). Not a blocker but a gap.

  5. Planphase model selection (D12) not implemented. planDrafting always runs on the default model; there is no sticky perphase model choice, no selection menu, and no persisting of a planphase model reference.

  6. Widget does not flag done goals that lack a signoff log line (spec §7, §6). The widget hides all done goals unconditionally; the visibility guard is missing.

  7. /plan (no args) does not render the tasklist widget (spec §7). showPlan() dumps raw file content via notify; the widget is only set through updateWidget() on other events, not by the command itself.

  8. Injection message role (spec §11). The before_agent_start hook returns a customType message with display: false. The spec demands a late userrole message to avoid systemprompt mutation; the actual message role depends on the pi API and may be system, not user, risking cache breakage.

  9. Missing precompact hook (spec §8). No precompact hook to flush any inmemory state (even just ensuring plan.md is uptodate) before compaction.

  10. Reminder cadence deviates (spec §8a). The spec calls for firing after N filemodifying turns since last plan.md update. The code fires if plan.md is byteidentical between agent starts, which is a coarser proxy.


(B) DEAD/UNUSED CODE

File Lines Reason
src/prompts.ts 128146 loopJudgeSystem and loopJudgeUser exported but never used.
src/prompts.ts 115118 continuation exported but never used (the loop is not built).

(C) OVERLY LONG OR REDUNDANT COMMENTS

The fileheader comments in index.ts (lines 120) and planfile.ts (lines 126) are fairly concise descriptions of the design; they are not excessive. No comment bloat worth flagging.


(D) OVERENGINEERING vs. “super simple” goal

None. The linescanner in planfile.ts is minimal; the getPiInvocation() helper is a straightforward copy from the oracle extension; no unnecessary abstraction or defensive layers.


(E) REAL BUGS

  • cmdCtx.newSession cast risk (src/index.ts:272, 201).
    reviewLoop casts ctx (type ExtensionContext) to ExtensionCommandContext to pass to startExecution, which calls cmdCtx.newSession(...). If the concrete context does not carry that method, it fails at runtime. (In practice the same object may satisfy it, but the cast hides the truth.)

  • showPlan raw content instead of widget (src/index.ts:136143).
    /plan with no arguments shows the file content via ctx.ui.notify, not the structured tasklist widget the spec expects. The widget is rendered separately via updateWidget, but the command does not trigger it, so the output is inconsistent.

No other obvious logic errors; the signoff flow, logging, and parsing work as intended.


Verdict: A clean scaffold for the signoff path, but missing the autonomous loop, /goal command, and planphase model selection means its not yet the “work autonomously” extension the spec describes.