External review (deepseek-v4-pro, docs/reviews/review.md) found:
- Real bug: reviewLoop cast agent_end's ExtensionContext to ExtensionCommandContext and called newSession on it, which would crash the "fresh context" path. Now the /plan command handler's context (which has newSession) is saved and reused, with a graceful in-place fallback (burneikis pattern).
- Dead code: continuation + loopJudge prompts are unused (the autonomous loop is intentionally cut). Marked NOT-YET-WIRED in the prompts.ts flow header rather than removed, so the full intended flow stays reviewable.
Review also confirmed: no comment bloat, no over-engineering.
Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>
pi-plan
A pi extension for plan-driven, goal-tracked work in one
plan.md. Set up goals (with evidence and failure modes) in plan mode, work them, and sign a goal
off only when a read-only subagent has checked the evidence.
Successor to pi-lgtm, kept deliberately small: roughly burneikis/pi-plan plus four additions: goals with evidence, a sign-off check, a widget, and a reminder.
The form guides; it does not gate. The agent edits plan.md with its normal Edit tool. The one
blessed tool is CompleteGoal, which runs the sign-off check and records the result. The reminder,
the injected plan summary, and git/widget visibility carry the process. It trusts the agent's
judgement rather than guarding it.
Install
pi install npm:@wassname2/pi-plan
Or run without installing:
pi -e npm:@wassname2/pi-plan
Use
/plan add CSV export to the report view
- Plan. The agent explores read-only and writes goals into plan.md (see format below).
- Review. You get a menu: Ready, Edit (ask the agent to revise), Open in $EDITOR, or Cancel. On Ready you choose whether to keep the current context or start fresh and compacted.
- Work. Each turn the active goal is injected (so it survives compaction) and a reminder nudges the agent to keep plan.md current and work autonomously. When a goal's done_when is met the agent calls CompleteGoal, which runs verify and a read-only judge and, on accept, marks it done and logs it.
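The per-turn injection in the Work step can be pictured as a small pure function that turns the parsed goals into a short text block. This is only a sketch: the type GoalSummaryInput, the function activeGoalSummary, and the wording of the injected text are all hypothetical and are not taken from the extension's source.

```typescript
// Hypothetical sketch of building the per-turn summary; not the extension's code.
interface GoalSummaryInput {
  title: string;
  status: "open" | "active" | "done" | "cancelled";
  doneWhen: string;
  subtasks: { text: string; done: boolean }[];
}

// Build a compact reminder covering every active goal and its unticked subtasks.
function activeGoalSummary(goals: GoalSummaryInput[]): string {
  const active = goals.filter((g) => g.status === "active");
  if (active.length === 0) return "No active goal in plan.md.";
  return active
    .map((g) => {
      const remaining = g.subtasks
        .filter((s) => !s.done)
        .map((s) => `  - [ ] ${s.text}`);
      return [`Active goal: ${g.title}`, `done_when: ${g.doneWhen}`, ...remaining].join("\n");
    })
    .join("\n\n");
}

console.log(
  activeGoalSummary([
    {
      title: "Implement cache layer",
      status: "active",
      doneWhen: "p95 < 50ms on bench-X. If wrong: timeouts in load-test.log",
      subtasks: [
        { text: "wire cache client", done: true },
        { text: "eviction policy", done: false },
      ],
    },
  ]),
);
```

Because the summary is rebuilt from plan.md each turn rather than remembered, it would still be present after the conversation is compacted.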
Other commands: /plan (print the plan), /plan clear (empty plan.md, history kept in git),
/plan judge <model-ref> (use a specific model for the sign-off judge; default is your current
model).
plan.md format
One file holds the objective, the goals, and a short append-only log.
# Plan: ship the cache layer
## Goal: Implement cache layer
<!-- id: cache-layer-1 -->
status: active
done_when: p95 < 50ms on bench-X. If wrong: timeouts in load-test.log
verify: pytest tests/cache -q && python bench/p95.py --max-ms 50
failure_modes:
- cache silently bypassed (hit-rate ~0, latency ok by luck)
- bench too small to exercise eviction
- [x] wire cache client
- [ ] eviction policy
## Log
- 2026-06-15 14:02 cache client wired; eviction next
- A goal is a ## Goal: header with an <!-- id -->, a status: (open|active|done|cancelled), a falsifiable done_when: (what you expect, and the symptom if it is NOT met), an optional verify: shell command, a failure_modes: pre-mortem list, and - [ ] subtasks.
- done_when names the evidence that distinguishes real success from a subtle failure. verify, when present, is the deterministic first stage of the sign-off check.
- The agent ticks subtasks, appends to ## Log, and sets status as it works. Multiple goals may be active.
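To make the format concrete, here is a minimal sketch of how the goal fields described above could be read out of plan.md. It is an illustration under assumptions, not the extension's actual parser: the names Goal, isStatus, and parseGoals are hypothetical, and it handles only the single-line fields shown in the example.

```typescript
// Hypothetical sketch of a plan.md goal parser; not the extension's real one.
const STATUSES = ["open", "active", "done", "cancelled"] as const;
type Status = (typeof STATUSES)[number];

interface Goal {
  title: string;
  id?: string;
  status?: Status;
  doneWhen?: string;
  verify?: string;
  failureModes: string[];
  subtasks: { text: string; done: boolean }[];
}

function isStatus(value: string): value is Status {
  return (STATUSES as readonly string[]).includes(value);
}

function parseGoals(markdown: string): Goal[] {
  const goals: Goal[] = [];
  let current: Goal | undefined;
  let inFailureModes = false;

  for (const raw of markdown.split("\n")) {
    const line = raw.trim();

    const header = /^## Goal:\s*(.+)$/.exec(line);
    if (header) {
      current = { title: header[1] ?? "", failureModes: [], subtasks: [] };
      goals.push(current);
      inFailureModes = false;
      continue;
    }
    // Any other heading (for example "## Log") ends the current goal.
    if (line.startsWith("#")) {
      current = undefined;
      continue;
    }
    if (!current) continue;

    const id = /^<!--\s*id:\s*(\S+)\s*-->$/.exec(line);
    if (id) {
      current.id = id[1] ?? "";
      continue;
    }
    // Checkbox bullets are subtasks; they also end the failure_modes list.
    const subtask = /^- \[( |x)\]\s+(.+)$/.exec(line);
    if (subtask) {
      inFailureModes = false;
      current.subtasks.push({ text: subtask[2] ?? "", done: subtask[1] === "x" });
      continue;
    }
    const field = /^(status|done_when|verify|failure_modes):\s*(.*)$/.exec(line);
    if (field) {
      const key = field[1] ?? "";
      const value = field[2] ?? "";
      inFailureModes = key === "failure_modes";
      if (key === "status" && isStatus(value)) current.status = value;
      if (key === "done_when") current.doneWhen = value;
      if (key === "verify") current.verify = value;
      continue;
    }
    // Plain bullets directly under failure_modes: are pre-mortem entries.
    if (inFailureModes && line.startsWith("- ")) {
      current.failureModes.push(line.slice(2));
    }
  }
  return goals;
}

const sample = `# Plan: ship the cache layer
## Goal: Implement cache layer
<!-- id: cache-layer-1 -->
status: active
done_when: p95 < 50ms on bench-X. If wrong: timeouts in load-test.log
verify: pytest tests/cache -q && python bench/p95.py --max-ms 50
failure_modes:
- cache silently bypassed (hit-rate ~0, latency ok by luck)
- bench too small to exercise eviction
- [x] wire cache client
- [ ] eviction policy
## Log
- 2026-06-15 14:02 cache client wired; eviction next`;

console.log(JSON.stringify(parseGoals(sample), null, 2));
```

The sketch treats ## Log as the end of the goal, so log entries are never mistaken for failure modes or subtasks.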
The sign-off check (CompleteGoal)
CompleteGoal(goal_id, evidence, paths?) is the one blessed completion path:
- If the goal has a verify: command, it is run. A non-zero exit rejects immediately, with no model call.
- Otherwise (verify passed, or the goal has none) a read-only pi subprocess (the judge) inspects the evidence against the repo and the named failure modes and returns a verdict. It re-derives from the artifacts you point it at rather than trusting the claim, so point evidence/paths at durable artifacts (saved logs, committed diffs, files).
- On accept, the goal's status flips to done and a ## Log line is written. On reject, the goal stays open and the agent is told what is missing.
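The two-stage order above (deterministic verify first, model judge second) can be sketched as follows. This is a hedged illustration, not the extension's implementation: Verdict, GoalToCheck, RunCommand, Judge, and signOff are hypothetical names, and both the command runner and the judge are passed in as plain functions so the control flow can be shown without a shell or a model.

```typescript
// Hypothetical sketch of the sign-off order; not the extension's real code.
interface Verdict {
  accepted: boolean;
  reason: string;
}

interface GoalToCheck {
  id: string;
  verify?: string;
  failureModes: string[];
}

// Assumed shapes: a runner returning the exit code (null if killed by a signal),
// and a judge standing in for the read-only pi subprocess.
type RunCommand = (command: string) => number | null;
type Judge = (goal: GoalToCheck, evidence: string) => Verdict;

function signOff(
  goal: GoalToCheck,
  evidence: string,
  runCommand: RunCommand,
  judge: Judge,
): Verdict {
  // Stage 1: a failing verify command rejects without calling the judge.
  if (goal.verify !== undefined) {
    const exitCode = runCommand(goal.verify);
    if (exitCode !== 0) {
      return { accepted: false, reason: `verify exited with code ${exitCode}` };
    }
  }
  // Stage 2: the judge weighs the evidence against the named failure modes.
  return judge(goal, evidence);
}

// Usage with stand-in functions: verify fails, so the judge is never consulted.
const failing = signOff(
  { id: "cache-layer-1", verify: "pytest tests/cache -q", failureModes: [] },
  "see load-test.log",
  () => 1,
  () => ({ accepted: true, reason: "stub judge" }),
);
console.log(failing.accepted, failing.reason);
```

Injecting the runner and the judge keeps the cheap, deterministic stage in front, so a model call is spent only on goals whose verify command already passes or is absent.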
The judge defaults to your current model (guaranteed authorized and capable). Set a different one
with /plan judge <provider/model> for an independent cross-family check.
Prompts
All model-facing text lives in src/prompts.ts, in flow order, so the process is
easy to review end to end.
Develop
pi -e ./src/index.ts # load locally
npm test # vitest: parser + sign-off record logic
npm run typecheck
npm run lint
Not (yet) included
No autonomous re-prompt loop (an until-done-style loop judge). Autonomy comes from the reminder, not a harness. Plan-phase model stickiness is a documented next step.
License
MIT