mirror of
https://github.com/wassname/pi-plan.git
synced 2026-06-27 18:05:50 +08:00
861b2ea157
The drafting prompt over-decomposed: one goal per item, long run-on done_when (criterion + failure symptom in one line), and 3 mandatory failure_modes. Plans came out verbose and hard to read.

- planDrafting: default to ONE goal; add another only for a genuinely separate checkpoint; near-identical items become subtasks. Subtasks only for 3+ step goals. Don't invent phases. (granularity heuristic adapted from tintinweb/pi-tasks when-to/when-not guidance)
- done_when: one falsifiable check, no embedded "if wrong" clause (the failure symptom belongs in failure_modes)
- failure_modes: 0-2 terse items, optional
- Sync the stale done_when wording in README and plan-file.ts comment

Co-Authored-By: Claudypoo <288921227+claudypoo@users.noreply.github.com>

# pi-plan

A [pi](https://github.com/badlogic/pi-mono) extension for plan-driven, goal-tracked work in one
`plan.md`. Set up goals (with evidence and failure modes) in plan mode, work them, and sign a goal
off only when a read-only subagent has checked the evidence.

Successor to [pi-lgtm](https://github.com/wassname/pi-lgtm), kept deliberately small. It is roughly
[burneikis/pi-plan](https://github.com/burneikis/pi-plan) plus four additions: goals with evidence,
a sign-off check, a widget, and a reminder.

The form guides; it does not gate. The agent edits `plan.md` with its normal Edit tool. The one
blessed tool is `CompleteGoal`, which runs the sign-off check and records the result. The reminder,
the injected plan summary, and git/widget visibility carry the process. The extension trusts the
agent's judgement rather than guarding it.

## Install

```bash
pi install npm:@wassname2/pi-plan
```

Or run without installing:

```bash
pi -e npm:@wassname2/pi-plan
```

## Use

```
/plan add CSV export to the report view
```

1. Plan. The agent explores read-only and writes goals into `plan.md` (see format below).
2. Review. You get a menu: Ready, Edit (ask the agent to revise), Open in `$EDITOR`, or Cancel.
   On Ready you choose whether to keep the current context or start fresh and compacted.
3. Work. Each turn the active goal is injected (so it survives compaction) and a reminder nudges
   the agent to keep `plan.md` current and work autonomously. When a goal's `done_when` is met the
   agent calls `CompleteGoal`, which runs `verify` and a read-only judge and, on accept, marks it
   done and logs it.

Other commands: `/plan` (print the plan), `/plan clear` (empty `plan.md`, history kept in git),
`/plan judge <model-ref>` (use a specific model for the sign-off judge; default is your current
model).

## plan.md format

One file holds the objective, the goals, and a short append-only log.

```markdown
# Plan: ship the cache layer

## Goal: Implement cache layer
<!-- id: cache-layer-1 -->
status: active
done_when: p95 < 50ms on bench-X. If wrong: timeouts in load-test.log
verify: pytest tests/cache -q && python bench/p95.py --max-ms 50
failure_modes:
- cache silently bypassed (hit-rate ~0, latency ok by luck)
- bench too small to exercise eviction
- [x] wire cache client
- [ ] eviction policy

## Log
- 2026-06-15 14:02 cache client wired; eviction next
```

- A goal is a `## Goal:` header followed by an `<!-- id: ... -->` comment, a `status:`
  (`open` | `active` | `done` | `cancelled`), one falsifiable `done_when:`, an optional `verify:`
  shell command, an optional short `failure_modes:` pre-mortem list, and `- [ ]` subtasks.
- `done_when` names the evidence that distinguishes real success from a subtle failure. `verify`,
  when present, is the deterministic first stage of the sign-off check.
- The agent ticks subtasks, appends to `## Log`, and sets `status` as it works. Multiple goals may
  be `active`.

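The format above is simple enough to read line by line. The following is a minimal sketch of such a reader, not the parser shipped in this repo: the `Goal` and `Subtask` types and the `parseGoals` function are hypothetical names, and the sketch assumes every field sits on a single line as in the example.

```typescript
// Hypothetical types and parser, for illustration only; not the code in this repo.
type Status = "open" | "active" | "done" | "cancelled";

interface Subtask {
  text: string;
  done: boolean;
}

interface Goal {
  title: string;
  id?: string;
  status?: Status;
  doneWhen?: string;
  verify?: string;
  failureModes: string[];
  subtasks: Subtask[];
}

const STATUSES: readonly string[] = ["open", "active", "done", "cancelled"];

function parseGoals(markdown: string): Goal[] {
  const goals: Goal[] = [];
  let current: Goal | undefined;
  let inFailureModes = false;

  for (const raw of markdown.split("\n")) {
    const line = raw.trim();

    // Any level-2 heading closes the current goal; only "## Goal:" opens a new one.
    if (line.startsWith("## ")) {
      current = undefined;
      inFailureModes = false;
      if (line.startsWith("## Goal:")) {
        current = { title: line.slice("## Goal:".length).trim(), failureModes: [], subtasks: [] };
        goals.push(current);
      }
      continue;
    }
    if (!current || line === "") continue;

    const id = /^<!--\s*id:\s*(\S+)\s*-->$/.exec(line);
    const task = /^- \[( |x)\] (.+)$/.exec(line);
    const field = /^(status|done_when|verify|failure_modes):\s*(.*)$/.exec(line);

    if (id) {
      current.id = id[1];
    } else if (task) {
      // Checkbox items are subtasks and end any failure_modes list.
      current.subtasks.push({ text: task[2], done: task[1] === "x" });
      inFailureModes = false;
    } else if (field) {
      const [, key, value] = field;
      inFailureModes = key === "failure_modes";
      if (key === "status" && STATUSES.includes(value)) current.status = value as Status;
      if (key === "done_when") current.doneWhen = value;
      if (key === "verify") current.verify = value;
    } else if (inFailureModes && line.startsWith("- ")) {
      // Plain bullets directly under failure_modes: are pre-mortem items.
      current.failureModes.push(line.slice(2));
    }
  }
  return goals;
}

// Usage with a shortened version of the example plan.
const examplePlan = [
  "# Plan: ship the cache layer",
  "",
  "## Goal: Implement cache layer",
  "<!-- id: cache-layer-1 -->",
  "status: active",
  "done_when: p95 < 50ms on bench-X",
  "failure_modes:",
  "- bench too small to exercise eviction",
  "- [x] wire cache client",
  "- [ ] eviction policy",
  "",
  "## Log",
  "- 2026-06-15 14:02 cache client wired; eviction next",
].join("\n");

const [goal] = parseGoals(examplePlan);
console.log(goal?.id, goal?.status, goal?.subtasks.length); // cache-layer-1 active 2
```

Lines under `## Log` are ignored here because any level-2 heading other than `## Goal:` ends the current goal.
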
## The sign-off check (`CompleteGoal`)

`CompleteGoal(goal_id, evidence, paths?)` is the one blessed completion path:

1. If the goal has a `verify:` command, it is run. A non-zero exit rejects immediately, with no model
   call.
2. If `verify` passes or is absent, a read-only `pi` subprocess (the judge) inspects the evidence
   against the repo and the named failure modes and returns a verdict. It re-derives the result from
   the artifacts you point it at rather than trusting the claim, so point `evidence`/`paths` at
   durable artifacts (saved logs, committed diffs, files).
3. On accept, the goal's `status` flips to `done` and a `## Log` line is written. On reject, the
   goal stays open and the agent is told what is missing.

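The ordering of the two stages can be sketched as a small function. This is an illustration under stated assumptions, not the extension's implementation: `GoalCheck`, `Verdict`, `RunCommand`, `Judge`, and `signOff` are hypothetical names, the command runner and judge are passed in as plain functions, and everything is synchronous for brevity (a real subprocess call would likely be asynchronous).

```typescript
// Hypothetical shapes for illustration; the extension's real types may differ.
interface GoalCheck {
  id: string;
  verify?: string; // optional shell command
  failureModes: string[];
}

type Verdict =
  | { accepted: true; stage: "judge" }
  | { accepted: false; stage: "verify" | "judge"; missing: string };

// Injected so the sketch does not depend on how commands or the judge are actually run.
type RunCommand = (command: string) => number; // returns the exit code
type Judge = (
  goal: GoalCheck,
  evidence: string,
  paths: string[],
) => { accept: boolean; missing?: string };

function signOff(
  goal: GoalCheck,
  evidence: string,
  paths: string[],
  run: RunCommand,
  judge: Judge,
): Verdict {
  // Stage 1: deterministic. A failing verify rejects without any model call.
  if (goal.verify !== undefined) {
    const code = run(goal.verify);
    if (code !== 0) {
      return { accepted: false, stage: "verify", missing: `verify exited with code ${code}` };
    }
  }

  // Stage 2: the judge runs only when verify passed or is absent.
  const result = judge(goal, evidence, paths);
  return result.accept
    ? { accepted: true, stage: "judge" }
    : { accepted: false, stage: "judge", missing: result.missing ?? "judge gave no reason" };
}

// Usage: a failing verify short-circuits, so the judge below is never called.
const verdict = signOff(
  { id: "cache-layer-1", verify: "pytest tests/cache -q", failureModes: [] },
  "bench output saved to load-test.log",
  ["load-test.log"],
  () => 1, // pretend the verify command failed
  () => {
    throw new Error("judge must not run when verify fails");
  },
);
console.log(verdict.accepted, verdict.stage); // false verify
```

Passing the runner and the judge as parameters keeps the ordering logic testable without spawning a shell or a model.
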
The judge defaults to your current model (guaranteed authorized and capable). Set a different one
with `/plan judge <provider/model>` for an independent cross-family check.

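Step 3 of the check (flip `status` to `done`, write a `## Log` line) amounts to a small text edit on `plan.md`. A minimal sketch, assuming the plan is held as a string and the caller supplies an already formatted log line; `markGoalDone` is a hypothetical helper, not a function from this repo.

```typescript
// Hypothetical helper for illustration; not the code in this repo.
function markGoalDone(plan: string, goalId: string, logLine: string): string {
  const lines = plan.split("\n");

  const idAt = lines.findIndex((l) => l.trim() === `<!-- id: ${goalId} -->`);
  if (idAt === -1) throw new Error(`no goal with id ${goalId}`);

  // Flip the first status: line after the id marker, stopping at the next level-2 heading.
  for (let i = idAt + 1; i < lines.length && !lines[i].startsWith("## "); i++) {
    if (lines[i].startsWith("status:")) {
      lines[i] = "status: done";
      break;
    }
  }

  // Append to the ## Log section, creating the section if it does not exist.
  const logAt = lines.findIndex((l) => l.trim() === "## Log");
  if (logAt === -1) {
    lines.push("", "## Log", logLine);
  } else {
    // Find the end of the Log section, then step back over trailing blank lines
    // so the new entry sits directly under the previous one.
    let end = logAt + 1;
    while (end < lines.length && !lines[end].startsWith("## ")) end++;
    while (end > logAt + 1 && lines[end - 1].trim() === "") end--;
    lines.splice(end, 0, logLine);
  }
  return lines.join("\n");
}

// Usage: the log line text is illustrative.
const before = [
  "## Goal: Implement cache layer",
  "<!-- id: cache-layer-1 -->",
  "status: active",
  "",
  "## Log",
  "- 2026-06-15 14:02 cache client wired; eviction next",
].join("\n");

console.log(markGoalDone(before, "cache-layer-1", "- 2026-06-15 15:10 cache-layer-1 accepted"));
```

Because existing log lines are never rewritten, the log stays append-only, matching the format description.
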
## Prompts

All model-facing text lives in [`src/prompts.ts`](src/prompts.ts), in flow order, so the process is
easy to review end to end.

## Develop

```bash
pi -e ./src/index.ts   # load locally
npm test               # vitest: parser + sign-off record logic
npm run typecheck
npm run lint
```

## Not (yet) included

No autonomous re-prompt loop (an until-done-style loop judge). Autonomy comes from the reminder, not
a harness. Plan-phase model stickiness is a documented next step.

## License

MIT