pi-goals/docs/spec/2026-06-15_pi-plan.md

# pi-plan — design spec

Working title. A pi extension: set up goals (with subtasks and evidence) through plan mode, work them autonomously, and sign a goal off only when a check passes. One markdown file holds everything. The form guides a process; it does not police one. Successor to `pi-lgtm`, deliberately smaller.

Status: draft for review. Names, defaults, field shapes provisional.

---

## 1. Original ask → this spec

| Ask | Mechanism |
|-----|-----------|
| Set up goals + subtasks + evidence via **plan mode** | §3a — plan mode drafts the goal contract, you approve it |
| **Subagent check** of evidence on sign-off | §5, §9 — oracle inside `CompleteGoal` |
| Goals shown in a **task-list widget** | §7 — `/plan` renders goals + subtask checkboxes |
| Store **all in `plan.md`** | §4 — single file, no sidecar store |
| A **small manus-style append log** | §4 — short `## Log` section inside `plan.md` |
| **Typed reminders** to update tasks | §8a — recurring nudge |
| **Work autonomously** toward goals | §3b, §8a — the loop, driven by the reminder |
| Persist through **compaction**; pi-tasks but simpler | §8 injection; minimal tool surface |

---

## 2. Decisions and preferences

This section separates the opinionated forks from the mechanical body (§4 onward).

### 2a. Preferences driving the design

- **Guidance over guardrails.** None of the surveyed extensions hard-enforce. The form (plan.md structure) + the reminder + the prompts guide the agent through a process; the one genuinely rigorous step is the sign-off check; git + widget visibility is the backstop. The agent can edit anything — we make the right path the easy path, not the only path.
- **Anti-complexity.** One file, minimal tools, plain-file editing for anything with no cheat incentive.
- **Reward-hacking / honesty focus.** The sign-off check must resist assertion and test-gaming, not just check a box.
- **Cost-sensitivity (single 3090 / metered API).** KV-cache hygiene, judge-once-per-goal, cheap loop judge.
- **Scout mindset.** Make false completion visible rather than paper over it.

### 2b. Decisions

`decided` = settled; `open` = your call.

| # | Decision | Alternative rejected | Why | Status |
|---|----------|----------------------|-----|--------|
| D1 | **Everything in one `plan.md`** | Separate `.plan/log.jsonl` sidecar | Asked for; simpler, one diff to read | decided |
| D2 | Plan mode **is** the goal-setup-and-agreement phase | Agent-only creation | Approval is where `done_when` + `failure_modes` get agreed before any code | decided |
| D3 | **Guide the process; don't gate it.** The only special path is `CompleteGoal` (the sign-off check) | Pre-tool-use interceptor that blocks `status: done` edits | No surveyed extension enforces at that level; the reminder + form carry it; bypass is visible in git | decided |
| D4 | **Two-stage sign-off check**: deterministic `verify:` then oracle | Oracle only; tests only (Codex) | Tests cannot be faked by assertion but can be gamed; the oracle catches gaming and covers non-test criteria | decided |
| D5 | Two **separate** judges: cheap loop + oracle sign-off | One judge for both | Loop judge reads assertions (foolable, ok); sign-off judge reads artifacts | decided |
| D6 | Sign-off judge = oracle subprocess, **copied not depended** | In-process; pi-subagents | Shell-free spawn dodges noclobber/cropping; copying avoids flaky coupling | decided |
| D7 | Contract tamper-check = **git visibility** | Append-only frozen log | All-in-one-file gives up the hard freeze; git diff + guided sign-off are enough for a single user | decided |
| D8 | Completed goals **archived, not deleted** | Auto-clear after idle | A plan is a durable record | decided |
| D9 | **Goals are flexible: multiple may be `active`** | One active goal forced | Operator wants flexibility; the agent picks focus, injection lists the active set | decided |
| D10 | Loop judge default = main model, tiny prompt | Dedicated cheap aux model | Zero setup; switch if cost bites | open |
| D11 | Sign-off judge default = **the session's current model** | Auto-pick "strongest on provider" (oracle-style) | Current model is guaranteed authorized + capable; provider lists hold dead/weak/unauthorized entries. Cross-vendor is a **setting** (§9) | decided |
| D12 | **Plan-phase model is selectable and sticky** | Always the working model | Plan benefits from a stronger reasoner; persist the choice (oracle.json-style). Optionally the oracle drafts the plan (read-only + strong already) | decided |
| D13 | **Offer to compact after plan accepted** | Always fresh session (burneikis); or never | Some runs want a clean execution context, some want to keep it. Make it a post-Ready choice | decided |

### 2c. Cuts (non-goals)

- DAG / `blocks` edges.
- Parallel subagent execution (the flaky part).
- `findings.md`.
- Hard pre-tool-use enforcement (D3).
- Sign-off judge every turn (cost).

---

## 3. Two phases: setup, then execution

### 3a. Setup — plan mode

Goals are created and *agreed* through plan mode (burneikis-style). Stock plan mode; the deltas are the output format and the hand-off.

1. `/plan <objective>` enters plan mode. The agent explores read-only and drafts goals into `plan.md` in the contract format (§4). This phase runs on the **plan-phase model** (selectable + sticky, D12; optionally the read-only oracle drafts it).
2. You review: **Ready** / **Edit** (NL rewrite) / **$EDITOR** (hand-edit) / **Cancel**. The agreement point — you sanity-check `done_when` and `failure_modes` before any code.
3. On **Ready**, offer **compact context? (y/n)** (D13). Yes → execution starts in a cleared context with the approved `plan.md` re-injected. No → execution continues in the same context.

Direct `plan.md` edits remain a quick-add path for a one-off goal.

### 3b. Execution — the loop ↔ check cycle

Multiple goals may be `active`; the agent works whichever it's focused on, in the order it judges best.

1. The session works an `active` goal under an iteration budget (or `/goal` (re)starts the loop on the current plan).
2. Each turn, the **loop judge** reads the agent's last response → continue/pause (fail-open; the **budget is the real backstop**).
3. When the agent judges a goal done, the reminder steers it to call `CompleteGoal` (not hand-tick `status`).
4. `CompleteGoal` runs the **two-stage check**:
   - **reject** → `missing[]` fed back; work continues toward the gap.
   - **accept** → goal marked done; the agent moves to another active/open goal, or the loop stops.

The loop judge can be fooled (reads assertions); worst case is a premature pause, caught by you or the budget. The sign-off check re-derives from artifacts, so it is not fooled cheaply. That asymmetry is the point.
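
The per-turn decision can be sketched as a pure function. This is a minimal sketch, not the extension's actual control flow: the names `LoopState`, `JudgeVerdict`, and `nextAction` are hypothetical, and it assumes the budget is counted in turns. It shows the two properties the cycle relies on: the budget is the only hard stop, and a broken or fooled loop judge can at worst pause the loop, never sign a goal off.

```typescript
// Hypothetical shapes. The spec only fixes "continue/pause", "fail-open",
// and "the budget is the real backstop".
type JudgeVerdict = { done: boolean; reason: string };
type LoopAction = { kind: "continue" } | { kind: "pause"; why: string };

interface LoopState {
  turnsUsed: number; // turns spent in the current loop (assumed unit)
  budget: number; // iteration budget
}

// Decide what the loop does after one agent turn.
// `verdict` is null when the loop judge errored or returned unparseable output.
function nextAction(state: LoopState, verdict: JudgeVerdict | null): LoopAction {
  // Budget first: it is the only hard stop.
  if (state.turnsUsed >= state.budget) {
    return { kind: "pause", why: "iteration budget exhausted" };
  }
  // Fail-open: a broken judge never halts the loop.
  if (verdict === null) return { kind: "continue" };
  // A "done" verdict only pauses. Sign-off stays with CompleteGoal.
  if (verdict.done) return { kind: "pause", why: verdict.reason };
  return { kind: "continue" };
}
```

Nothing in `LoopAction` can mark a goal done, which is the asymmetry in code form.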

---

## 4. The one file: `plan.md`

Lives at the cwd root, git-tracked. Holds goals, subtasks, and a short log. The agent maintains all of it through its normal Edit tool, with no separate store machinery.

```markdown
# Plan: <one-line objective>

## Goal: Implement cache layer
<!-- id: cache-layer-1 -->
status: active
done_when: p95 < 50ms on bench-X. If wrong: timeouts in load-test.log
verify: pytest tests/cache -q && python bench/p95.py --max-ms 50
failure_modes:
  - cache silently bypassed (hit-rate ~0, latency ok by luck)
  - bench too small to exercise eviction
  - verify passes on a trivial/gamed test
- [x] wire cache client
- [ ] eviction policy
- [ ] load test

## Goal: ...

## Log
- 2026-06-15 14:02  cache client wired; eviction next
- 2026-06-15 14:31  eviction done; p95 bench reads 47ms (load-test.log)
- 2026-06-15 14:33  cache-layer-1 signed off (verify green, oracle accept)
```

Conventions:

- **Goals carry `status:` and no checkbox; subtasks are `- [ ]`.** `status` ∈ `open | active | done | cancelled`. Multiple goals may be `active` (D9). Subtasks tick freely.
- **`<!-- id -->`** assigned at creation; stable key (survives renaming the subject).
- **`verify:`** (optional) is the deterministic stage-1 command.
- **`failure_modes`** should name "verify could pass while still wrong" whenever a `verify:` exists.
- **`## Log`** is manus-style: append-only **by convention**, one short line per event. The reminder (§8a) prompts for the append; nothing enforces it. Keep it terse: "where it's up to" plus error memory, not a transcript.

Parsing: a line scanner suffices for v0. `mdast` + `remark-gfm` only if it bites. Parse for *reading*; for the rare programmatic write (status flip, checkbox reconcile) use exact-line string patching, never a full AST serialize.
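
A minimal sketch of the v0 line scanner, assuming the layout of the example above: one field per line, `failure_modes` as indented bullets, subtasks as unindented checkboxes. The type and function names (`Plan`, `Goal`, `parsePlan`) are illustrative, not part of the spec. Line indices are recorded so a later write can patch the exact line instead of re-serializing.

```typescript
type Status = "open" | "active" | "done" | "cancelled";
interface Subtask { text: string; checked: boolean; line: number }
interface Goal {
  title: string;
  id: string | null;
  status: Status | null;
  statusLine: number | null; // 0-based index, kept for exact-line patching
  doneWhen: string | null;
  verify: string | null;
  failureModes: string[];
  subtasks: Subtask[];
}
interface Plan { goals: Goal[]; log: string[] }

function parsePlan(source: string): Plan {
  const plan: Plan = { goals: [], log: [] };
  const lines = source.split("\n");
  let goal: Goal | null = null;
  let inLog = false;
  let inFailureModes = false;

  for (let i = 0; i < lines.length; i++) {
    const text = lines[i].trimEnd();
    let m: RegExpMatchArray | null;

    if ((m = text.match(/^## Goal:\s*(.*)$/))) {
      goal = { title: m[1], id: null, status: null, statusLine: null,
               doneWhen: null, verify: null, failureModes: [], subtasks: [] };
      plan.goals.push(goal);
      inLog = false;
      inFailureModes = false;
      continue;
    }
    if (/^## Log\s*$/.test(text)) { goal = null; inLog = true; continue; }
    if (/^#{1,6} /.test(text)) { goal = null; inLog = false; continue; } // any other heading

    if (inLog) {
      if ((m = text.match(/^- (.*)$/))) plan.log.push(m[1]);
      continue;
    }
    if (!goal) continue;

    // Indented bullets directly under `failure_modes:` belong to that field.
    if (inFailureModes && (m = text.match(/^\s+- (.*)$/))) {
      goal.failureModes.push(m[1]);
      continue;
    }
    inFailureModes = false;

    if ((m = text.match(/^<!-- id: (\S+) -->$/))) goal.id = m[1];
    else if ((m = text.match(/^status:\s*(open|active|done|cancelled)$/))) {
      goal.status = m[1] as Status;
      goal.statusLine = i;
    } else if ((m = text.match(/^done_when:\s*(.*)$/))) goal.doneWhen = m[1];
    else if ((m = text.match(/^verify:\s*(.*)$/))) goal.verify = m[1];
    else if (/^failure_modes:\s*$/.test(text)) inFailureModes = true;
    else if ((m = text.match(/^- \[([ xX])\] (.*)$/))) {
      goal.subtasks.push({ text: m[2], checked: m[1] !== " ", line: i });
    }
  }
  return plan;
}
```

Unknown lines are skipped rather than rejected, which fits the guidance-over-guardrails stance: a hand-edited `plan.md` with extra prose still parses.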

---

## 5. Tools

`CompleteGoal` is the one blessed path (it runs the check and records it). Everything else — create goal, edit plan, tick subtasks, append to log — is plain Edit, guided by the reminder.

### `CompleteGoal(id, evidence, paths[])` — the sign-off check

1. Read `done_when` + `verify` + `failure_modes` for the goal from `plan.md` (git diff is the tamper-check, D7).
2. **Evidence must point to durable artifacts** the read-only judge can inspect (saved logs, committed diffs, files). Ephemeral claims fail stage 2.
3. **Stage 1 — deterministic.** If `verify` exists, run it shell-free, capture exit + output tail. Non-zero → reject immediately, return the tail. No model call spent.
4. **Stage 2 — oracle.** Spawn the read-only judge (D11 default = current model; §9) with the criterion, failure modes, evidence, and verify result; it inspects the repo and checks the verify command was not gamed against the named failure modes.
5. Verdict: **accept** → string-patch `status: done`, append a `## Log` line. **reject** → status stays `active`, append `missing[]` to `## Log`, return `missing`.
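
Stage 1 can be sketched as below. One wrinkle: the example `verify:` in §4 chains two commands with `&&`, which only a shell interprets, so a shell-free runner has to split the chain itself. This sketch assumes naive whitespace tokenization (no quoted arguments) and uses `spawnSync` for brevity; `splitVerify`, `runVerify`, and `VerifyResult` are hypothetical names.

```typescript
import { spawnSync } from "node:child_process";

interface VerifyResult { ok: boolean; exitCode: number | null; tail: string }

// Split `a b && c d` into [["a","b"],["c","d"]].
// Assumption: whitespace tokenization only; quoted arguments are not handled.
function splitVerify(command: string): string[][] {
  return command
    .split("&&")
    .map((step) => step.trim().split(/\s+/).filter((tok) => tok.length > 0))
    .filter((argv) => argv.length > 0);
}

// Run each step without a shell, stopping at the first failure
// (the same short-circuit `&&` would give under a shell).
function runVerify(steps: string[][], tailChars = 2000): VerifyResult {
  let output = "";
  for (const [cmd, ...args] of steps) {
    const res = spawnSync(cmd, args, { shell: false, encoding: "utf8" });
    output += (res.stdout ?? "") + (res.stderr ?? "");
    if (res.error || res.status !== 0) {
      if (res.error) output += String(res.error); // e.g. command not found
      return { ok: false, exitCode: res.status, tail: output.slice(-tailChars) };
    }
  }
  return { ok: true, exitCode: 0, tail: output.slice(-tailChars) };
}
```

A failed `runVerify` returns before any model call, so a red test run costs nothing beyond the subprocess.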

### `CancelGoal(id, reason)` — optional

Moving a goal from open/active to cancelled is not a sign-off, so it skips the check. It exists as a tool only to guarantee that a `## Log` line lands.
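
Both tools end in the same two writes: flip one `status:` line and append one log line. A minimal sketch of the exact-line patching called for in §4, assuming the `status:` line follows the goal's id comment inside the same `## Goal:` section. `patchStatus` and `appendLog` are hypothetical helper names.

```typescript
type Status = "open" | "active" | "done" | "cancelled";

// Flip exactly one `status:` line for the goal with this id.
// Every other line is passed through untouched.
function patchStatus(source: string, id: string, next: Status): string {
  const lines = source.split("\n");
  const idLine = lines.indexOf(`<!-- id: ${id} -->`);
  if (idLine === -1) throw new Error(`goal id not found: ${id}`);
  for (let i = idLine + 1; i < lines.length; i++) {
    if (lines[i].startsWith("## ")) break; // left this goal's section
    if (/^status:\s*\w+\s*$/.test(lines[i])) {
      lines[i] = `status: ${next}`;
      return lines.join("\n");
    }
  }
  throw new Error(`no status line for goal: ${id}`);
}

// Insert one bullet after the last existing bullet of `## Log`.
// Existing entries are never rewritten.
function appendLog(source: string, entry: string): string {
  const lines = source.split("\n");
  const start = lines.indexOf("## Log");
  if (start === -1) throw new Error("no ## Log section");
  let insertAt = start + 1;
  for (let i = start + 1; i < lines.length && !lines[i].startsWith("#"); i++) {
    if (lines[i].startsWith("- ")) insertAt = i + 1;
  }
  lines.splice(insertAt, 0, `- ${entry}`);
  return lines.join("\n");
}
```

Because only one line changes and one is added, the resulting git diff is two hunks at most, which keeps the D7 tamper-check readable.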

---

## 6. Guiding sign-off (no hard gate)

Per D3, there is no pre-tool-use interceptor blocking `status: done`. Sign-off is guided, not gated:

- the **reminder** (§8a) tells the agent to complete a goal through `CompleteGoal`, not by hand-editing status;
- `CompleteGoal` is the obvious, blessed path that runs the check and writes the log line;
- the **widget** (§7) can flag a goal whose `status: done` has no corresponding `## Log` sign-off line — visibility, not a block;
- `plan.md` is git-tracked, so any hand-tick shows in the diff.

The agent *can* bypass it. The bet — borne out by how the other extensions actually run — is that a clear form plus a standing reminder makes the blessed path the path taken, and visibility catches the rare bypass.
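
The widget flag reduces to a cross-check between goal statuses and the log. A minimal sketch, assuming sign-off log lines contain the phrase `<id> signed off` as in the §4 example; that phrase is a convention of the example, not a fixed format, and `hasSignOff` / `unsignedDone` are hypothetical names.

```typescript
interface GoalRef { id: string; status: string }

// True when the entry contains the tokens `<id> signed off` in sequence.
// Token matching avoids `layer-1` matching inside `cache-layer-1`.
function hasSignOff(entry: string, id: string): boolean {
  const tokens = entry.split(/\s+/);
  const i = tokens.indexOf(id);
  return i !== -1 && tokens[i + 1] === "signed" && tokens[i + 2] === "off";
}

// Ids of goals marked done with no matching sign-off line in the log.
function unsignedDone(goals: GoalRef[], log: string[]): string[] {
  return goals
    .filter((g) => g.status === "done")
    .filter((g) => !log.some((entry) => hasSignOff(entry, g.id)))
    .map((g) => g.id);
}
```

The result is only a list to display; nothing here reverts the hand-ticked status.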

---

## 7. Commands

- `/plan <desc>` — **enter plan mode** (§3a): read-only explore → draft goals → review. Ready offers the compact choice, then starts execution.
- `/plan` (no args) — render the **task-list widget**: each goal with status + its subtask checkboxes + "N done hidden"; flag any `done` goal lacking a sign-off log line; offer archive-completed and cancel-goal.
- `/goal` — (re)start the loop on the current plan.
- `/goal pause | resume | clear | status` — loop controls.
- `/subgoal <text>` — append an acceptance criterion to a goal mid-loop. Optional.
- `/judge model <ref>` — set the sign-off judge model (default: current model; set a cross-vendor ref here for stronger independence, §9).
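
A rough sketch of the text the `/plan` widget could render. It reads "N done hidden" as: goals with `status: done` are collapsed into a single count line. That reading is an assumption, as are the bracketed status prefix and the names `WidgetGoal` and `renderWidget`.

```typescript
interface WidgetGoal {
  title: string;
  status: "open" | "active" | "done" | "cancelled";
  subtasks: { text: string; checked: boolean }[];
}

// Render goals with status and subtask checkboxes.
// Done goals are counted, not listed (assumed meaning of "N done hidden").
function renderWidget(goals: WidgetGoal[]): string {
  const out: string[] = [];
  let hidden = 0;
  for (const g of goals) {
    if (g.status === "done") {
      hidden++;
      continue;
    }
    out.push(`[${g.status}] ${g.title}`);
    for (const t of g.subtasks) {
      out.push(`  - [${t.checked ? "x" : " "}] ${t.text}`);
    }
  }
  if (hidden > 0) out.push(`(${hidden} done hidden)`);
  return out.join("\n");
}
```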

---

## 8. Hooks / lifecycle

- **`before_agent_start`** — parse `plan.md`; inject a fixed-shape summary (active goals + focus + last log line) as a late **user-role** message. Compaction-persistence.
- **reminder** — §8a.
- **pre-compact** — flush state to `plan.md` before compaction.

(No pre-tool-use gate — D3.)

### 8a. The reminder (typed; what it says)

Fires when a goal is `active` and there have been **N file-modifying turns since the last `plan.md` update**. One `<system-reminder>` covering both task upkeep and goal progress:

- **task** — tick completed subtask checkboxes; add new ones discovered.
- **log** — append **one short line** to `## Log` (append, don't rewrite).
- **goal** — if a goal's evidence is in, **sign it off via `CompleteGoal`** — don't hand-tick `status: done`.
- **autonomy** — keep working toward an active goal; don't stop to ask unless genuinely blocked.

The reminder is both the housekeeping and the autonomy engine and, with no hard gate, the main thing that gets the process followed. Keep the wording stable so it doesn't thrash the cache.
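
The trigger condition can be sketched as a small counter. Assumptions, none fixed by the spec: the host can report which paths a turn modified (as cwd-relative paths), the counter resets after the reminder fires so that the nudge recurs every N turns, and the names `ReminderState`, `TurnInfo`, and `afterTurn` are hypothetical.

```typescript
interface ReminderState { turnsSincePlanUpdate: number }
interface TurnInfo {
  modifiedPaths: string[]; // cwd-relative paths written this turn (assumed input)
  hasActiveGoal: boolean;
}

// Advance the counter by one turn and report whether the reminder fires.
function afterTurn(
  state: ReminderState,
  turn: TurnInfo,
  threshold: number, // the N in "N file-modifying turns"
): { state: ReminderState; fire: boolean } {
  // Any write to plan.md counts as upkeep and resets the counter.
  if (turn.modifiedPaths.includes("plan.md")) {
    return { state: { turnsSincePlanUpdate: 0 }, fire: false };
  }
  // Read-only turns do not count toward N.
  const count = state.turnsSincePlanUpdate + (turn.modifiedPaths.length > 0 ? 1 : 0);
  const fire = turn.hasActiveGoal && count >= threshold;
  // Assumption: reset on fire so the reminder recurs instead of repeating every turn.
  return { state: { turnsSincePlanUpdate: fire ? 0 : count }, fire };
}
```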

---

## 9. Judges

| | Loop judge | Sign-off judge (stage 2) |
|---|---|---|
| Drives | continue / pause each turn | accept / reject a sign-off |
| Cost | cheap, every turn | costly, once per goal |
| Reads | the agent's last response (~4 KB) | the repo, independently |
| Transport | one small model call (D10) | read-only oracle subprocess |
| On failure | fail-open → continue; **budget** is the backstop | fail-closed → goal stays active |
| Foolable? | yes — asserted "done" passes; bounded by budget | hard: re-reads artifacts + runs `verify` |

### Sign-off judge: model choice (D11)

- **Default: the session's current model.** Guaranteed authorized and capable, because you're already running it. Auto-picking "strongest on provider" (oracle-style) is rejected as the default — those lists carry dead, weak, and unauthorized entries.
- **Most of the value is model-independent.** The read-only judge re-derives from artifacts: does the evidence match the repo, is the `verify` tautological, is each failure mode actually ruled out. Any capable model does that regardless of family.
- **Cross-vendor is the stronger-independence setting** (`/judge model`), for the residual *shared-reasoning-error* class, when you have a known-good alternative. Mirror the oracle's curated provider list for that override menu; don't auto-select from it.

### Transport (oracle pattern, copied)

- **Shell-free spawn.** `spawn(command, argsArray)`, no `shell:true`; capture stdout via pipe and parse. This avoids the noclobber/cropping pain of `pi -p … > out.json` under zsh. ~40 lines.
- **Read-only toolset.** `read / grep / find / ls`, optional non-mutating `bash`. Separate process = fresh context, no anchoring — the independence you reliably get even from the same model.
- **Verdict contract.** Oracle returns prose by default; impose `VERDICT: accept|reject` + `missing:` in the prompt and parse that block.
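
Parsing the verdict block might look like the sketch below. It assumes the prompt asks for `missing:` followed by `- ` bullets; the spec fixes only the `VERDICT: accept|reject` line and the `missing:` label, so the bullet shape and the names `Verdict` and `parseVerdict` are assumptions. Output without a verdict line is treated as a reject, matching the fail-closed row in the table above.

```typescript
interface Verdict { accept: boolean; missing: string[] }

// Parse the judge's stdout. Anything without a VERDICT line is a reject.
function parseVerdict(output: string): Verdict {
  const lines = output.split("\n").map((l) => l.trim());

  // Take the last VERDICT line, so prose that quotes the format earlier cannot win.
  let verdictAt = -1;
  for (let i = 0; i < lines.length; i++) {
    if (/^VERDICT:\s*(accept|reject)$/i.test(lines[i])) verdictAt = i;
  }
  if (verdictAt === -1) {
    return { accept: false, missing: ["judge returned no VERDICT line"] };
  }
  const accept = /accept$/i.test(lines[verdictAt]);

  // Collect `- ` bullets directly under `missing:` (assumed shape).
  const missing: string[] = [];
  const missingAt = lines.indexOf("missing:", verdictAt);
  if (missingAt !== -1) {
    for (let i = missingAt + 1; i < lines.length && lines[i].startsWith("- "); i++) {
      missing.push(lines[i].slice(2));
    }
  }
  return { accept, missing };
}
```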

---

## 10. `prompts.tsx`

All model-facing text in one file, in flow order (drafted separately):

1. **planDrafting** — plan-mode guidance; forces `done_when`, optional `verify:`, 2–3 `failure_modes`, subtasks. Human approves it.
2. **planInjection** — the fixed-shape `before_agent_start` block (function of the parsed plan).
3. **reminder** — the typed nudge (§8a).
4. **continuation** — Hermes-style "keep going" user-role message.
5. **loopJudge** — conservative, strict JSON `{done, reason}`.
6. **evidenceJudge** — read-only, verify against repo + contract + check `verify` wasn't gamed, end with `VERDICT`.

Prompts 5 and 6 sit adjacent, which puts the cheap-and-foolable vs must-not-be-fooled contrast on one screen.
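
On the consuming side, the loopJudge reply can be parsed with a strict shape check. A minimal sketch, assuming the reply is the bare JSON object with no surrounding prose; `parseLoopJudge` is a hypothetical name. It returns `null` on anything malformed, which the loop treats as "continue" (fail-open, §9).

```typescript
type LoopVerdict = { done: boolean; reason: string };

// Strictly parse `{done, reason}`. Any deviation yields null,
// which the caller maps to "continue".
function parseLoopJudge(reply: string): LoopVerdict | null {
  try {
    const value: unknown = JSON.parse(reply.trim());
    if (typeof value !== "object" || value === null) return null;
    const obj = value as Record<string, unknown>;
    const done = obj["done"];
    const reason = obj["reason"];
    if (typeof done !== "boolean" || typeof reason !== "string") return null;
    return { done, reason };
  } catch {
    return null; // not JSON at all
  }
}
```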

---

## 11. KV-cache hygiene

- Inject as a late **user-role** message, never a system-prompt mutation, so the cached prefix stays valid and a long-running goal costs about the same as an equal number of normal turns.
- Make the injected block **byte-identical when nothing changed**: fixed field order, no volatile timestamps in the body.
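
A minimal sketch of a renderer with that property: the output depends only on the parsed plan, never on the clock or on iteration order. The block layout, the `plan.md state` header, and the names `InjectionInput` and `renderInjection` are hypothetical; how "focus" is chosen is left to the caller.

```typescript
interface ActiveGoal { id: string; title: string }
interface InjectionInput {
  activeGoals: ActiveGoal[];
  focusId: string | null;
  lastLogLine: string | null;
}

// Render the injected summary deterministically:
// fixed field order, goals sorted by id, no "now" timestamp.
function renderInjection(input: InjectionInput): string {
  // Plain code-unit comparison; localeCompare could vary by environment.
  const goals = [...input.activeGoals].sort((a, b) =>
    a.id < b.id ? -1 : a.id > b.id ? 1 : 0,
  );
  const lines = [
    "plan.md state",
    "active:",
    ...goals.map((g) => `- ${g.id}: ${g.title}`),
    `focus: ${input.focusId ?? "none"}`,
    `last_log: ${input.lastLogLine ?? "none"}`,
  ];
  return lines.join("\n");
}
```

Sorting by id means reordering goals inside `plan.md` does not change the injected bytes; only a real state change does.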

---

## 12. Dependencies and what to copy

- **No hard dependency** on `pi-subagents` or the `oracle` extension. Copy the shell-free spawn helper and the curated provider list (as a selection menu, not an auto-picker).
- Markdown: line scanner first; `mdast` + `remark-gfm` only if needed.
- Verify against the current pi API that `before_agent_start` can append a user-role message without mutating the system prompt, and that the plan-phase model can be set per-phase and persisted.

---

## 13. Risks / open questions

- **Same-model sign-off judge → correlated blind spots** (the D11 tradeoff). Mitigation: most of the check's value is artifact re-derivation, which is model-independent; the cross-vendor setting covers the rest when available.
- **No hard gate (D3)** — the agent can hand-tick `status: done` and skip the check. Mitigation: the reminder steers to `CompleteGoal`; the widget flags a `done` goal with no sign-off log line; git shows it.
- **Contract tampering (D7)** — editable `plan.md` means `done_when`/`failure_modes` can be softened pre-sign-off. Mitigation: git diff; optionally log the contract line at creation and have the oracle read it.
- **Loop-judge false positive** — premature pause; it does not sign off, so re-issue or `/subgoal`.
- **`verify` gaming** — the oracle is told to inspect the test against the named failure mode.
- **`## Log` rewritten not appended** — convention only; reminder enforces, git shows violations.
- **Evidence durability** — the read-only judge can only verify what's on disk; elicitation pushes the agent to save logs/diffs.

---

## 14. Build order

Each step independently testable; model calls enter late.

1. `plan.md` format + line parser (incl. `<!-- id -->` and `## Log`) + `/plan` task-list widget. Pure file, no model calls.
2. Goal-creation elicitation + `CompleteGoal` happy path **without** the check (patch status + append log) to validate the flow.
3. Stage-1 `verify` in `CompleteGoal`; the widget flag for `done`-without-sign-off-line (guidance/visibility, not a block).
4. Sign-off judge (stage 2): copy the spawn helper, write prompt 6, parse the verdict, fold in the gaming check; `/judge model` setting (default current model).
5. `before_agent_start` injection (cache-safe) + the reminder (§8a).
6. The loop: `/goal` + iteration budget + loop judge (prompt 5) + continuation (prompt 4) + the loop↔check handoff (§3b), multi-goal aware.
7. Plan mode (§3a): `/plan <desc>` read-only draft → review → compact choice → hand-off. Plan-phase model selection + stickiness (D12). (Until built, create goals by direct `plan.md` edit.)
8. Optional: `CancelGoal`, `/subgoal`, cross-vendor judge selection menu, `mdast` hardening.

`prompts.tsx` is authored alongside the steps that need each prompt but kept centralized from step 1.