# pi-goals — design spec

Working title. A pi extension: set up goals (with subtasks and evidence) through plan mode, work them autonomously, and sign a goal off only when a check passes. One markdown file holds everything. The form guides a process; it does not police one. Successor to `pi-lgtm`, deliberately smaller.

Status: draft for review. Names, defaults, field shapes provisional.

---

## 1. Original ask → this spec

| Ask | Mechanism |
|-----|-----------|
| Set up goals + subtasks + evidence via **plan mode** | §3a — plan mode drafts the goal contract, you approve it |
| **Subagent check** of evidence on sign-off | §5, §9 — oracle inside `CompleteGoal` |
| Goals shown in a **task-list widget** | §7 — `/plan` renders goals + subtask checkboxes |
| Store **all in `plan.md`** | §4 — single file, no sidecar store |
| A **small manus-style append log** | §4 — short `## Log` section inside `plan.md` |
| **Typed reminders** to update tasks | §8a — recurring nudge |
| **Work autonomously** toward goals | §3b, §8a — the loop, driven by the reminder |
| Persist through **compaction**; pi-tasks but simpler | §8 injection; minimal tool surface |

---

## 2. Decisions and preferences

Separates the opinionated forks from the mechanical body (§4 on).

### 2a. Preferences driving the design

- **Guidance over guardrails.** None of the surveyed extensions hard-enforce. The form (plan.md structure) + the reminder + the prompts guide the agent through a process; the one genuinely rigorous step is the sign-off check; git + widget visibility is the backstop. The agent can edit anything — we make the right path the easy path, not the only path.
- **Anti-complexity.** One file, minimal tools, plain-file editing for anything with no cheat incentive.
- **Reward-hacking / honesty focus.** The sign-off check must resist assertion and test-gaming, not just check a box.
- **Cost-sensitivity (single 3090 / metered API).** KV-cache hygiene, judge-once-per-goal, cheap loop judge.
- **Scout mindset.** Make false completion visible rather than paper over it.

### 2b. Decisions

`[decided]` = settled; `[open]` = your call.

| # | Decision | Alternative rejected | Why | Status |
|---|----------|----------------------|-----|--------|
| D1 | **Everything in one `plan.md`** | Separate `.plan/log.jsonl` sidecar | Asked for; simpler, one diff to read | decided |
| D2 | Plan mode **is** the goal-setup-and-agreement phase | Agent-only creation | Approval is where `done_when` + `failure_modes` get agreed before any code | decided |
| D3 | **Guide the process; don't gate it.** The only special path is `CompleteGoal` (the sign-off check) | Pre-tool-use interceptor that blocks `status: done` edits | No surveyed extension enforces at that level; the reminder + form carry it; bypass is visible in git | decided |
| D4 | **Two-stage sign-off check**: deterministic `verify:` then oracle | Oracle only; tests only (Codex) | Tests unfakeable-by-assertion but gameable; oracle catches gaming + non-test criteria | decided |
| D5 | Two **separate** judges: cheap loop + oracle sign-off | One judge for both | Loop judge reads assertions (foolable, ok); sign-off judge reads artifacts | decided |
| D6 | Sign-off judge = oracle subprocess, **copied not depended** | In-process; pi-subagents | Shell-free spawn dodges noclobber/cropping; copying avoids flaky coupling | decided |
| D7 | Contract tamper-check = **git visibility** | Append-only frozen log | All-in-one-file gives up the hard freeze; git diff + guided sign-off are enough for a single user | decided |
| D8 | Completed goals **archived, not deleted** | Auto-clear after idle | A plan is a durable record | decided |
| D9 | **Goals are flexible: multiple may be `active`** | One active goal forced | Operator wants flexibility; the agent picks focus, injection lists the active set | decided |
| D10 | Loop judge default = main model, tiny prompt | Dedicated cheap aux model | Zero setup; switch if cost bites | open |
| D11 | Sign-off judge default = **the session's current model** | Auto-pick "strongest on provider" (oracle-style) | Current model is guaranteed authorized + capable; provider lists hold dead/weak/unauthorized entries. Cross-vendor is a **setting** (§9) | decided |
| D12 | **Plan-phase model is selectable and sticky** | Always the working model | Plan benefits from a stronger reasoner; persist the choice (oracle.json-style). Optionally the oracle drafts the plan (read-only + strong already) | decided |
| D13 | **Offer to compact after plan accepted** | Always fresh session (burneikis); or never | Some runs want a clean execution context, some want to keep it. Make it a post-Ready choice | decided |

### 2c. Cuts (non-goals)

DAG / `blocks` edges. Parallel subagent execution (the flaky part). `findings.md`. Hard pre-tool-use enforcement (D3). Sign-off judge every turn (cost).

---

## 3. Two phases: setup, then execution

### 3a. Setup — plan mode

Goals are created and *agreed* through plan mode (burneikis-style). Stock plan mode; the deltas are the output format and the hand-off.

1. `/plan ` enters plan mode. The agent explores read-only and drafts goals into `plan.md` in the contract format (§4). This phase runs on the **plan-phase model** (selectable + sticky, D12; optionally the read-only oracle drafts it).
2. You review: **Ready** / **Edit** (NL rewrite) / **$EDITOR** (hand-edit) / **Cancel**. The agreement point — you sanity-check `done_when` and `failure_modes` before any code.
3. On **Ready**, offer **compact context? (y/n)** (D13). Yes → execution starts in a cleared context with the approved `plan.md` re-injected. No → execution continues in the same context.

Direct `plan.md` edits remain a quick-add path for a one-off goal.

### 3b. Execution — the loop ↔ check cycle

Multiple goals may be `active`; the agent works whichever it's focused on, in the order it judges best.

1. The session works an `active` goal under an iteration budget (or `/goal` (re)starts the loop on the current plan).
2. Each turn, the **loop judge** reads the agent's last response → continue/pause (fail-open; the **budget is the real backstop**).
3. When the agent judges a goal done, the reminder steers it to call `CompleteGoal` (not hand-tick `status`).
4. `CompleteGoal` runs the **two-stage check**:
   - **reject** → `missing[]` fed back; work continues toward the gap.
   - **accept** → goal marked done; the agent moves to another active/open goal, or the loop stops.

The loop judge can be fooled (reads assertions); worst case is a premature pause, caught by you or the budget. The sign-off check re-derives from artifacts, so it is not fooled cheaply. That asymmetry is the point.

---

## 4. The one file: `plan.md`

cwd root, git-tracked. Goals, subtasks, and a short log. The agent maintains all of it through its normal Edit tool — no separate store machinery.

```markdown
# Plan:

## Goal: Implement cache layer
status: active
done_when: p95 < 50ms on bench-X. If wrong: timeouts in load-test.log
verify: pytest tests/cache -q && python bench/p95.py --max-ms 50
failure_modes:
  - cache silently bypassed (hit-rate ~0, latency ok by luck)
  - bench too small to exercise eviction
  - verify passes on a trivial/gamed test
- [x] wire cache client
- [ ] eviction policy
- [ ] load test

## Goal: ...

## Log
- 2026-06-15 14:02 cache client wired; eviction next
- 2026-06-15 14:31 eviction done; p95 bench reads 47ms (load-test.log)
- 2026-06-15 14:33 cache-layer-1 signed off (verify green, oracle accept)
```

Conventions:

- **Goals carry `status:` and no checkbox; subtasks are `- [ ]`.** `status` ∈ `open | active | done | cancelled`. Multiple goals may be `active` (D9). Subtasks tick freely.
- **``** assigned at creation; stable key (survives renaming the subject).
- **`verify:`** (optional) is the deterministic stage-1 command.
- **`failure_modes`** should name "verify could pass while still wrong" whenever a `verify:` exists.
- **`## Log`** is manus-style: append-only **by convention**, one short line per event. The reminder (§8a) enforces appending. Terse — "where it's up to" + error memory, not a transcript.

Parsing: a line scanner suffices for v0. `mdast` + `remark-gfm` only if it bites. Parse for *reading*; for the rare programmatic write (status flip, checkbox reconcile) use exact-line string patching, never a full AST serialize.

---

## 5. Tools

`CompleteGoal` is the one blessed path (it runs the check and records it). Everything else — create goal, edit plan, tick subtasks, append to log — is plain Edit, guided by the reminder.

### `CompleteGoal(id, evidence, paths[])` — the sign-off check

1. Read `done_when` + `verify` + `failure_modes` for the goal from `plan.md` (git diff is the tamper-check, D7).
2. **Evidence must point to durable artifacts** the read-only judge can inspect (saved logs, committed diffs, files). Ephemeral claims fail stage 2.
3. **Stage 1 — deterministic.** If `verify` exists, run it shell-free, capture exit + output tail. Non-zero → reject immediately, return the tail. No model call spent.
4. **Stage 2 — oracle.** Spawn the read-only judge (D11 default = current model; §9) with the criterion, failure modes, evidence, and verify result; it inspects the repo and checks the verify command was not gamed against the named failure modes.
5. Verdict: **accept** → string-patch `status: done`, append a `## Log` line. **reject** → status stays `active`, append `missing[]` to `## Log`, return `missing`.

### `CancelGoal(id, reason)` — optional

open/active → cancelled is not a sign-off, so it skips the check. A tool only to guarantee a `## Log` line lands.

---

## 6. Guiding sign-off (no hard gate)

Per D3, there is no pre-tool-use interceptor blocking `status: done`.
Sign-off is guided, not gated:

- the **reminder** (§8a) tells the agent to complete a goal through `CompleteGoal`, not by hand-editing status;
- `CompleteGoal` is the obvious, blessed path that runs the check and writes the log line;
- the **widget** (§7) can flag a goal whose `status: done` has no corresponding `## Log` sign-off line — visibility, not a block;
- `plan.md` is git-tracked, so any hand-tick shows in the diff.

The agent *can* bypass it. The bet — borne out by how the other extensions actually run — is that a clear form plus a standing reminder makes the blessed path the path taken, and visibility catches the rare bypass.

---

## 7. Commands

- `/plan ` — **enter plan mode** (§3a): read-only explore → draft goals → review. Ready offers the compact choice, then starts execution.
- `/plan` (no args) — render the **task-list widget**: each goal with status + its subtask checkboxes + "N done hidden"; flag any `done` goal lacking a sign-off log line; offer archive-completed and cancel-goal.
- `/goal` — (re)start the loop on the current plan.
- `/goal pause | resume | clear | status` — loop controls.
- `/subgoal ` — append an acceptance criterion to a goal mid-loop. Optional.
- `/judge model ` — set the sign-off judge model (default: current model; set a cross-vendor ref here for stronger independence, §9).

---

## 8. Hooks / lifecycle

- **`before_agent_start`** — parse `plan.md`; inject a fixed-shape summary (active goals + focus + last log line) as a late **user-role** message. Compaction-persistence.
- **reminder** — §8a.
- **pre-compact** — flush state to `plan.md` before compaction.

(No pre-tool-use gate — D3.)

### 8a. The reminder (typed; what it says)

Fires when a goal is `active` and there have been **N file-modifying turns since the last `plan.md` update**. One `` covering both task upkeep and goal progress:

- **task** — tick completed subtask checkboxes; add new ones discovered.
- **log** — append **one short line** to `## Log` (append, don't rewrite).
- **goal** — if a goal's evidence is in, **sign it off via `CompleteGoal`** — don't hand-tick `status: done`.
- **autonomy** — keep working toward an active goal; don't stop to ask unless genuinely blocked.

The reminder is both the housekeeping and the autonomy engine and, with no hard gate, the main thing that gets the process followed. Keep the wording stable so it doesn't thrash the cache.

---

## 9. Judges

| | Loop judge | Sign-off judge (stage 2) |
|---|---|---|
| Drives | continue / pause each turn | accept / reject a sign-off |
| Cost | cheap, every turn | costly, once per goal |
| Reads | the agent's last response (~4 KB) | the repo, independently |
| Transport | one small model call (D10) | read-only oracle subprocess |
| On failure | fail-open → continue; **budget** is the backstop | fail-closed → goal stays active |
| Foolable? | yes — asserted "done" passes; bounded by budget | hard: re-reads artifacts + runs `verify` |

### Sign-off judge: model choice (D11)

- **Default: the session's current model.** Guaranteed authorized and capable, because you're already running it. Auto-picking "strongest on provider" (oracle-style) is rejected as the default — those lists carry dead, weak, and unauthorized entries.
- **Most of the value is model-independent.** The read-only judge re-derives from artifacts: does the evidence match the repo, is the `verify` tautological, is each failure mode actually ruled out. Any capable model does that regardless of family.
- **Cross-vendor is the stronger-independence setting** (`/judge model`), for the residual *shared-reasoning-error* class, when you have a known-good alternative. Mirror the oracle's curated provider list for that override menu; don't auto-select from it.

### Transport (oracle pattern, copied)

- **Shell-free spawn.** `spawn(command, argsArray)`, no `shell:true`; capture stdout via pipe and parse.
This avoids the noclobber/cropping pain of `pi -p … > out.json` under zsh. ~40 lines.
- **Read-only toolset.** `read / grep / find / ls`, optional non-mutating `bash`. Separate process = fresh context, no anchoring — the independence you reliably get even from the same model.
- **Verdict contract.** Oracle returns prose by default; impose `VERDICT: accept|reject` + `missing:` in the prompt and parse that block.

---

## 10. `prompts.tsx`

All model-facing text in one file, in flow order (drafted separately):

1. **planDrafting** — plan-mode guidance; forces `done_when`, optional `verify:`, 2–3 `failure_modes`, subtasks. Human approves it.
2. **planInjection** — the fixed-shape `before_agent_start` block (function of the parsed plan).
3. **reminder** — the typed nudge (§8a).
4. **continuation** — Hermes-style "keep going" user-role message.
5. **loopJudge** — conservative, strict JSON `{done, reason}`.
6. **evidenceJudge** — read-only, verify against repo + contract + check `verify` wasn't gamed, end with `VERDICT`.

Prompts 5 and 6 sit adjacent: the cheap-foolable vs must-not-be-fooled contrast on one screen.

---

## 11. KV-cache hygiene

- Inject as a late **user-role** message, never a system-prompt mutation (a long goal then costs the same as the same number of normal turns).
- Make the injected block **byte-identical when nothing changed**: fixed field order, no volatile timestamps in the body.

---

## 12. Dependencies and what to copy

- **No hard dependency** on `pi-subagents` or the `oracle` extension. Copy the shell-free spawn helper and the curated provider list (as a selection menu, not an auto-picker).
- Markdown: line scanner first; `mdast` + `remark-gfm` only if needed.
- Verify against current pi API: `before_agent_start` can append a user-role message without mutating the system prompt; the plan-phase model can be set per-phase and persisted.

---

## 13. Risks / open questions

- **Same-model sign-off judge → correlated blind spots** (the D11 tradeoff).
Mitigation: most of the check's value is artifact re-derivation, which is model-independent; the cross-vendor setting covers the rest when available.
- **No hard gate (D3)** — the agent can hand-tick `status: done` and skip the check. Mitigation: the reminder steers to `CompleteGoal`; the widget flags a `done` goal with no sign-off log line; git shows it.
- **Contract tampering (D7)** — editable `plan.md` means `done_when`/`failure_modes` can be softened pre-sign-off. Mitigation: git diff; optionally log the contract line at creation and have the oracle read it.
- **Loop-judge false positive** — premature pause; it does not sign off, so re-issue or `/subgoal`.
- **`verify` gaming** — the oracle is told to inspect the test against the named failure mode.
- **`## Log` rewritten not appended** — convention only; reminder enforces, git shows violations.
- **Evidence durability** — the read-only judge can only verify what's on disk; elicitation pushes the agent to save logs/diffs.

---

## 14. Build order

Each step independently testable; model calls enter late.

1. `plan.md` format + line parser (incl. `` and `## Log`) + `/plan` task-list widget. Pure file, no model calls.
2. Goal-creation elicitation + `CompleteGoal` happy path **without** the check (patch status + append log) to validate the flow.
3. Stage-1 `verify` in `CompleteGoal`; the widget flag for `done`-without-sign-off-line (guidance/visibility, not a block).
4. Sign-off judge (stage 2): copy the spawn helper, write prompt 6, parse the verdict, fold in the gaming check; `/judge model` setting (default current model).
5. `before_agent_start` injection (cache-safe) + the reminder (§8a).
6. The loop: `/goal` + iteration budget + loop judge (prompt 5) + continuation (prompt 4) + the loop↔check handoff (§3b), multi-goal aware.
7. Plan mode (§3a): `/plan ` read-only draft → review → compact choice → hand-off. Plan-phase model selection + stickiness (D12). (Until built, create goals by direct `plan.md` edit.)
8. Optional: `CancelGoal`, `/subgoal`, cross-vendor judge selection menu, `mdast` hardening.

`prompts.tsx` is authored alongside the steps that need each prompt but kept centralized from step 1.
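
As a starting point for build step 1, the line scanner described in §4 might look roughly like the sketch below. It is an illustration under assumptions, not the extension's actual parser: the type names (`Goal`, `Plan`, `Subtask`), the `parsePlan` function, and the choice to default a missing or unrecognized `status:` to `open` are all hypothetical, and the goal id line is left out because its exact shape is not fixed here.

```typescript
// Hypothetical types: the spec marks field shapes as provisional.
type GoalStatus = "open" | "active" | "done" | "cancelled";

interface Subtask { text: string; done: boolean; line: number }

interface Goal {
  title: string;
  status: GoalStatus;
  doneWhen?: string;
  verify?: string;
  failureModes: string[];
  subtasks: Subtask[];
  line: number; // 0-based index of the "## Goal:" heading line
}

interface Plan { goals: Goal[]; log: string[] }

function isStatus(v: string): v is GoalStatus {
  return v === "open" || v === "active" || v === "done" || v === "cancelled";
}

// Single pass over the lines; no markdown AST involved.
function parsePlan(text: string): Plan {
  const plan: Plan = { goals: [], log: [] };
  let goal: Goal | undefined;
  let inLog = false;
  let inFailureModes = false;

  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const s = (lines[i] ?? "").trim();
    if (s === "") continue;

    // Any level-2 heading closes the current goal or log section.
    if (s.startsWith("## ")) {
      inLog = s === "## Log";
      inFailureModes = false;
      goal = undefined;
      const heading = /^## Goal:\s*(.*)$/.exec(s);
      if (heading) {
        // Assumption: a goal without a valid status line counts as "open".
        goal = { title: heading[1] ?? "", status: "open", failureModes: [], subtasks: [], line: i };
        plan.goals.push(goal);
      }
      continue;
    }

    if (inLog) {
      if (s.startsWith("- ")) plan.log.push(s.slice(2));
      continue;
    }
    if (!goal) continue;

    // Checkboxes are tested before plain bullets so a subtask is never
    // mistaken for a failure mode.
    const box = /^- \[( |x|X)\] (.*)$/.exec(s);
    if (box) {
      goal.subtasks.push({ text: box[2] ?? "", done: box[1] !== " ", line: i });
      inFailureModes = false;
      continue;
    }
    if (inFailureModes && s.startsWith("- ")) {
      goal.failureModes.push(s.slice(2));
      continue;
    }

    const field = /^(status|done_when|verify|failure_modes):\s*(.*)$/.exec(s);
    if (!field) continue;
    const value = field[2] ?? "";
    inFailureModes = field[1] === "failure_modes";
    if (field[1] === "status" && isStatus(value)) goal.status = value;
    if (field[1] === "done_when") goal.doneWhen = value;
    if (field[1] === "verify") goal.verify = value;
  }
  return plan;
}
```

Each goal and subtask keeps its line index so that a later status flip or checkbox reconcile can patch that exact line as a string, in keeping with the "never a full AST serialize" rule in §4.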