pi-plan/docs/spec/2026-06-15_pi-goals.md
wassname 489f9b8c35 Clean pi-plan references, add judge timeout, fix heading format
- Rename spec doc to 2026-06-15_pi-goals.md, update title
- Update review.md spec reference
- Rename piPlanExtension -> piGoalsExtension in src/index.ts
- Add 120s timeout to judge subprocess (was unbounded, caused hang)
- Change planInjection heading from 'Goals (goals.md):' to '.pi/goals.md:'
- Add FIXMEs for tool label, progress visibility, heading format
2026-06-17 18:09:03 +08:00


pi-goals — design spec

Working title. A pi extension: set up goals (with subtasks and evidence) through plan mode, work them autonomously, and sign a goal off only when a check passes. One markdown file holds everything. The form guides a process; it does not police one. Successor to pi-lgtm, deliberately smaller.

Status: draft for review. Names, defaults, field shapes provisional.


1. Original ask → this spec

| Ask | Mechanism |
| --- | --- |
| Set up goals + subtasks + evidence via plan mode | §3a — plan mode drafts the goal contract, you approve it |
| Subagent check of evidence on sign-off | §5, §9 — oracle inside CompleteGoal |
| Goals shown in a task-list widget | §7 — /plan renders goals + subtask checkboxes |
| Store all in plan.md | §4 — single file, no sidecar store |
| A small manus-style append log | §4 — short ## Log section inside plan.md |
| Typed reminders to update tasks | §8a — recurring nudge |
| Work autonomously toward goals | §3b, §8a — the loop, driven by the reminder |
| Persist through compaction; pi-tasks but simpler | §8 injection; minimal tool surface |

2. Decisions and preferences

This section separates the opinionated forks from the mechanical body (§4 onward).

2a. Preferences driving the design

  • Guidance over guardrails. None of the surveyed extensions hard-enforce. The form (plan.md structure) + the reminder + the prompts guide the agent through a process; the one genuinely rigorous step is the sign-off check; git + widget visibility is the backstop. The agent can edit anything — we make the right path the easy path, not the only path.
  • Anti-complexity. One file, minimal tools, plain-file editing for anything with no cheat incentive.
  • Reward-hacking / honesty focus. The sign-off check must resist assertion and test-gaming, not just check a box.
  • Cost-sensitivity (single 3090 / metered API). KV-cache hygiene, judge-once-per-goal, cheap loop judge.
  • Scout mindset. Make false completion visible rather than paper over it.

2b. Decisions

[decided] = settled; [open] = your call.

| # | Decision | Alternative rejected | Why | Status |
| --- | --- | --- | --- | --- |
| D1 | Everything in one plan.md | Separate .plan/log.jsonl sidecar | Asked for; simpler, one diff to read | decided |
| D2 | Plan mode is the goal-setup-and-agreement phase | Agent-only creation | Approval is where done_when + failure_modes get agreed before any code | decided |
| D3 | Guide the process; don't gate it. The only special path is CompleteGoal (the sign-off check) | Pre-tool-use interceptor that blocks status: done edits | No surveyed extension enforces at that level; the reminder + form carry it; bypass is visible in git | decided |
| D4 | Two-stage sign-off check: deterministic verify: then oracle | Oracle only; tests only (Codex) | Tests unfakeable-by-assertion but gameable; oracle catches gaming + non-test criteria | decided |
| D5 | Two separate judges: cheap loop + oracle sign-off | One judge for both | Loop judge reads assertions (foolable, ok); sign-off judge reads artifacts | decided |
| D6 | Sign-off judge = oracle subprocess, copied not depended | In-process; pi-subagents | Shell-free spawn dodges noclobber/cropping; copying avoids flaky coupling | decided |
| D7 | Contract tamper-check = git visibility | Append-only frozen log | All-in-one-file gives up the hard freeze; git diff + guided sign-off are enough for a single user | decided |
| D8 | Completed goals archived, not deleted | Auto-clear after idle | A plan is a durable record | decided |
| D9 | Goals are flexible: multiple may be active | One active goal forced | Operator wants flexibility; the agent picks focus, injection lists the active set | decided |
| D10 | Loop judge default = main model, tiny prompt | Dedicated cheap aux model | Zero setup; switch if cost bites | open |
| D11 | Sign-off judge default = the session's current model | Auto-pick "strongest on provider" (oracle-style) | Current model is guaranteed authorized + capable; provider lists hold dead/weak/unauthorized entries. Cross-vendor is a setting (§9) | decided |
| D12 | Plan-phase model is selectable and sticky | Always the working model | Plan benefits from a stronger reasoner; persist the choice (oracle.json-style). Optionally the oracle drafts the plan (read-only + strong already) | decided |
| D13 | Offer to compact after plan accepted | Always fresh session (burneikis); or never | Some runs want a clean execution context, some want to keep it. Make it a post-Ready choice | decided |

2c. Cuts (non-goals)

DAG / blocks edges. Parallel subagent execution (the flaky part). findings.md. Hard pre-tool-use enforcement (D3). Sign-off judge every turn (cost).


3. Two phases: setup, then execution

3a. Setup — plan mode

Goals are created and agreed through plan mode (burneikis-style). Stock plan mode; the deltas are the output format and the hand-off.

  1. /plan <objective> enters plan mode. The agent explores read-only and drafts goals into plan.md in the contract format (§4). This phase runs on the plan-phase model (selectable + sticky, D12; optionally the read-only oracle drafts it).
  2. You review: Ready / Edit (NL rewrite) / $EDITOR (hand-edit) / Cancel. The agreement point — you sanity-check done_when and failure_modes before any code.
  3. On Ready, offer compact context? (y/n) (D13). Yes → execution starts in a cleared context with the approved plan.md re-injected. No → execution continues in the same context.

Direct plan.md edits remain a quick-add path for a one-off goal.

3b. Execution — the loop ↔ check cycle

Multiple goals may be active; the agent works whichever it's focused on, in the order it judges best.

  1. The session works an active goal under an iteration budget (or /goal (re)starts the loop on the current plan).
  2. Each turn, the loop judge reads the agent's last response → continue/pause (fail-open; the budget is the real backstop).
  3. When the agent judges a goal done, the reminder steers it to call CompleteGoal (not hand-tick status).
  4. CompleteGoal runs the two-stage check:
    • reject → missing[] fed back; work continues toward the gap.
    • accept → goal marked done; the agent moves to another active/open goal, or the loop stops.

The loop judge can be fooled (reads assertions); worst case is a premature pause, caught by you or the budget. The sign-off check re-derives from artifacts, so it is not fooled cheaply. That asymmetry is the point.
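The cycle above can be sketched as a small driver. This is an illustration under stated assumptions, not the extension's loop: `LoopDeps`, `runLoop`, and the three callbacks are hypothetical names, and the callbacks are synchronous here only to keep the sketch short (real agent turns and judge calls would be async). It shows the two properties the section relies on: the judge fails open, and the iteration budget is the backstop.

```typescript
interface JudgeResult { done: boolean; reason: string }
type StopReason = "judge-pause" | "budget" | "no-active-goals";

// Hypothetical hooks; none of these are pi APIs.
interface LoopDeps {
  activeGoalCount: () => number;                     // e.g. re-read from plan.md each turn
  runTurn: () => string;                             // one agent turn; returns its last response
  loopJudge: (lastResponse: string) => JudgeResult;  // cheap judge; may throw
}

function runLoop(budget: number, deps: LoopDeps): { turns: number; stoppedBy: StopReason } {
  let turns = 0;
  while (turns < budget) {
    if (deps.activeGoalCount() === 0) return { turns, stoppedBy: "no-active-goals" };
    const last = deps.runTurn();
    turns++;
    let verdict: JudgeResult;
    try {
      verdict = deps.loopJudge(last);
    } catch (_err) {
      // Fail-open: a broken judge never stops the loop; the budget does.
      verdict = { done: false, reason: "judge failed; continuing" };
    }
    if (verdict.done) return { turns, stoppedBy: "judge-pause" };
  }
  return { turns, stoppedBy: "budget" };
}
```

Note that a judge pause only stops the loop. It never marks a goal done; that remains the job of CompleteGoal.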


4. The one file: plan.md

Lives at the cwd root, git-tracked. Holds goals, subtasks, and a short log. The agent maintains all of it through its normal Edit tool, with no separate store machinery.

# Plan: <one-line objective>

## Goal: Implement cache layer
<!-- id: cache-layer-1 -->
status: active
done_when: p95 < 50ms on bench-X. If wrong: timeouts in load-test.log
verify: pytest tests/cache -q && python bench/p95.py --max-ms 50
failure_modes:
  - cache silently bypassed (hit-rate ~0, latency ok by luck)
  - bench too small to exercise eviction
  - verify passes on a trivial/gamed test
- [x] wire cache client
- [ ] eviction policy
- [ ] load test

## Goal: ...

## Log
- 2026-06-15 14:02  cache client wired; eviction next
- 2026-06-15 14:31  eviction done; p95 bench reads 47ms (load-test.log)
- 2026-06-15 14:33  cache-layer-1 signed off (verify green, oracle accept)

Conventions:

  • Goals carry status: and no checkbox; subtasks are - [ ]. status is one of open | active | done | cancelled. Multiple goals may be active (D9). Subtasks tick freely.
  • <!-- id --> assigned at creation; stable key (survives renaming the subject).
  • verify: (optional) is the deterministic stage-1 command.
  • failure_modes should name "verify could pass while still wrong" whenever a verify: exists.
  • ## Log is manus-style: append-only by convention, one short line per event. The reminder (§8a) enforces appending. Terse — "where it's up to" + error memory, not a transcript.

Parsing: a line scanner suffices for v0. mdast + remark-gfm only if it bites. Parse for reading; for the rare programmatic write (status flip, checkbox reconcile) use exact-line string patching, never a full AST serialize.
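A minimal sketch of such a line scanner, assuming the exact layout of the example above (field lines at column 0, failure modes indented, subtasks as `- [ ]` / `- [x]`). The type and function names (`Goal`, `Plan`, `parsePlan`) are illustrative, not part of the spec. It records the index of each `status:` line so a later programmatic write can patch that one line.

```typescript
type Status = "open" | "active" | "done" | "cancelled";
const STATUSES: string[] = ["open", "active", "done", "cancelled"];

interface Subtask { done: boolean; text: string }
interface Goal {
  title: string;
  id: string | null;
  status: Status | null;
  statusLine: number | null; // 0-based line index of `status:`, for exact-line patching
  doneWhen: string | null;
  verify: string | null;
  failureModes: string[];
  subtasks: Subtask[];
}
interface Plan { goals: Goal[]; log: string[] }

function parsePlan(text: string): Plan {
  const plan: Plan = { goals: [], log: [] };
  const lines = text.split("\n");
  let goal: Goal | null = null;
  let inLog = false;
  let inFailureModes = false;

  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].replace(/\s+$/, "");
    let m: RegExpMatchArray | null;

    if ((m = line.match(/^## Goal:\s*(.*)$/))) {
      goal = { title: m[1], id: null, status: null, statusLine: null,
               doneWhen: null, verify: null, failureModes: [], subtasks: [] };
      plan.goals.push(goal);
      inLog = false;
      inFailureModes = false;
      continue;
    }
    if (/^#{1,2}\s/.test(line)) {
      // Any other heading closes the current goal; `## Log` opens the log.
      goal = null;
      inFailureModes = false;
      inLog = /^## Log$/.test(line);
      continue;
    }
    if (inLog) {
      if ((m = line.match(/^- (.*)$/))) plan.log.push(m[1]);
      continue;
    }
    if (!goal) continue;

    // Indented "- " lines directly under `failure_modes:` belong to that list.
    if (inFailureModes && (m = line.match(/^\s+- (.*)$/))) {
      goal.failureModes.push(m[1]);
      continue;
    }
    inFailureModes = false;

    if ((m = line.match(/^<!--\s*id:\s*(\S+)\s*-->$/))) goal.id = m[1];
    else if ((m = line.match(/^status:\s*(\S+)/))) {
      goal.status = STATUSES.indexOf(m[1]) !== -1 ? (m[1] as Status) : null;
      goal.statusLine = i;
    }
    else if ((m = line.match(/^done_when:\s*(.*)$/))) goal.doneWhen = m[1];
    else if ((m = line.match(/^verify:\s*(.*)$/))) goal.verify = m[1];
    else if (/^failure_modes:$/.test(line)) inFailureModes = true;
    else if ((m = line.match(/^- \[( |x)\] (.*)$/))) {
      goal.subtasks.push({ done: m[1] === "x", text: m[2] });
    }
  }
  return plan;
}
```

Unknown lines are skipped rather than rejected, which fits the guidance-over-guardrails stance: a hand-edited plan.md that drifts slightly from the format still parses.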


5. Tools

CompleteGoal is the one blessed path (it runs the check and records it). Everything else — create goal, edit plan, tick subtasks, append to log — is plain Edit, guided by the reminder.

CompleteGoal(id, evidence, paths[]) — the sign-off check

  1. Read done_when + verify + failure_modes for the goal from plan.md (git diff is the tamper-check, D7).
  2. Evidence must point to durable artifacts the read-only judge can inspect (saved logs, committed diffs, files). Ephemeral claims fail stage 2.
  3. Stage 1 — deterministic. If verify exists, run it shell-free, capture exit + output tail. Non-zero → reject immediately, return the tail. No model call spent.
  4. Stage 2 — oracle. Spawn the read-only judge (D11 default = current model; §9) with the criterion, failure modes, evidence, and verify result; it inspects the repo and checks the verify command was not gamed against the named failure modes.
  5. Verdict: accept → string-patch status: done, append a ## Log line. reject → status stays active, append missing[] to ## Log, return missing.
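Stage 1 might look like the sketch below, which uses Node's `spawnSync` (the synchronous sibling of `spawn`) with `shell: false`. Two assumptions are worth flagging. First, a shell-free runner cannot interpret `&&`, so this sketch splits the `verify:` string on `&&` itself and stops at the first failing command. Second, arguments are split on whitespace with no quoting support. `splitVerify`, `runVerify`, the timeout, and the tail length are all hypothetical.

```typescript
import { spawnSync } from "node:child_process";

interface Stage1Result { ok: boolean; exitCode: number | null; tail: string }

// "a x && b y" -> [["a","x"],["b","y"]]. Naive: no quotes, globs, pipes, or env expansion.
function splitVerify(verify: string): string[][] {
  return verify
    .split("&&")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => part.split(/\s+/));
}

function runVerify(verify: string, cwd: string, timeoutMs = 120000, tailChars = 2000): Stage1Result {
  let output = "";
  for (const [cmd, ...args] of splitVerify(verify)) {
    // shell:false means argv is passed as-is; redirection and noclobber never apply.
    const r = spawnSync(cmd, args, { cwd, shell: false, encoding: "utf8", timeout: timeoutMs });
    output += (r.stdout || "") + (r.stderr || "");
    if (r.error || r.status !== 0) {
      if (r.error) output += String(r.error);
      // Non-zero exit, timeout, or spawn failure: reject without spending a model call.
      return { ok: false, exitCode: r.status, tail: output.slice(-tailChars) };
    }
  }
  return { ok: true, exitCode: 0, tail: output.slice(-tailChars) };
}
```

A rejected result carries the output tail so it can be returned to the agent directly, matching step 3.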

CancelGoal(id, reason) — optional

open/active → cancelled is not a sign-off, so it skips the check. It is a tool only so that a ## Log line is guaranteed to land.
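A sketch of what CancelGoal could do with exact-line string patching, as §4 recommends for programmatic writes. The helper names and the wording of the log line are assumptions, and the timestamp is passed in so the functions stay pure. Only the goal's `status:` line is replaced and one line is inserted at the end of `## Log`; every other byte of the file is left alone.

```typescript
type GoalStatus = "open" | "active" | "done" | "cancelled";

function setGoalStatus(text: string, id: string, next: GoalStatus): string {
  const lines = text.split("\n");
  const marker = "<!-- id: " + id + " -->";
  let idLine = -1;
  for (let i = 0; i < lines.length; i++) {
    if (lines[i].trim() === marker) { idLine = i; break; }
  }
  if (idLine === -1) throw new Error("goal not found: " + id);
  for (let i = idLine + 1; i < lines.length; i++) {
    if (/^#{1,2}\s/.test(lines[i])) break;       // reached the next heading
    if (/^status:\s/.test(lines[i])) {
      lines[i] = "status: " + next;              // touch exactly one line
      return lines.join("\n");
    }
  }
  throw new Error("no status line for goal: " + id);
}

function appendLog(text: string, entry: string): string {
  const lines = text.split("\n");
  const logAt = lines.indexOf("## Log");
  if (logAt === -1) throw new Error("no ## Log section");
  // Insert after the last existing entry so earlier lines are never rewritten.
  let insertAt = logAt + 1;
  for (let i = logAt + 1; i < lines.length && !/^#{1,2}\s/.test(lines[i]); i++) {
    if (/^- /.test(lines[i])) insertAt = i + 1;
  }
  lines.splice(insertAt, 0, "- " + entry);
  return lines.join("\n");
}

// `stamp` follows the log's "YYYY-MM-DD HH:MM" style from the example in §4.
function cancelGoal(text: string, id: string, reason: string, stamp: string): string {
  return appendLog(setGoalStatus(text, id, "cancelled"), stamp + "  " + id + " cancelled: " + reason);
}
```

The same two helpers would serve the accept branch of CompleteGoal, with `"done"` in place of `"cancelled"`.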


6. Guiding sign-off (no hard gate)

Per D3, there is no pre-tool-use interceptor blocking status: done. Sign-off is guided, not gated:

  • the reminder (§8a) tells the agent to complete a goal through CompleteGoal, not by hand-editing status;
  • CompleteGoal is the obvious, blessed path that runs the check and writes the log line;
  • the widget (§7) can flag a goal whose status: done has no corresponding ## Log sign-off line — visibility, not a block;
  • plan.md is git-tracked, so any hand-tick shows in the diff.

The agent can bypass it. The bet — borne out by how the other extensions actually run — is that a clear form plus a standing reminder makes the blessed path the path taken, and visibility catches the rare bypass.
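The widget's visibility check could be as small as the sketch below. It assumes a sign-off log line contains the goal id followed by the words "signed off", as in the §4 example log; that matching rule and the names `GoalRef` and `unsignedDoneGoals` are assumptions, not settled format.

```typescript
interface GoalRef { id: string; status: string }

// Escape regex metacharacters so an id is matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Ids of goals marked done with no matching sign-off line in the log.
function unsignedDoneGoals(goals: GoalRef[], log: string[]): string[] {
  return goals
    .filter((g) => g.status === "done")
    .filter((g) => {
      // Require whitespace (or line start) before the id so "a-1" does not match "xa-1".
      const re = new RegExp("(^|\\s)" + escapeRegExp(g.id) + "\\s+signed off");
      return !log.some((line) => re.test(line));
    })
    .map((g) => g.id);
}
```

The result is only ever used to flag goals in the widget. Nothing here blocks an edit.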


7. Commands

  • /plan <desc>: enter plan mode (§3a): read-only explore → draft goals → review. Ready offers the compact choice, then starts execution.
  • /plan (no args) — render the task-list widget: each goal with status + its subtask checkboxes + "N done hidden"; flag any done goal lacking a sign-off log line; offer archive-completed and cancel-goal.
  • /goal — (re)start the loop on the current plan.
  • /goal pause | resume | clear | status — loop controls.
  • /subgoal <text> — append an acceptance criterion to a goal mid-loop. Optional.
  • /judge model <ref> — set the sign-off judge model (default: current model; set a cross-vendor ref here for stronger independence, §9).

8. Hooks / lifecycle

  • before_agent_start — parse plan.md; inject a fixed-shape summary (active goals + focus + last log line) as a late user-role message. This injection is what carries goals through compaction.
  • reminder — §8a.
  • pre-compact — flush state to plan.md before compaction.

(No pre-tool-use gate — D3.)

8a. The reminder (typed; what it says)

Fires when a goal is active and there have been N file-modifying turns since the last plan.md update. One <system-reminder> covering both task upkeep and goal progress:

  • task — tick completed subtask checkboxes; add new ones discovered.
  • log — append one short line to ## Log (append, don't rewrite).
  • goal — if a goal's evidence is in, sign it off via CompleteGoal — don't hand-tick status: done.
  • autonomy — keep working toward an active goal; don't stop to ask unless genuinely blocked.

The reminder is both the housekeeping and the autonomy engine and, with no hard gate, the main thing that gets the process followed. Keep the wording stable so it doesn't thrash the cache.
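The trigger can be sketched as a pure counter over turns. Assumptions: a turn is described by the list of paths it modified, the counter resets once the reminder fires (so it recurs every N turns rather than every turn), and the reminder text is a constant assembled from the four bullets above. `observeTurn`, `ReminderState`, and the exact wording of `REMINDER` are hypothetical.

```typescript
// Constant text: the same bytes on every firing.
const REMINDER: string = [
  "<system-reminder>",
  "task: tick completed subtask checkboxes; add new ones discovered.",
  "log: append one short line to ## Log (append, don't rewrite).",
  "goal: if a goal's evidence is in, sign it off via CompleteGoal; don't hand-tick status: done.",
  "autonomy: keep working toward an active goal; don't stop to ask unless genuinely blocked.",
  "</system-reminder>",
].join("\n");

interface ReminderState { turnsSincePlanUpdate: number }

function observeTurn(
  state: ReminderState,
  modifiedPaths: string[],
  hasActiveGoal: boolean,
  n: number,
  planPath = "plan.md"
): { state: ReminderState; fire: boolean } {
  // Any edit to plan.md counts as upkeep and resets the counter.
  if (modifiedPaths.indexOf(planPath) !== -1) {
    return { state: { turnsSincePlanUpdate: 0 }, fire: false };
  }
  // Only file-modifying turns count; read-only turns leave the counter alone.
  const turns = state.turnsSincePlanUpdate + (modifiedPaths.length > 0 ? 1 : 0);
  const fire = hasActiveGoal && turns >= n;
  return { state: { turnsSincePlanUpdate: fire ? 0 : turns }, fire };
}
```

When `fire` is true the caller would emit `REMINDER` unchanged.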


9. Judges

| | Loop judge | Sign-off judge (stage 2) |
| --- | --- | --- |
| Drives | continue / pause each turn | accept / reject a sign-off |
| Cost | cheap, every turn | costly, once per goal |
| Reads | the agent's last response (~4 KB) | the repo, independently |
| Transport | one small model call (D10) | read-only oracle subprocess |
| On failure | fail-open → continue; budget is the backstop | fail-closed → goal stays active |
| Foolable? | yes — asserted "done" passes; bounded by budget | hard: re-reads artifacts + runs verify |

Sign-off judge: model choice (D11)

  • Default: the session's current model. Guaranteed authorized and capable, because you're already running it. Auto-picking "strongest on provider" (oracle-style) is rejected as the default — those lists carry dead, weak, and unauthorized entries.
  • Most of the value is model-independent. The read-only judge re-derives from artifacts: does the evidence match the repo, is the verify tautological, is each failure mode actually ruled out. Any capable model does that regardless of family.
  • Cross-vendor is the stronger-independence setting (/judge model), for the residual shared-reasoning-error class, when you have a known-good alternative. Mirror the oracle's curated provider list for that override menu; don't auto-select from it.

Transport (oracle pattern, copied)

  • Shell-free spawn. spawn(command, argsArray), no shell:true; capture stdout via pipe and parse. This avoids the noclobber/cropping pain of pi -p … > out.json under zsh. ~40 lines.
  • Read-only toolset. read / grep / find / ls, optional non-mutating bash. Separate process = fresh context, no anchoring — the independence you reliably get even from the same model.
  • Verdict contract. Oracle returns prose by default; impose VERDICT: accept|reject + missing: in the prompt and parse that block.
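A sketch of parsing that verdict block, fail-closed as the judges table requires. The exact block layout is an assumption: a `VERDICT: accept|reject` line, optionally followed by `missing:` and `- ` items. `parseVerdict` and `Verdict` are hypothetical names. Output with no verdict line is treated as a reject, so the goal stays active.

```typescript
interface Verdict { accept: boolean; missing: string[] }

function parseVerdict(output: string): Verdict {
  const lines = output.split("\n").map((l) => l.trim());

  // The last VERDICT line wins, in case the prose quotes the format earlier.
  let verdictAt = -1;
  let accept = false;
  for (let i = 0; i < lines.length; i++) {
    const m = lines[i].match(/^VERDICT:\s*(accept|reject)$/i);
    if (m) {
      verdictAt = i;
      accept = m[1].toLowerCase() === "accept";
    }
  }
  // Fail-closed: unparseable output never signs a goal off.
  if (verdictAt === -1) return { accept: false, missing: ["judge output had no VERDICT line"] };

  const missing: string[] = [];
  let inMissing = false;
  for (let i = verdictAt + 1; i < lines.length; i++) {
    const head = lines[i].match(/^missing:\s*(.*)$/i);
    if (head) {
      inMissing = true;
      if (head[1]) missing.push(head[1]); // allow a single inline item
      continue;
    }
    const item = lines[i].match(/^- (.*)$/);
    if (inMissing && item) missing.push(item[1]);
    else if (lines[i] !== "") inMissing = false;
  }
  // On accept, any stray missing items are ignored (a simplification in this sketch).
  return { accept, missing: accept ? [] : missing };
}
```

The `missing` array is what CompleteGoal would append to ## Log and return to the agent on reject.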

10. prompts.tsx

All model-facing text in one file, in flow order (drafted separately):

  1. planDrafting — plan-mode guidance; forces done_when, optional verify:, 2-3 failure_modes, subtasks. Human approves it.
  2. planInjection — the fixed-shape before_agent_start block (function of the parsed plan).
  3. reminder — the typed nudge (§8a).
  4. continuation — Hermes-style "keep going" user-role message.
  5. loopJudge — conservative, strict JSON {done, reason}.
  6. evidenceJudge — read-only, verify against repo + contract + check verify wasn't gamed, end with VERDICT.

Prompts 5 and 6 sit adjacent so the cheap-foolable vs must-not-be-fooled contrast is on one screen.
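On the consuming side, the loop judge's strict JSON reply can be parsed with the opposite failure policy from the sign-off verdict: anything malformed means "keep going". A minimal sketch, with `parseLoopJudge` and `LoopVerdict` as hypothetical names:

```typescript
interface LoopVerdict { done: boolean; reason: string }

function parseLoopJudge(raw: string): LoopVerdict {
  try {
    const v: unknown = JSON.parse(raw);
    if (typeof v === "object" && v !== null) {
      const o = v as Record<string, unknown>;
      // Strict: both fields present with the right types, or the reply is rejected.
      if (typeof o.done === "boolean" && typeof o.reason === "string") {
        return { done: o.done, reason: o.reason };
      }
    }
  } catch (_err) {
    // Not JSON at all; fall through to the fail-open default.
  }
  // Fail-open: a bad reply continues the loop; the iteration budget bounds the cost.
  return { done: false, reason: "unparseable judge output; continuing" };
}
```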


11. KV-cache hygiene

  • Inject as a late user-role message, never a system-prompt mutation, so the cached prompt prefix stays valid (a long goal then costs no more than the same number of normal turns).
  • Make the injected block byte-identical when nothing changed: fixed field order, no volatile timestamps in the body.
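One way to get a byte-identical block is to render it as a pure function of the parsed plan, with a fixed field order and nothing read from the clock. The sketch below assumes goals are listed in file order and that the block's heading and field labels are free to choose; `GoalSummary`, `renderInjection`, and the labels are hypothetical. The last log line carries its own timestamp, but that only changes when the log does.

```typescript
interface GoalSummary { id: string; title: string; status: string; openSubtasks: number }

function renderInjection(goals: GoalSummary[], focusId: string | null, lastLog: string | null): string {
  const lines: string[] = ["plan.md (active goals):"];
  // File order is stable across turns, so no sorting is needed.
  for (const g of goals) {
    if (g.status !== "active") continue;
    lines.push("- " + g.id + " | " + g.title + " | open subtasks: " + g.openSubtasks);
  }
  // Fixed trailing fields, always present, so the shape never shifts.
  lines.push("focus: " + (focusId === null ? "none" : focusId));
  lines.push("last log: " + (lastLog === null ? "none" : lastLog));
  return lines.join("\n");
}
```

Because the function takes no hidden inputs, two calls over an unchanged plan.md yield the same string, which is the property the cache depends on.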

12. Dependencies and what to copy

  • No hard dependency on pi-subagents or the oracle extension. Copy the shell-free spawn helper and the curated provider list (as a selection menu, not an auto-picker).
  • Markdown: line scanner first; mdast + remark-gfm only if needed.
  • Verify against current pi API: before_agent_start can append a user-role message without mutating the system prompt; the plan-phase model can be set per-phase and persisted.

13. Risks / open questions

  • Same-model sign-off judge → correlated blind spots (the D11 tradeoff). Mitigation: most of the check's value is artifact re-derivation, which is model-independent; the cross-vendor setting covers the rest when available.
  • No hard gate (D3) — the agent can hand-tick status: done and skip the check. Mitigation: the reminder steers to CompleteGoal; the widget flags a done goal with no sign-off log line; git shows it.
  • Contract tampering (D7) — editable plan.md means done_when/failure_modes can be softened pre-sign-off. Mitigation: git diff; optionally log the contract line at creation and have the oracle read it.
  • Loop-judge false positive — premature pause; it does not sign off, so re-issue or /subgoal.
  • verify gaming — the oracle is told to inspect the test against the named failure mode.
  • ## Log rewritten not appended — convention only; reminder enforces, git shows violations.
  • Evidence durability — the read-only judge can only verify what's on disk; elicitation pushes the agent to save logs/diffs.

14. Build order

Each step independently testable; model calls enter late.

  1. plan.md format + line parser (incl. <!-- id --> and ## Log) + /plan task-list widget. Pure file, no model calls.
  2. Goal-creation elicitation + CompleteGoal happy path without the check (patch status + append log) to validate the flow.
  3. Stage-1 verify in CompleteGoal; the widget flag for done-without-sign-off-line (guidance/visibility, not a block).
  4. Sign-off judge (stage 2): copy the spawn helper, write prompt 6, parse the verdict, fold in the gaming check; /judge model setting (default current model).
  5. before_agent_start injection (cache-safe) + the reminder (§8a).
  6. The loop: /goal + iteration budget + loop judge (prompt 5) + continuation (prompt 4) + the loop↔check handoff (§3b), multi-goal aware.
  7. Plan mode (§3a): /plan <desc> read-only draft → review → compact choice → hand-off. Plan-phase model selection + stickiness (D12). (Until built, create goals by direct plan.md edit.)
  8. Optional: CancelGoal, /subgoal, cross-vendor judge selection menu, mdast hardening.

prompts.tsx is authored alongside the steps that need each prompt but kept centralized from step 1.