Show tool results in Telegram trace

2026-06-27 15:16:19 +08:00 · 2026-04-24 22:24:12 +08:00
parent 3ce880d5bc
commit 4677aea56b
8 changed files with 365 additions and 22 deletions
@@ -5,6 +5,7 @@
 - `[Runtime]` Fixed Telegram slash-command routing so `/stop`, `/status`, `/model`, and other local commands receive the real Telegram message and pi context instead of the wrong argument positions. Added stale-abort recovery so if pi is already idle but the bridge still thinks an aborted Telegram turn is active, the next Telegram message clears the stale local state and dispatch resumes. Impact: Telegram no longer gets permanently wedged while local `!` commands still work.
 - `[Command Menu]` `/start` now refreshes Telegram's bot command menu with bridge-local controls plus every Telegram-valid pi prompt, skill, and extension command from `pi.getCommands()`, including aliases such as `/p` when available. Invalid Bot API names are filtered and the menu is capped at Telegram's 100-command limit. Impact: Telegram's command picker better matches commands that work from the DM.
 - `[Trace + Shell Output]` Compact trace mode now marks shortened thinking/tool blocks explicitly with a “use /trace for full” notice, full mode keeps the complete final trace, and direct `!` shell replies are delivered through chunked Telegram-safe markdown instead of silently slicing off the tail. Impact: trace and shell output truncation is visible instead of hidden, and verbose output remains available.
+- `[Trace Results]` Telegram trace rendering now includes `toolResult` messages instead of only assistant tool-call blocks. Bash results render the actual output in code blocks and hydrate saved `fullOutputPath` files when present; unknown assistant blocks fall back to visible JSON in trace modes. Impact: `/trace full` no longer shows a bash call without the corresponding output.

 - `[Security]` Removed auto-pair-on-first-DM behavior. The bot now requires `allowedUserId` to be set before polling starts. Configure it via `TELEGRAM_ALLOWED_USER_ID` env var or the updated `/telegram-setup` prompt (which now asks for a numeric user ID after the bot token). The env var takes precedence over the saved config on every session start. Denied senders get an auth error reply; their numeric ID is also logged to the pi TUI as a warning. Breaking change: fresh installs require explicit configuration; existing installs with `allowedUserId` already in `telegram.json` continue to work unchanged.

@@ -206,7 +206,7 @@ The extension streams assistant text previews back to Telegram while pi is gener

 It tries Telegram draft streaming first with `sendMessageDraft`. If that is not supported for your bot, it falls back to `sendMessage` plus `editMessageText`.

-Compact trace mode marks shortened thinking/tool blocks explicitly instead of silently cropping them. Full trace mode keeps the complete final trace content.
+Compact trace mode marks shortened thinking/tool blocks explicitly instead of silently cropping them. Full trace mode keeps complete trace content, includes tool results, and uses a tool's saved full-output file when pi provides one.

 Direct `!` shell command replies are delivered in full across Telegram-safe chunks instead of being cut to the first screenful.
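
The compact/full distinction described above can be sketched as follows. This is a minimal illustration, not the bridge's implementation: `truncateForSketch` and `renderTraceBody` are hypothetical names, and the real `truncateDisplayText` helper may cut text differently. Only the 500-character limit and the notice string are taken from this commit.

```typescript
// Values copied from the rendering module shown in this commit.
const COMPACT_TRUNCATE = 500;
const COMPACT_TRUNCATION_NOTICE = "[compact trace truncated; use /trace for full]";

type DisplayMode = "text" | "compact" | "full";

// Hypothetical stand-in for truncateDisplayText, whose body is not part of this diff.
function truncateForSketch(
  text: string,
  limit: number,
): { text: string; truncated: boolean } {
  return text.length > limit
    ? { text: text.slice(0, limit), truncated: true }
    : { text, truncated: false };
}

// Returns undefined in text mode, a marked excerpt in compact mode,
// and the untouched input in full mode.
function renderTraceBody(text: string, mode: DisplayMode): string | undefined {
  if (mode === "text") return undefined;
  const result =
    mode === "compact"
      ? truncateForSketch(text, COMPACT_TRUNCATE)
      : { text, truncated: false };
  return result.truncated
    ? `${result.text}\n${COMPACT_TRUNCATION_NOTICE}`
    : result.text;
}
```

The point of the sketch is that shortening and the notice always travel together, so a reader of a compact trace can tell when content was cut.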

@@ -109,9 +109,9 @@ Telegram trace rendering uses three session-local display modes:

 - `text`: hide thinking and tool blocks
 - `compact`: show shortened thinking/tool blocks and mark any truncation explicitly with a “use /trace for full” notice
-- `full`: show the complete final trace content
+- `full`: show complete final trace content, including tool results

-During streaming, trace blocks still appear as compact one-line summaries (e.g. `🧠 Thinking...`, `🔧 tool_name`). Final replies use the selected display mode through `/trace` and the status menu helpers.
+During streaming, trace blocks still appear as compact one-line summaries (e.g. `🧠 Thinking...`, `🔧 tool_name`). Final replies use the selected display mode through `/trace` and the status menu helpers. Tool-result messages are rendered as trace blocks instead of being dropped; bash output is shown in code blocks, and saved full-output files are read back for full trace mode when pi provides a `fullOutputPath`.
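
The order in which a tool result's visible output is chosen can be sketched as a small pure function. The names `ToolResultInputs` and `pickToolResultOutput` are hypothetical; the precedence itself (saved full output, then joined text blocks, then a JSON dump) mirrors the `extractToolResultBlock` change in this commit.

```typescript
// Hypothetical input shape for illustration only.
interface ToolResultInputs {
  fullOutput?: string; // contents read back from fullOutputPath, if the read succeeded
  text: string; // text blocks joined from the toolResult message
  fallbackText?: string; // JSON dump of content and details
}

// Prefer the saved full output, then the inline text, then the JSON fallback.
// An empty or whitespace-only full output falls through to the next option.
function pickToolResultOutput({
  fullOutput,
  text,
  fallbackText,
}: ToolResultInputs): string | undefined {
  return fullOutput?.trimEnd() || text || fallbackText;
}
```

Because `||` treats an empty string as missing, a saved file that exists but is empty does not hide the inline text.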

 ### Abort Recovery

@@ -11,6 +11,7 @@ Out: broader queue-policy redesign, non-Telegram pi abort semantics, and new tra
 - R1: A stale aborted Telegram turn cannot block later Telegram prompts forever. Done means: when local Telegram turn state survives after pi is already idle, the next Telegram message recovers the bridge and normal prompt dispatch resumes. VERIFY: targeted runtime regression shows `/stop`, no `agent_end`, then a later Telegram prompt dispatches into pi. If it silently failed, the test would still be stuck waiting for another dispatch.
 - R2: Direct `!` shell replies do not silently crop output. Done means: long stdout/stderr are delivered through chunked markdown replies instead of a hidden `slice(0, 3900)`. VERIFY: targeted regression inspects the emitted shell reply text and confirms the tail of long output is still present.
 - R3: Any compact trace truncation is explicit, and full trace mode stays untruncated. Done means: compact thinking/tool blocks include a visible “use /trace for full” notice when shortened, while full-mode tests still see the complete content. VERIFY: rendering tests assert the truncation notice in compact mode and the original long text in full mode.
+- R5: Telegram trace rendering does not silently drop tool results or unknown assistant blocks. Done means: `toolResult` messages become trace blocks, bash results show their output in code blocks, and unrecognized assistant blocks fall back to JSON instead of disappearing. VERIFY: rendering tests cover bash result formatting and unknown-block fallback; queue/runtime tests cover final tool-result delivery.
 - R4: User-facing docs describe the actual `/trace` behavior and the new recovery/truncation guarantees. Done means: README, architecture doc, and changelog all reflect the shipped behavior. VERIFY: grep/read shows aligned wording in all three docs.

 ## Tasks
@@ -35,6 +36,13 @@ Out: broader queue-policy redesign, non-Telegram pi abort semantics, and new tra
  - likely_fail: runtime changes land without doc updates, so the grep output is missing one of the files.
  - sneaky_fail: docs still describe `/trace` as a simple on/off toggle instead of text/compact/full.
  - UAT: "when I read the docs, they match what the Telegram bot actually does."
+- [x] T4 (R5): Preserve all trace blocks, including tool results.
+  - steps: normalize unknown assistant blocks to JSON fallback, extract `role: "toolResult"` messages into trace blocks, format bash tool results with visible output/details sections, and send tool-result blocks at agent end.
+  - verify: `node --experimental-strip-types --test tests/rendering.test.ts tests/queue.test.ts`
+  - success: tests show bash output text in full trace and tool-result delivery in Telegram runtime.
+  - likely_fail: only tool-call blocks are emitted; runtime test sees no "Tool result" Telegram message.
+  - sneaky_fail: unknown content blocks vanish; fallback test checks they render as JSON.
+  - UAT: "when a bash tool runs, Telegram trace full shows the command and actual output, not only the tool-call JSON."

 ## Context
 - The bridge keeps local queue and active-turn state separate from pi core state, so stale local state can wedge Telegram even when pi is already idle.
@@ -44,6 +52,8 @@ Out: broader queue-policy redesign, non-Telegram pi abort semantics, and new tra
 ## Log
 - Existing tests cover abort-plus-follow-up history, but they did not cover the stale-local-state path where pi is already idle and Telegram still thinks a turn is active.
 - The immediate `/stop` failure had a concrete routing bug too: slash commands were passed into `handleTelegramCommand()` with the wrong argument positions, so Telegram local commands could receive the wrong message/ctx objects while direct `!` shell commands still worked.
+- A later trace-full report showed that assistant tool-call blocks were rendered, but `role: "toolResult"` messages were not extracted into Telegram trace blocks, so the operator saw the bash JSON call without the command output.
+- `node --experimental-strip-types --test --test-force-exit tests/rendering.test.ts tests/queue.test.ts`: 69 tests passed after adding bash tool-result formatting, unknown-block fallback, and saved full-output hydration coverage.

 ## TODO
 - Consider exposing a clearer inline status indicator when the bridge auto-recovers a stale aborted turn.
@@ -681,6 +681,31 @@ export default function (pi: ExtensionAPI) {
    return encoded?.trim() || undefined;
  }

+  function stringifyTraceFallback(value: unknown): string | undefined {
+    if (value === undefined) return undefined;
+    if (typeof value === "string") return value.trim() || undefined;
+    try {
+      return JSON.stringify(value, null, 2)?.trim() || undefined;
+    } catch {
+      return String(value).trim() || undefined;
+    }
+  }
+
+  function extractTextBlocksContent(content: unknown): string {
+    if (typeof content === "string") return content.trim();
+    if (!Array.isArray(content)) return "";
+    return content
+      .map((block) => {
+        if (typeof block !== "object" || block === null) return "";
+        const candidate = block as Record<string, unknown>;
+        return candidate.type === "text" && typeof candidate.text === "string"
+          ? candidate.text
+          : "";
+      })
+      .join("")
+      .trim();
+  }
+
  function normalizeAssistantDisplayBlock(
    block: unknown,
  ): TelegramAssistantDisplayBlock | undefined {
@@ -723,9 +748,53 @@ export default function (pi: ExtensionAPI) {
        ),
      };
    }
+    const fallback = stringifyTraceFallback(candidate);
+    if (fallback) {
+      return {
+        type: "unknown",
+        label: String(candidate.type),
+        text: fallback,
+      };
+    }
    return undefined;
  }

+  function getToolResultFullOutputPath(details: unknown): string | undefined {
+    if (typeof details !== "object" || details === null) return undefined;
+    const path = (details as Record<string, unknown>).fullOutputPath;
+    return typeof path === "string" && path.trim() ? path : undefined;
+  }
+
+  async function readToolResultFullOutput(details: unknown): Promise<string | undefined> {
+    const path = getToolResultFullOutputPath(details);
+    if (!path) return undefined;
+    return readFile(path, "utf8").catch(() => undefined);
+  }
+
+  async function extractToolResultBlock(
+    message: Record<string, unknown>,
+  ): Promise<TelegramAssistantDisplayBlock | undefined> {
+    if (message.role !== "toolResult") return undefined;
+    const toolName =
+      typeof message.toolName === "string" ? message.toolName : undefined;
+    const text = extractTextBlocksContent(message.content);
+    const fullOutput = await readToolResultFullOutput(message.details);
+    const detailsText = stringifyTraceFallback(message.details);
+    const fallbackText = stringifyTraceFallback({
+      content: message.content,
+      details: message.details,
+    });
+    const outputText = fullOutput?.trimEnd() || text || fallbackText;
+    if (!outputText && !detailsText) return undefined;
+    return {
+      type: "tool_result",
+      toolName,
+      text: outputText ?? "",
+      detailsText,
+      isError: message.isError === true,
+    };
+  }
+
  function extractAssistantDisplayBlocks(
    content: unknown,
  ): TelegramAssistantDisplayBlock[] {
@@ -758,18 +827,23 @@ export default function (pi: ExtensionAPI) {
    );
  }

-  function extractAssistantTurn(messages: AgentMessage[]): {
+  async function extractAssistantTurn(messages: AgentMessage[]): Promise<{
    blocks: TelegramAssistantDisplayBlock[];
    text?: string;
    stopReason?: string;
    errorMessage?: string;
-  } {
+  }> {
    const blocks: TelegramAssistantDisplayBlock[] = [];
    let text: string | undefined;
    let stopReason: string | undefined;
    let errorMessage: string | undefined;
    for (const next of messages) {
      const message = next as unknown as Record<string, unknown>;
+      const toolResultBlock = await extractToolResultBlock(message);
+      if (toolResultBlock) {
+        blocks.push(toolResultBlock);
+        continue;
+      }
      if (message.role !== "assistant") continue;
      const nextBlocks = extractAssistantDisplayBlocks(message.content);
      blocks.push(...nextBlocks);
@@ -964,12 +1038,12 @@ export default function (pi: ExtensionAPI) {
    });
  }

-  function extractAssistantSummary(messages: AgentMessage[]): {
+  async function extractAssistantSummary(messages: AgentMessage[]): Promise<{
    blocks: TelegramAssistantDisplayBlock[];
    text?: string;
    stopReason?: string;
    errorMessage?: string;
-  } {
+  }> {
    return extractAssistantTurn(messages);
  }

@@ -2180,7 +2254,7 @@ export default function (pi: ExtensionAPI) {
      telegramTurnDispatchPending = false;
      updateStatus(ctx);
      const assistant = turn
-        ? extractAssistantSummary((event as { messages: AgentMessage[] }).messages)
+        ? await extractAssistantSummary((event as { messages: AgentMessage[] }).messages)
        : { blocks: [], text: undefined, stopReason: undefined, errorMessage: undefined };

      const endPlan = buildTelegramAgentEndPlan({
@@ -2208,8 +2282,13 @@ export default function (pi: ExtensionAPI) {
        return;
      }

-      // Flush any non-text blocks from the final message (single-message turns never trigger onMessageStart)
-      for (const block of pendingNonTextBlocks) {
+      // Flush tool results from the completed transcript plus any non-text blocks
+      // from the final assistant message (single-message turns never trigger onMessageStart).
+      const finalTraceBlocks = [
+        ...assistant.blocks.filter((block) => block.type === "tool_result"),
+        ...pendingNonTextBlocks,
+      ];
+      for (const block of finalTraceBlocks) {
        const msg = renderBlockMessage(block, displayMode);
        if (msg) void sendMarkdownReply(turn.chatId, turn.replyToMessageId, msg);
      }
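
For reference, the edge cases that `stringifyTraceFallback` has to absorb can be exercised with a standalone copy of the helper. The copy below repeats the function body from this diff so it runs outside the extension closure; the sample values are made up for illustration.

```typescript
// Standalone copy of the helper added in this commit.
function stringifyTraceFallback(value: unknown): string | undefined {
  if (value === undefined) return undefined;
  if (typeof value === "string") return value.trim() || undefined;
  try {
    // JSON.stringify returns undefined at runtime for functions and symbols,
    // which is why the optional chain is needed.
    return JSON.stringify(value, null, 2)?.trim() || undefined;
  } catch {
    // Circular structures make JSON.stringify throw; fall back to String().
    return String(value).trim() || undefined;
  }
}

const circular: Record<string, unknown> = {};
circular.self = circular;

const samples = {
  blankString: stringifyTraceFallback("   "), // undefined
  plainObject: stringifyTraceFallback({ output: "ok" }), // pretty-printed JSON
  fn: stringifyTraceFallback(() => {}), // undefined
  circular: stringifyTraceFallback(circular), // "[object Object]"
};
```

The circular case shows the limit of the fallback: the block stays visible in the trace, but its content degrades to a generic string.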
@@ -11,7 +11,14 @@ export type TelegramAssistantDisplayBlock =
  | { type: "text"; text: string }
  | { type: "thinking"; text: string }
  | { type: "tool_call"; name: string; argsText?: string }
-  | { type: "tool_result"; text: string; toolName?: string };
+  | {
+      type: "tool_result";
+      text: string;
+      toolName?: string;
+      detailsText?: string;
+      isError?: boolean;
+    }
+  | { type: "unknown"; label: string; text: string };

 function truncateDisplayText(
  text: string,
@@ -45,6 +52,34 @@ function renderToolArgsMarkdown(argsText: string): string {
 const COMPACT_TRUNCATE = 500;
 const COMPACT_TRUNCATION_NOTICE = "[compact trace truncated; use /trace for full]";

+function renderMarkdownFenceBlock(language: string, text: string): string {
+  const fence = text.includes("```") && !text.includes("~~~") ? "~~~" : "```";
+  return `${fence}${language}\n${text}\n${fence}`;
+}
+
+function normalizeTraceOutputText(text: string): string {
+  return text
+    .replace(/\r\n/g, "\n")
+    .replace(/\r(?!\n)/g, "\n")
+    .replace(/\x1B\[[0-?]*[ -/]*[@-~]/g, "");
+}
+
+function renderTracedTextSection(
+  label: string,
+  text: string,
+  mode: DisplayMode,
+  language = "text",
+): string | undefined {
+  const normalized = normalizeTraceOutputText(text).trimEnd();
+  if (!normalized) return undefined;
+  const truncated =
+    mode === "compact"
+      ? truncateDisplayText(normalized, COMPACT_TRUNCATE)
+      : { text: normalized, truncated: false };
+  const notice = truncated.truncated ? `\n${COMPACT_TRUNCATION_NOTICE}` : "";
+  return `**${label}**\n${renderMarkdownFenceBlock(language, truncated.text)}${notice}`;
+}
+
 export function renderBlockMessage(
  block: TelegramAssistantDisplayBlock,
  mode: DisplayMode,
@@ -75,17 +110,29 @@ export function renderBlockMessage(

  if (block.type === "tool_result") {
    if (mode === "text") return undefined;
-    const trimmed = block.text.trim();
-    if (!trimmed) return undefined;
-    const truncated =
-      mode === "compact"
-        ? truncateDisplayText(trimmed, COMPACT_TRUNCATE)
-        : { text: trimmed, truncated: false };
-    const content = truncated.truncated
-      ? `${truncated.text}\n${COMPACT_TRUNCATION_NOTICE}`
-      : truncated.text;
-    const header = block.toolName ? `**Tool result** \`${block.toolName}\`` : "**Tool result**";
-    return `${header}\n${renderMarkdownQuote(content)}`;
+    const sections: string[] = [];
+    const header = block.toolName
+      ? `**Tool result** \`${block.toolName}\`${block.isError ? " (error)" : ""}`
+      : `**Tool result**${block.isError ? " (error)" : ""}`;
+    sections.push(header);
+    const output = renderTracedTextSection(
+      block.toolName === "bash" ? "output" : "result",
+      block.text,
+      mode,
+      "text",
+    );
+    if (output) sections.push(output);
+    const details = block.detailsText
+      ? renderTracedTextSection("details", block.detailsText, mode, "json")
+      : undefined;
+    if (details) sections.push(details);
+    return sections.length > 1 ? sections.join("\n\n") : undefined;
+  }
+
+  if (block.type === "unknown") {
+    if (mode === "text") return undefined;
+    const content = renderTracedTextSection("content", block.text, mode, "json");
+    return content ? `**Trace block** \`${block.label}\`\n\n${content}` : undefined;
  }
 }
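
The output normalization above can be checked in isolation, and the fence choice has one edge case worth noting: when the text contains both a triple-backtick run and a triple-tilde run, `renderMarkdownFenceBlock` still picks the backtick fence. The sketch below copies `normalizeTraceOutputText` from this diff and adds `pickBacktickFence`, a hypothetical alternative that is not part of the commit. It relies on the CommonMark rule that a fence may be longer than three characters; whether the bridge's Telegram markdown conversion accepts longer fences is an assumption that has not been verified here.

```typescript
// Copy of the normalizer from this commit: unify line endings and strip ANSI CSI sequences.
function normalizeTraceOutputText(text: string): string {
  return text
    .replace(/\r\n/g, "\n")
    .replace(/\r(?!\n)/g, "\n")
    .replace(/\x1B\[[0-?]*[ -/]*[@-~]/g, "");
}

// Hypothetical alternative: choose a backtick fence one character longer than
// the longest backtick run in the text, with a minimum length of three.
function pickBacktickFence(text: string): string {
  const runs = text.match(/`+/g) ?? [];
  const longest = runs.reduce((max, run) => Math.max(max, run.length), 0);
  return "`".repeat(Math.max(3, longest + 1));
}

// Wraps already-normalized text in a fence that cannot collide with its content.
function fenceForSketch(language: string, text: string): string {
  const fence = pickBacktickFence(text);
  return `${fence}${language}\n${text}\n${fence}`;
}
```

With this approach the tilde fallback is unnecessary, at the cost of emitting fences of variable length.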

@@ -1308,6 +1308,185 @@ test("Extension runtime finalizes a drafted preview into the final Telegram repl
  }
 });

+test("Extension runtime sends toolResult output blocks in Telegram trace", async () => {
+  const agentDir = join(homedir(), ".pi", "agent");
+  const configPath = join(agentDir, "telegram.json");
+  const previousConfig = await readFile(configPath, "utf8").catch(
+    () => undefined,
+  );
+  const handlers = new Map<
+    string,
+    (event: unknown, ctx: unknown) => Promise<unknown>
+  >();
+  const commands = new Map<
+    string,
+    { handler: (args: string, ctx: unknown) => Promise<void> }
+  >();
+  let resolveDispatch: (() => void) | undefined;
+  const dispatched = new Promise<void>((resolve) => {
+    resolveDispatch = resolve;
+  });
+  const sentTexts: string[] = [];
+  const pi = {
+    on: (
+      event: string,
+      handler: (event: unknown, ctx: unknown) => Promise<unknown>,
+    ) => {
+      handlers.set(event, handler);
+    },
+    registerCommand: (
+      name: string,
+      definition: { handler: (args: string, ctx: unknown) => Promise<void> },
+    ) => {
+      commands.set(name, definition);
+    },
+    registerTool: () => {},
+    sendUserMessage: () => {
+      resolveDispatch?.();
+    },
+    getThinkingLevel: () => "medium",
+  } as never;
+  const originalFetch = globalThis.fetch;
+  let getUpdatesCalls = 0;
+  const fullOutputPath = join("/tmp", "pi-telegram-full-output-test.txt");
+  globalThis.fetch = async (input, init) => {
+    const url = typeof input === "string" ? input : input.toString();
+    const method = url.split("/").at(-1) ?? "";
+    const body =
+      typeof init?.body === "string"
+        ? (JSON.parse(init.body) as Record<string, unknown>)
+        : undefined;
+    if (method === "deleteWebhook") {
+      return { json: async () => ({ ok: true, result: true }) } as Response;
+    }
+    if (method === "getUpdates") {
+      getUpdatesCalls += 1;
+      if (getUpdatesCalls === 1) {
+        return {
+          json: async () => ({
+            ok: true,
+            result: [
+              {
+                _: "other",
+                update_id: 1,
+                message: {
+                  message_id: 7,
+                  chat: { id: 99, type: "private" },
+                  from: { id: 77, is_bot: false, first_name: "Test" },
+                  text: "run status",
+                },
+              },
+            ],
+          }),
+        } as Response;
+      }
+      throw new DOMException("stop", "AbortError");
+    }
+    if (method === "sendMessageDraft") {
+      return { json: async () => ({ ok: true, result: true }) } as Response;
+    }
+    if (method === "sendMessage") {
+      sentTexts.push(String(body?.text ?? ""));
+      return {
+        json: async () => ({
+          ok: true,
+          result: { message_id: 100 + sentTexts.length },
+        }),
+      } as Response;
+    }
+    if (method === "sendChatAction") {
+      return { json: async () => ({ ok: true, result: true }) } as Response;
+    }
+    if (method === "editMessageText") {
+      return { json: async () => ({ ok: true, result: true }) } as Response;
+    }
+    throw new Error(`Unexpected Telegram API method: ${method}`);
+  };
+  try {
+    await mkdir(agentDir, { recursive: true });
+    await writeFile(
+      configPath,
+      JSON.stringify(
+        { botToken: "123:abc", allowedUserId: 77, lastUpdateId: 0 },
+        null,
+        "\t",
+      ) + "\n",
+      "utf8",
+    );
+    telegramExtension(pi);
+    const ctx = {
+      hasUI: true,
+      model: undefined,
+      signal: undefined,
+      ui: {
+        theme: {
+          fg: (_token: string, text: string) => text,
+        },
+        setStatus: () => {},
+        notify: () => {},
+      },
+      isIdle: () => true,
+      hasPendingMessages: () => false,
+      abort: () => {},
+      getContextUsage: () => undefined,
+    } as never;
+    await handlers.get("session_start")?.({}, ctx);
+    await commands.get("telegram-connect")?.handler("", ctx);
+    await dispatched;
+    await handlers.get("agent_start")?.({}, ctx);
+    await writeFile(fullOutputPath, "full output from saved file\nline 2", "utf8");
+    await handlers.get("agent_end")?.(
+      {
+        messages: [
+          {
+            role: "assistant",
+            content: [
+              {
+                type: "toolCall",
+                id: "call_1",
+                name: "bash",
+                arguments: { command: "printf 'visible output\\n'" },
+              },
+            ],
+          },
+          {
+            role: "toolResult",
+            toolCallId: "call_1",
+            toolName: "bash",
+            content: [{ type: "text", text: "tail only\nexit code: 0" }],
+            details: { fullOutputPath },
+            isError: false,
+          },
+          {
+            role: "assistant",
+            content: [{ type: "text", text: "Final answer" }],
+          },
+        ],
+      },
+      ctx,
+    );
+    assert.equal(
+      sentTexts.some(
+        (text) =>
+          text.includes("Tool result") &&
+          text.includes("bash") &&
+          text.includes("full output from saved file") &&
+          !text.includes("tail only"),
+      ),
+      true,
+    );
+    await handlers.get("session_shutdown")?.({}, ctx);
+  } finally {
+    globalThis.fetch = originalFetch;
+    await rm(fullOutputPath, { force: true });
+    if (previousConfig === undefined) {
+      await rm(configPath, { force: true });
+    } else {
+      await writeFile(configPath, previousConfig, "utf8");
+    }
+  }
+});
+
 test("Extension runtime carries queued follow-ups into history after an aborted turn", async () => {
  const agentDir = join(homedir(), ".pi", "agent");
  const configPath = join(agentDir, "telegram.json");
@@ -66,6 +66,21 @@ test("renderBlockMessage renders tool_result and hides it in text mode", () => {
  assert.ok(full.includes("file contents here"));
 });

+test("renderBlockMessage formats bash tool_result output as visible code", () => {
+  const block = {
+    type: "tool_result" as const,
+    text: "line 1\rline 2\nCommand exited with code 0",
+    toolName: "bash",
+    detailsText: '{\n  "fullOutputPath": "/tmp/full-output.txt"\n}',
+  };
+  const result = renderBlockMessage(block, "full")!;
+  assert.match(result, /\*\*Tool result\*\* `bash`/);
+  assert.match(result, /\*\*output\*\*/);
+  assert.match(result, /```text\nline 1\nline 2\nCommand exited with code 0\n```/);
+  assert.match(result, /\*\*details\*\*/);
+  assert.match(result, /fullOutputPath/);
+});
+
 test("renderBlockMessage truncates tool_result in compact mode", () => {
  const block = { type: "tool_result" as const, text: "x".repeat(600) };
  const compact = renderBlockMessage(block, "compact")!;
@@ -75,6 +90,18 @@ test("renderBlockMessage truncates tool_result in compact mode", () => {
  assert.ok(full.includes("x".repeat(600)));
 });

+test("renderBlockMessage renders unknown trace blocks instead of dropping them", () => {
+  const block = {
+    type: "unknown" as const,
+    label: "server_tool_result",
+    text: '{\n  "output": "visible fallback"\n}',
+  };
+  const result = renderBlockMessage(block, "full")!;
+  assert.match(result, /\*\*Trace block\*\* `server_tool_result`/);
+  assert.match(result, /visible fallback/);
+  assert.equal(renderBlockMessage(block, "text"), undefined);
+});
+
 test("renderBlockMessage marks truncated tool_call args in compact mode", () => {
  const block = {
    type: "tool_call" as const,