Stop fetch retry from killing the polling loop

Two real bugs surfaced after the original retry helper:

1. Healthy long-polls were tripping our 15s per-attempt timeout. The getUpdates request asks Telegram for a 30s server-side long-poll, so our internal timeout aborted every healthy connection and turned it into ABORT_ERR -> retry -> exhausted -> "disconnected", with no auto-reconnect. The fix: long-poll bypasses the retry helper and uses a 60s per-attempt timeout, since the poll loop already retries by re-entering after sleep().

2. Our own internal AbortController timeout produced a DOMException AbortError indistinguishable from a caller-abort. The poll loop's shouldStopTelegramPolling treated that as "user wants to stop" and exited. Now fetchWithRetry normalizes its own timeout into a tagged Error with code ATTEMPT_TIMEOUT, so only real caller-aborts surface as AbortError upstream.

Also: per-attempt timeout default dropped 15s -> 5s, retry budget dropped from [500, 2000] to [500] (so 2 attempts, not 3) for outbound sends, since they serialize and a long retry tail makes the bridge feel hung.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
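
For readers skimming the message, here is a minimal sketch of the normalization described in point 2. It is not the code from this commit. Only the ATTEMPT_TIMEOUT code, the 5s default, and the [500] retry budget come from the message above. Every name here (AttemptTimeoutError, normalizeAttemptError, attemptWithTimeout, fetchWithRetrySketch, the `run` callback, the option names) is hypothetical and chosen for illustration.

```typescript
// Sketch only: names and signatures are assumptions, not this commit's code.

/** Tagged error for our own per-attempt timeout (code from the commit message). */
class AttemptTimeoutError extends Error {
  readonly code = "ATTEMPT_TIMEOUT";
  constructor(timeoutMs: number) {
    super(`attempt timed out after ${timeoutMs}ms`);
    this.name = "AttemptTimeoutError";
  }
}

/** True for any error object whose name is "AbortError" (DOMException included). */
function isAbortError(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { name?: unknown }).name === "AbortError"
  );
}

/**
 * An AbortError caused by our own timer becomes ATTEMPT_TIMEOUT.
 * Everything else, including a real caller-abort, passes through unchanged.
 */
function normalizeAttemptError(
  error: unknown,
  timedOut: boolean,
  callerAborted: boolean,
  timeoutMs: number,
): unknown {
  if (isAbortError(error) && timedOut && !callerAborted) {
    return new AttemptTimeoutError(timeoutMs);
  }
  return error;
}

/** Run one attempt under an internal timeout, linked to an optional caller signal. */
async function attemptWithTimeout<T>(
  run: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
  callerSignal?: AbortSignal,
): Promise<T> {
  const controller = new AbortController();
  let timedOut = false;
  const onCallerAbort = () => controller.abort();
  if (callerSignal?.aborted) controller.abort();
  callerSignal?.addEventListener("abort", onCallerAbort, { once: true });
  const timer = setTimeout(() => {
    timedOut = true; // remember that the abort came from us
    controller.abort();
  }, timeoutMs);
  try {
    return await run(controller.signal);
  } catch (error) {
    throw normalizeAttemptError(error, timedOut, callerSignal?.aborted ?? false, timeoutMs);
  } finally {
    clearTimeout(timer);
    callerSignal?.removeEventListener("abort", onCallerAbort);
  }
}

interface RetryOptions {
  timeoutMs?: number; // assumed default: 5s, per the commit message
  delaysMs?: number[]; // assumed default: [500], i.e. 2 attempts
  signal?: AbortSignal;
}

async function fetchWithRetrySketch<T>(
  run: (signal: AbortSignal) => Promise<T>,
  options: RetryOptions = {},
): Promise<T> {
  const { timeoutMs = 5000, delaysMs = [500], signal } = options;
  for (let attempt = 0; ; attempt++) {
    try {
      return await attemptWithTimeout(run, timeoutMs, signal);
    } catch (error) {
      // A caller-abort is never retried; other errors retry until the budget runs out.
      if (isAbortError(error) || attempt >= delaysMs.length) throw error;
      await new Promise<void>((resolve) => setTimeout(resolve, delaysMs[attempt]));
    }
  }
}
```

A caller would pass something like `(signal) => fetch(url, { signal })` as `run`. Under this sketch, a check such as shouldStopTelegramPolling only sees an AbortError when the caller's own signal fired, because a timeout arrives as an error with code ATTEMPT_TIMEOUT instead.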
2026-06-27 17:31:23 +08:00 · 2026-05-01 18:51:35 +08:00
parent 66898e02b4
commit b2444fd3cd
3 changed files with 59 additions and 19 deletions
@@ -54,6 +54,9 @@ export function shouldStopTelegramPolling(
  signalAborted: boolean,
  error: unknown,
 ): boolean {
+  // AbortError-from-our-own-timeout is normalized in fetchWithRetry so it
+  // can't reach here as a DOMException. Any AbortError that does reach here
+  // is therefore a real caller-abort and should stop the loop.
  return (
    signalAborted ||
    (error instanceof DOMException && error.name === "AbortError")