Stop fetch retry from killing the polling loop

Two real bugs surfaced after the original retry helper landed:

1. Healthy long-polls were tripping our 15s per-attempt timeout. The
   getUpdates request asks Telegram for a 30s server-side long-poll, so
   our internal timeout aborted every healthy connection and turned it
   into ABORT_ERR -> retry -> exhausted -> "disconnected", with no
   auto-reconnect. The fix: long-poll bypasses the retry helper and uses
   a 60s per-attempt timeout, since the poll loop already retries by
   re-entering after sleep().

2. Our own internal AbortController timeout produced a DOMException
   AbortError indistinguishable from a caller-abort. The poll loop's
   shouldStopTelegramPolling treated that as "user wants to stop" and
   exited. Now fetchWithRetry normalizes its own timeout into a tagged
   Error with code ATTEMPT_TIMEOUT, so only real caller-aborts surface
   as AbortError upstream.
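
To illustrate point 2, here is a minimal sketch of the normalization step, not the actual fetchWithRetry code. Only the ATTEMPT_TIMEOUT code and the DOMException AbortError check come from this commit; AttemptTimeoutError, isAbortError, normalizeAttemptError and their parameters are hypothetical names. It assumes the helper records whether its own timer fired and whether the caller's signal was aborted.

```typescript
// Sketch only: AttemptTimeoutError, isAbortError and normalizeAttemptError
// are hypothetical names, not the helper's real API.
class AttemptTimeoutError extends Error {
  readonly code = "ATTEMPT_TIMEOUT";
  constructor(timeoutMs: number) {
    super(`attempt timed out after ${timeoutMs}ms`);
    this.name = "AttemptTimeoutError";
  }
}

// Same check the poll loop uses to detect an abort.
function isAbortError(error: unknown): boolean {
  return error instanceof DOMException && error.name === "AbortError";
}

// Intended to run in the catch block of a single attempt.
// timedOut: our internal timer fired and aborted the request.
// callerAborted: the caller's own signal was aborted.
function normalizeAttemptError(
  error: unknown,
  timedOut: boolean,
  callerAborted: boolean,
  timeoutMs: number,
): unknown {
  if (isAbortError(error) && timedOut && !callerAborted) {
    // Our own timeout: re-tag it so it no longer looks like a caller-abort.
    return new AttemptTimeoutError(timeoutMs);
  }
  // Caller-aborts and all other errors pass through unchanged.
  return error;
}
```

With this shape, an AbortError only reaches shouldStopTelegramPolling when the caller really aborted, and a timeout can be retried like any other transient failure.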

Also: per-attempt timeout default dropped 15s -> 5s, and the retry budget
for outbound sends dropped from [500, 2000] to [500] (so 2 attempts, not
3), since sends serialize and a long retry tail makes the bridge feel
hung.
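
A rough worked example of why the tail matters, under stated assumptions: the retry budget is a list of sleep delays in milliseconds between attempts, every attempt can consume its full per-attempt timeout, and the timeout default applies to sends. attemptCount and worstCaseMs are hypothetical helpers for this illustration only.

```typescript
// Hypothetical helpers: one attempt up front, plus one per retry delay.
function attemptCount(retryDelaysMs: number[]): number {
  return retryDelaysMs.length + 1;
}

// Worst case for one send: every attempt runs to its timeout, and every
// delay in the budget is slept in full between attempts.
function worstCaseMs(perAttemptTimeoutMs: number, retryDelaysMs: number[]): number {
  const sleepTotal = retryDelaysMs.reduce((sum, delay) => sum + delay, 0);
  return perAttemptTimeoutMs * attemptCount(retryDelaysMs) + sleepTotal;
}

const before = worstCaseMs(15_000, [500, 2000]); // 3 attempts, 47500 ms
const after = worstCaseMs(5_000, [500]);         // 2 attempts, 10500 ms
```

Under those assumptions a single stuck send could block the serialized queue for about 47.5s before this change and about 10.5s after it.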

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
wassname
2026-05-01 18:51:35 +08:00
parent 66898e02b4
commit b2444fd3cd
3 changed files with 59 additions and 19 deletions
@@ -54,6 +54,9 @@ export function shouldStopTelegramPolling(
  signalAborted: boolean,
  error: unknown,
): boolean {
  // AbortError-from-our-own-timeout is normalized in fetchWithRetry so it
  // can't reach here as a DOMException. Any AbortError that does reach here
  // is therefore a real caller-abort and should stop the loop.
  return (
    signalAborted ||
    (error instanceof DOMException && error.name === "AbortError")