Commit 5284c4d

refactor: improve retry logic
1 parent 0bb1f66 commit 5284c4d

4 files changed

+447
-380
lines changed

README.md

Lines changed: 60 additions & 17 deletions
@@ -182,9 +182,17 @@ Example `config/tool_config.json`:
182182
export LOGGING="debug"
183183
```
184184
185-
- **`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`**: (Optional) Controls whether to automatically reconnect and retry on SSE tool call failures. Set to `"true"` to enable, `"false"` to disable. Default: `true`. See the "Enhanced Reliability Features" section for details.
185+
- **`RETRY_SSE_TOOL_CALL`**: (Optional) Controls whether to enable retries for SSE tool calls. Set to `"true"` to enable, `"false"` to disable. Default: `true`. See the "Enhanced Reliability Features" section for details.
186186
```bash
187-
export RETRY_SSE_TOOL_CALL_ON_DISCONNECT="true"
187+
export RETRY_SSE_TOOL_CALL="true"
188+
```
189+
- **`SSE_TOOL_CALL_MAX_RETRIES`**: (Optional) Maximum number of retry attempts for SSE tool calls (after the initial failure). Default: `2`. See the "Enhanced Reliability Features" section for details.
190+
```bash
191+
export SSE_TOOL_CALL_MAX_RETRIES="2"
192+
```
193+
- **`SSE_TOOL_CALL_RETRY_DELAY_BASE_MS`**: (Optional) Base delay in milliseconds for SSE tool call retries, used in exponential backoff. Default: `300`. See the "Enhanced Reliability Features" section for details.
194+
```bash
195+
export SSE_TOOL_CALL_RETRY_DELAY_BASE_MS="300"
188196
```
189197
- **`RETRY_HTTP_TOOL_CALL`**: (Optional) Controls whether to retry on HTTP tool call connection errors. Set to `"true"` to enable, `"false"` to disable. Default: `true`. See the "Enhanced Reliability Features" section for details.
190198
```bash
@@ -218,23 +226,33 @@ The MCP Proxy Server includes features to improve its resilience and the reliabi
218226
### 1. Error Propagation
219227
The proxy server ensures that errors originating from backend MCP services are consistently propagated to the requesting client. These errors are formatted as standard JSON-RPC error responses, making it easier for clients to handle them uniformly.
220228
221-
### 2. SSE Connection Retry for Tool Calls
222-
When a `tools/call` operation is made to an SSE-based backend server, and the underlying connection is lost or experiences an error, the proxy server will automatically attempt to:
223-
1. Re-establish the connection to the SSE backend.
224-
2. If reconnection is successful, it will retry the original `tools/call` request **once**.
229+
### 2. SSE Tool Call Retry
230+
When a `tools/call` operation is made to an SSE-based backend server, and the underlying connection is lost or experiences an error (including timeouts), the proxy server implements a retry mechanism.
225231
226-
This behavior helps mitigate transient network issues that might temporarily disrupt SSE connections.
232+
**Retry Mechanism:**
233+
If an initial SSE tool call fails due to a connection error or timeout, the proxy will attempt to re-establish the connection to the SSE backend. If reconnection is successful, it will then retry the original `tools/call` request using an exponential backoff strategy, similar to HTTP and Stdio retries. This means the delay before each subsequent retry attempt increases exponentially, with a small amount of jitter (randomness) added.
227234
228235
**Configuration:**
229-
This feature is primarily controlled by the **`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`** environment variable.
230-
- **`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`** (environment variable):
231-
- Set to `"true"` to enable the automatic reconnect and retry.
236+
These settings are primarily controlled by environment variables. Values in `config/mcp_server.json` under the `proxy` object for these specific keys will be overridden by environment variables if set.
237+
238+
- **`RETRY_SSE_TOOL_CALL`** (environment variable):
239+
- Set to `"true"` to enable retries for SSE tool calls.
232240
- Set to `"false"` to disable this feature.
233241
- **Default Behavior:** `true` (if the environment variable is not set, is empty, or is an invalid value).
234242
235-
**Example (Environment Variable):**
243+
- **`SSE_TOOL_CALL_MAX_RETRIES`** (environment variable):
244+
- Specifies the maximum number of retry attempts *after* the initial failed attempt. For example, if set to `"2"`, there will be one initial attempt and up to two retry attempts, totaling a maximum of three attempts.
245+
- **Default Behavior:** `2` (if the environment variable is not set, is empty, or is not a valid integer).
246+
247+
- **`SSE_TOOL_CALL_RETRY_DELAY_BASE_MS`** (environment variable):
248+
- The base delay in milliseconds used in the exponential backoff calculation. The delay before the *n*-th retry (0-indexed) is roughly `SSE_TOOL_CALL_RETRY_DELAY_BASE_MS * (2^n) + jitter`.
249+
- **Default Behavior:** `300` (milliseconds) (if the environment variable is not set, is empty, or is not a valid integer).
250+
251+
**Example (Environment Variables):**
236252
```bash
237-
export RETRY_SSE_TOOL_CALL_ON_DISCONNECT="true"
253+
export RETRY_SSE_TOOL_CALL="true"
254+
export SSE_TOOL_CALL_MAX_RETRIES="3"
255+
export SSE_TOOL_CALL_RETRY_DELAY_BASE_MS="500"
238256
```
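The backoff formula described above can be sketched as follows. This is a minimal illustration in Python, not the proxy's actual code: the function name `backoff_delay_ms` and the jitter range (`max_jitter_ms`, 100 ms here) are assumptions, since the README only says that a "small amount" of jitter is added.

```python
import random


def backoff_delay_ms(base_ms: int, retry_index: int, max_jitter_ms: int = 100) -> float:
    """Delay before the retry with the given 0-indexed number.

    Follows the documented shape: base * 2^n + jitter.
    The jitter range is an assumption made for this sketch.
    """
    jitter = random.uniform(0, max_jitter_ms)
    return base_ms * (2 ** retry_index) + jitter


if __name__ == "__main__":
    # With the documented defaults (base 300 ms, 2 retries), the delays
    # are roughly 300 ms and 600 ms, plus jitter.
    for n in range(2):
        print(f"retry {n}: ~{backoff_delay_ms(300, n):.0f} ms")
```

With `SSE_TOOL_CALL_RETRY_DELAY_BASE_MS="500"` and `SSE_TOOL_CALL_MAX_RETRIES="3"`, the same formula gives delays of roughly 500, 1000, and 2000 ms before jitter.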
239257
240258
### 3. HTTP Request Retry for Tool Calls
@@ -282,8 +300,8 @@ These settings are primarily controlled by environment variables.
282300
- **Default Behavior:** `300` (milliseconds) (if the environment variable is not set, is empty, or is not a valid integer).
283301
284302
**General Notes on Environment Variable Parsing:**
285-
- Boolean environment variables (`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`, `RETRY_HTTP_TOOL_CALL`, `RETRY_STDIO_TOOL_CALL`) are considered `true` if their lowercase value is exactly `"true"`. Any other value (including empty or not set) results in the default being applied or `false` if the default is `false` (though for these specific variables, the default is `true`).
286-
- Numeric environment variables (`HTTP_TOOL_CALL_MAX_RETRIES`, `HTTP_TOOL_CALL_RETRY_DELAY_BASE_MS`, `STDIO_TOOL_CALL_MAX_RETRIES`, `STDIO_TOOL_CALL_RETRY_DELAY_BASE_MS`) are parsed as base-10 integers. If parsing fails (e.g., the value is not a number, or the variable is empty/not set), the default value is used.
303+
- Boolean environment variables (`RETRY_SSE_TOOL_CALL`, `RETRY_HTTP_TOOL_CALL`, `RETRY_STDIO_TOOL_CALL`) are considered `true` if their lowercase value is exactly `"true"`. Any other value (including empty or not set) results in the default being applied or `false` if the default is `false` (though for these specific variables, the default is `true`).
304+
- Numeric environment variables (`SSE_TOOL_CALL_MAX_RETRIES`, `SSE_TOOL_CALL_RETRY_DELAY_BASE_MS`, `HTTP_TOOL_CALL_MAX_RETRIES`, `HTTP_TOOL_CALL_RETRY_DELAY_BASE_MS`, `STDIO_TOOL_CALL_MAX_RETRIES`, `STDIO_TOOL_CALL_RETRY_DELAY_BASE_MS`) are parsed as base-10 integers. If parsing fails (e.g., the value is not a number, or the variable is empty/not set), the default value is used.
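The parsing rules above might look like the following sketch. It is written in Python for illustration only, and the helper names `parse_bool_env` and `parse_int_env` are hypothetical. One assumption is made explicit: an explicit `"false"` disables the feature (as the per-variable bullets state), and only unset, empty, or unrecognized values fall back to the default.

```python
import os
from typing import Mapping, Optional


def parse_bool_env(env: Mapping[str, str], name: str, default: bool = True) -> bool:
    """Return True/False for "true"/"false" (case-insensitive), else the default."""
    raw: Optional[str] = env.get(name)
    if raw is None:
        return default  # variable not set
    value = raw.lower()
    if value == "true":
        return True
    if value == "false":
        return False  # assumption: an explicit "false" disables the feature
    return default  # empty or invalid value


def parse_int_env(env: Mapping[str, str], name: str, default: int) -> int:
    """Parse a base-10 integer, falling back to the default on any failure."""
    raw = env.get(name)
    if raw is None:
        return default  # variable not set
    try:
        return int(raw, 10)
    except ValueError:
        return default  # empty string or not a valid integer


if __name__ == "__main__":
    print(parse_bool_env(os.environ, "RETRY_SSE_TOOL_CALL"))
    print(parse_int_env(os.environ, "SSE_TOOL_CALL_MAX_RETRIES", 2))
```

Passing the environment as a mapping keeps the helpers easy to test without touching the real process environment.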
287305
288306
## Development
289307
@@ -336,16 +354,41 @@ These settings are primarily controlled by environment variables. Values in `con
336354
- **Default Behavior:** `300` (milliseconds) (if the environment variable is not set, is empty, or is not a valid integer).
337355
338356
**General Notes on Environment Variable Parsing:**
339-
- Boolean environment variables (`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`, `RETRY_HTTP_TOOL_CALL`) are considered `true` if their lowercase value is exactly `"true"`. Any other value (including empty or not set) results in the default being applied or `false` if the default is `false` (though for these specific variables, the default is `true`).
340-
- Numeric environment variables (`HTTP_TOOL_CALL_MAX_RETRIES`, `HTTP_TOOL_CALL_RETRY_DELAY_BASE_MS`) are parsed as base-10 integers. If parsing fails (e.g., the value is not a number, or the variable is empty/not set), the default value is used.
357+
- Boolean environment variables (`RETRY_SSE_TOOL_CALL`, `RETRY_HTTP_TOOL_CALL`, `RETRY_STDIO_TOOL_CALL`) are considered `true` if their lowercase value is exactly `"true"`. Any other value (including empty or not set) results in the default being applied or `false` if the default is `false` (though for these specific variables, the default is `true`).
358+
- Numeric environment variables (`SSE_TOOL_CALL_MAX_RETRIES`, `SSE_TOOL_CALL_RETRY_DELAY_BASE_MS`, `HTTP_TOOL_CALL_MAX_RETRIES`, `HTTP_TOOL_CALL_RETRY_DELAY_BASE_MS`, `STDIO_TOOL_CALL_MAX_RETRIES`, `STDIO_TOOL_CALL_RETRY_DELAY_BASE_MS`) are parsed as base-10 integers. If parsing fails (e.g., the value is not a number, or the variable is empty/not set), the default value is used.
341359
342360
**Example (Environment Variables):**
343361
```bash
344362
export RETRY_HTTP_TOOL_CALL="true"
345363
export HTTP_TOOL_CALL_MAX_RETRIES="3"
346364
export HTTP_TOOL_CALL_RETRY_DELAY_BASE_MS="500"
347365
```
348-
*(The JSON example for `mcp_server.json` under "Proxy Behavior Configuration" illustrates where other, non-environment-overrideable proxy settings might go.)*
366+
367+
### 4. Stdio Connection Retry for Tool Calls
368+
For `tools/call` operations directed to Stdio-based backend servers, the proxy implements a retry mechanism for connection errors (e.g., process crash or unresponsiveness).
369+
370+
**Retry Mechanism:**
371+
If an initial Stdio connection or tool call fails, the proxy will attempt to restart the Stdio process and retry the request. This mechanism follows an exponential backoff strategy similar to HTTP retries.
372+
373+
**Configuration:**
374+
These settings are primarily controlled by environment variables.
375+
376+
- **`RETRY_STDIO_TOOL_CALL`** (environment variable):
377+
- Set to `"true"` to enable Stdio tool call retries.
378+
- Set to `"false"` to disable this feature.
379+
- **Default Behavior:** `true` (if the environment variable is not set, is empty, or is an invalid value).
380+
381+
- **`STDIO_TOOL_CALL_MAX_RETRIES`** (environment variable):
382+
- Specifies the maximum number of retry attempts *after* the initial failed attempt. For example, if set to `"2"`, there will be one initial attempt and up to two retry attempts, totaling a maximum of three attempts.
383+
- **Default Behavior:** `2` (if the environment variable is not set, is empty, or is not a valid integer).
384+
385+
- **`STDIO_TOOL_CALL_RETRY_DELAY_BASE_MS`** (environment variable):
386+
- The base delay in milliseconds used in the exponential backoff calculation. The delay before the *n*-th retry (0-indexed) is roughly `STDIO_TOOL_CALL_RETRY_DELAY_BASE_MS * (2^n) + jitter`.
387+
- **Default Behavior:** `300` (milliseconds) (if the environment variable is not set, is empty, or is not a valid integer).
388+
389+
**General Notes on Environment Variable Parsing:**
390+
- Boolean environment variables (`RETRY_SSE_TOOL_CALL`, `RETRY_HTTP_TOOL_CALL`, `RETRY_STDIO_TOOL_CALL`) are considered `true` if their lowercase value is exactly `"true"`. Any other value (including empty or not set) results in the default being applied or `false` if the default is `false` (though for these specific variables, the default is `true`).
391+
- Numeric environment variables (`SSE_TOOL_CALL_MAX_RETRIES`, `SSE_TOOL_CALL_RETRY_DELAY_BASE_MS`, `HTTP_TOOL_CALL_MAX_RETRIES`, `HTTP_TOOL_CALL_RETRY_DELAY_BASE_MS`, `STDIO_TOOL_CALL_MAX_RETRIES`, `STDIO_TOOL_CALL_RETRY_DELAY_BASE_MS`) are parsed as base-10 integers. If parsing fails (e.g., the value is not a number, or the variable is empty/not set), the default value is used.
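The retry flow described in this section (one initial attempt, then up to `STDIO_TOOL_CALL_MAX_RETRIES` retries with exponential backoff) can be sketched as below. This is an illustrative Python sketch, not the proxy's implementation: `call_with_retries`, `call`, and `recover` are hypothetical names, `ConnectionError` stands in for whatever the proxy treats as a connection failure, and the order of waiting before recovering, as well as the 50 ms jitter range, are assumptions.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    call: Callable[[], T],
    recover: Callable[[], None],
    max_retries: int = 2,
    base_delay_ms: int = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`; on a connection error, back off, recover, and retry.

    `max_retries` counts retries after the initial attempt, so the total
    number of attempts is at most max_retries + 1.
    """
    retry_index = 0
    while True:
        try:
            return call()
        except ConnectionError:
            if retry_index >= max_retries:
                raise  # retries exhausted: surface the last error
            # Documented shape: base * 2^n + jitter (jitter range assumed).
            delay_ms = base_delay_ms * (2 ** retry_index) + random.uniform(0, 50)
            sleep(delay_ms / 1000)
            # E.g. restart the Stdio process or reconnect to the backend.
            # An error raised here propagates to the caller in this sketch.
            recover()
            retry_index += 1
```

With the defaults, a call that keeps failing is attempted three times in total before the error is raised to the caller.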
349392
350393
## Development
351394

README_ZH.md

Lines changed: 30 additions & 12 deletions
@@ -183,9 +183,17 @@
183183
export LOGGING="debug"
184184
```
185185

186-
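- **`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`**: (Optional) Controls whether to automatically reconnect and retry when an SSE tool call fails. Set to `"true"` to enable, `"false"` to disable. Default: `true`. See the "Enhanced Reliability Features" section for details.
186+
- **`RETRY_SSE_TOOL_CALL`**: (Optional) Controls whether to automatically reconnect and retry when an SSE tool call fails. Set to `"true"` to enable, `"false"` to disable. Default: `true`. See the "Enhanced Reliability Features" section for details.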
187187
```bash
188-
export RETRY_SSE_TOOL_CALL_ON_DISCONNECT="true"
188+
export RETRY_SSE_TOOL_CALL="true"
189+
```
190+
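- **`SSE_TOOL_CALL_MAX_RETRIES`**: (Optional) Maximum number of retries for SSE tool calls (after the initial failure). Default: `2`. See the "Enhanced Reliability Features" section for details.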
191+
```bash
192+
export SSE_TOOL_CALL_MAX_RETRIES="2"
193+
```
194+
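- **`SSE_TOOL_CALL_RETRY_DELAY_BASE_MS`**: (Optional) Base retry delay for SSE tool calls, in milliseconds, used for exponential backoff. Default: `300`. See the "Enhanced Reliability Features" section for details.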
195+
```bash
196+
export SSE_TOOL_CALL_RETRY_DELAY_BASE_MS="300"
189197
```
190198
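- **`RETRY_HTTP_TOOL_CALL`**: (Optional) Controls whether to retry on HTTP tool call connection errors. Set to `"true"` to enable, `"false"` to disable. Default: `true`. See the "Enhanced Reliability Features" section for details.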
191199
```bash
@@ -220,22 +228,32 @@ The MCP Proxy Server includes several features to improve its own resilience and
220228
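The proxy server ensures that errors originating from backend MCP services are propagated consistently to the requesting client. These errors are formatted as standard JSON-RPC error responses, making it easier for clients to handle them uniformly.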
221229

222230
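### 2. SSE Connection Retry for Tool Calls
223-
When a `tools/call` operation is made to an SSE-based backend server and the underlying connection is lost or encounters an error, the proxy server will automatically attempt to:
224-
1. Re-establish the connection to the SSE backend.
225-
2. If reconnection succeeds, retry the original `tools/call` request **once**.
231+
When a `tools/call` operation is made to an SSE-based backend server and the underlying connection is lost or encounters an error (including timeouts), the proxy server applies a retry mechanism.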
226232

227-
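This behavior helps mitigate transient network issues that may temporarily disrupt SSE connections.
233+
**Retry Mechanism:**
234+
If the initial SSE tool call fails due to a connection error or timeout, the proxy will attempt to re-establish the connection to the SSE backend. If reconnection succeeds, it retries the original `tools/call` request using an exponential backoff strategy, similar to the HTTP and Stdio retries. This means the delay before each subsequent retry attempt grows exponentially, with a small amount of jitter (randomness) added.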
228235

229236
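**Configuration:**
230-
This feature is primarily controlled by the **`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`** environment variable.
231-
- **`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`** (environment variable):
232-
- Set to `"true"` to enable automatic reconnect and retry.
237+
These settings are primarily controlled by environment variables. If values for these specific keys exist under the `proxy` object in `config/mcp_server.json`, they are overridden by the environment variables.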
238+
239+
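- **`RETRY_SSE_TOOL_CALL`** (environment variable):
240+
- Set to `"true"` to enable retries for SSE tool calls.
233241
- Set to `"false"` to disable this feature.
234242
- **Default Behavior:** `true` (if the environment variable is not set, is empty, or is an invalid value).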
235243

244+
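- **`SSE_TOOL_CALL_MAX_RETRIES`** (environment variable):
245+
- Specifies the maximum number of retry attempts *after* the initial failed attempt. For example, if set to `"2"`, there will be one initial attempt and up to two retry attempts, for a maximum of three attempts in total.
246+
- **Default Behavior:** `2` (if the environment variable is not set, is empty, or is not a valid integer).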
247+
248+
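- **`SSE_TOOL_CALL_RETRY_DELAY_BASE_MS`** (environment variable):
249+
- The base delay in milliseconds used in the exponential backoff calculation. The delay before the *n*-th retry (0-indexed) is roughly `SSE_TOOL_CALL_RETRY_DELAY_BASE_MS * (2^n) + jitter`.
250+
- **Default Behavior:** `300` (milliseconds) (if the environment variable is not set, is empty, or is not a valid integer).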
251+
236252
**Example (Environment Variables):**
237253
```bash
238-
export RETRY_SSE_TOOL_CALL_ON_DISCONNECT="true"
254+
export RETRY_SSE_TOOL_CALL="true"
255+
export SSE_TOOL_CALL_MAX_RETRIES="3"
256+
export SSE_TOOL_CALL_RETRY_DELAY_BASE_MS="500"
239257
```
240258

241259
### 3. HTTP Request Retry for Tool Calls
@@ -283,8 +301,8 @@ export RETRY_SSE_TOOL_CALL_ON_DISCONNECT="true"
283301
- **Default Behavior:** `300` (milliseconds) (if the environment variable is not set, is empty, or is not a valid integer).
284302

285303
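**General Notes on Environment Variable Parsing:**
286-
- Boolean environment variables (`RETRY_SSE_TOOL_CALL_ON_DISCONNECT`, `RETRY_HTTP_TOOL_CALL`, `RETRY_STDIO_TOOL_CALL`) are considered `true` if their lowercase value is exactly `"true"`. Any other value (including empty or not set) results in the default being applied, or `false` if the default is `false` (though for these specific variables, the default is `true`).
287-
- Numeric environment variables (`HTTP_TOOL_CALL_MAX_RETRIES`, `HTTP_TOOL_CALL_RETRY_DELAY_BASE_MS`, `STDIO_TOOL_CALL_MAX_RETRIES`, `STDIO_TOOL_CALL_RETRY_DELAY_BASE_MS`) are parsed as base-10 integers. If parsing fails (e.g., the value is not a number, or the variable is empty/not set), the default value is used.
304+
- Boolean environment variables (`RETRY_SSE_TOOL_CALL`, `RETRY_HTTP_TOOL_CALL`, `RETRY_STDIO_TOOL_CALL`) are considered `true` if their lowercase value is exactly `"true"`. Any other value (including empty or not set) results in the default being applied, or `false` if the default is `false` (though for these specific variables, the default is `true`).
305+
- Numeric environment variables (`SSE_TOOL_CALL_MAX_RETRIES`, `SSE_TOOL_CALL_RETRY_DELAY_BASE_MS`, `HTTP_TOOL_CALL_MAX_RETRIES`, `HTTP_TOOL_CALL_RETRY_DELAY_BASE_MS`, `STDIO_TOOL_CALL_MAX_RETRIES`, `STDIO_TOOL_CALL_RETRY_DELAY_BASE_MS`) are parsed as base-10 integers. If parsing fails (e.g., the value is not a number, or the variable is empty/not set), the default value is used.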
288306

289307
## Development
290308
