Fix RpcUtils.retry() silently swallowing timeout and max-retry exceptions, returning null #1878

Merged
sre-ci-robot merged 2 commits into
milvus-io:master from
eye-gu:fix-1877-retry-swallow-exceptions
Jun 1, 2026

Conversation

eye-gu (Contributor) commented May 29, 2026

close #1877

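  1. Change the timeoutChecker type from Callable to Function<Long, Boolean>
    Original code: timeoutChecker was a Callable, and call() declares a checked exception, so the caller was forced to wrap it in try-catch. The catch block (ignored) indiscriminately swallowed every exception, including the MilvusClientException thrown on timeout.

    It is now a Function<Long, Boolean>, invoked as timeoutChecker.apply(System.currentTimeMillis()). No checked exception has to be caught, so the timeout exception propagates upward correctly.

  2. Remove the outer try-catch wrapping the k >= maxRetryTimes exit logic
    Original code: the MilvusClientException thrown after reaching the maximum retry count was swallowed by the outer try { ... throw ... } catch (Exception ignored) {}, so the loop ended normally and the method hit return null.

    Now only the sleep is wrapped in a catch.

  3. Predict before the backoff sleep whether it would time out, and exit early
    Original code: the backoff interval was slept directly, with no prediction at all. When the exponential backoff multiplier is large, the last sleep could push the cumulative elapsed time far beyond maxRetryTimeoutMs. The change avoids that pointless waiting.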
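To make the failure mode in this description concrete, here is a minimal sketch of the pattern it describes. It is not the actual pre-fix RpcUtils source: the class SwallowedExceptionDemo, the method buggyRetry, and DemoException (a stand-in for MilvusClientException) are hypothetical names used only for illustration.

```java
import java.util.concurrent.Callable;

public class SwallowedExceptionDemo {
    /** Hypothetical stand-in for MilvusClientException. */
    public static class DemoException extends RuntimeException {
        public DemoException(String msg) { super(msg); }
    }

    /**
     * Mimics the control flow described above (an assumption, not the real code):
     * the terminal exception is thrown inside a try block whose catch-all discards it.
     */
    public static <T> T buggyRetry(Callable<T> call, int maxRetryTimes) {
        for (int k = 1; k <= maxRetryTimes; k++) {
            try {
                return call.call();
            } catch (Exception callFailure) {
                try {
                    if (k >= maxRetryTimes) {
                        throw new DemoException("max retry times reached");
                    }
                    Thread.sleep(1); // backoff between attempts
                } catch (Exception ignored) {
                    // Catches InterruptedException AND the DemoException thrown above.
                }
            }
        }
        return null; // reached once the last attempt has failed
    }

    public static void main(String[] args) {
        Object result = buggyRetry(() -> { throw new IllegalStateException("rpc failed"); }, 3);
        System.out.println(result); // prints "null"
    }
}
```

The caller receives null with no indication that every attempt failed, which matches the symptom reported in #1877.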

@sre-ci-robot sre-ci-robot requested review from xiaofan-luan and yhmo May 29, 2026 11:46
@sre-ci-robot

Welcome @eye-gu! It looks like this is your first PR to milvus-io/milvus-sdk-java 🎉

…ions, returning null

Signed-off-by: eye-gu <734164350@qq.com>
@eye-gu eye-gu force-pushed the fix-1877-retry-swallow-exceptions branch from 6ba2aa0 to 9e74de9 Compare May 29, 2026 11:53
@mergify mergify Bot added dco-passed and removed needs-dco labels May 29, 2026
@yhmo yhmo requested a review from Copilot June 1, 2026 02:27

Copilot AI left a comment

Pull request overview

Fixes issue #1877 where RpcUtils.retry() silently swallowed MilvusClientException thrown for timeout and max-retry exit paths, causing the method to return null instead of surfacing the failure. Also adds early-exit logic that detects when the next backoff sleep would push the total elapsed time past maxRetryTimeoutMs, avoiding unnecessary waits.

Changes:

  • Change timeoutChecker from Callable<Boolean> to Function<Long, Boolean> so the caller no longer needs a try/catch (Exception ignored) wrapper that hid the timeout exception.
  • Remove the outer try/catch (Exception ignored) around the k >= maxRetryTimes branch so the terminal MilvusClientException actually propagates; only Thread.sleep is now wrapped.
  • Predict whether the next backoff sleep will exceed maxRetryTimeoutMs and throw a TIMEOUT exception early, plus new RpcUtilsTest covering timeout, max-retry, and slow-call scenarios.
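The three changes listed above could fit together roughly as follows. This is a sketch under stated assumptions, not the merged RpcUtils code: RetrySketch, RetryException (a stand-in for MilvusClientException), the parameter names, the convention that a timeout of 0 disables the check, and the interrupt handling are all hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.function.Function;

public class RetrySketch {
    /** Hypothetical unchecked stand-in for MilvusClientException. */
    public static class RetryException extends RuntimeException {
        public RetryException(String msg) { super(msg); }
    }

    public static <T> T retry(Callable<T> call, int maxRetryTimes, long maxRetryTimeoutMs,
                              long initialBackoffMs, int backoffMultiplier) {
        final long begin = System.currentTimeMillis();
        // Function.apply declares no checked exception, so no catch-all wrapper is needed.
        Function<Long, Boolean> timeoutChecker = now -> {
            // Assumption: a timeout of 0 means "no timeout".
            if (maxRetryTimeoutMs > 0 && now - begin >= maxRetryTimeoutMs) {
                throw new RetryException("retry timed out");
            }
            return Boolean.FALSE;
        };

        long backoffMs = initialBackoffMs;
        for (int k = 1; ; k++) {
            try {
                return call.call();
            } catch (Exception e) {
                // Thrown from the catch block itself, so nothing swallows it.
                if (k >= maxRetryTimes) {
                    throw new RetryException("max retry times reached: " + e.getMessage());
                }
            }
            timeoutChecker.apply(System.currentTimeMillis());

            // Pre-sleep prediction: give up now if the next sleep would cross the deadline.
            long elapsed = System.currentTimeMillis() - begin;
            if (maxRetryTimeoutMs > 0 && elapsed + backoffMs >= maxRetryTimeoutMs) {
                throw new RetryException("next backoff would exceed the retry timeout");
            }
            try {
                Thread.sleep(backoffMs); // only the sleep is wrapped
            } catch (InterruptedException ie) {
                // Assumption: restore the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                throw new RetryException("interrupted during backoff");
            }
            backoffMs *= backoffMultiplier;
        }
    }
}
```

With this shape the method either returns the call's result or throws, so a null return can only come from the wrapped call itself.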

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File | Description
sdk-core/src/main/java/io/milvus/v2/utils/RpcUtils.java | Rework retry-loop exception handling so timeout/max-retry exceptions propagate; add pre-sleep timeout prediction.
sdk-core/src/test/java/io/milvus/v2/utils/RpcUtilsTest.java | New unit tests covering early backoff exit, max retry exhaustion, and slow-call timeout.

Comment thread sdk-core/src/main/java/io/milvus/v2/utils/RpcUtils.java
Comment thread sdk-core/src/test/java/io/milvus/v2/utils/RpcUtilsTest.java Outdated
@mergify mergify Bot added the ci-passed label Jun 1, 2026
Signed-off-by: eye-gu <734164350@qq.com>
@mergify mergify Bot removed the ci-passed label Jun 1, 2026
@yhmo
Contributor

yhmo commented Jun 1, 2026

@eye-gu
Thanks for fixing this issue.

@yhmo
Contributor

yhmo commented Jun 1, 2026

/lgtm
/approve

@sre-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: eye-gu, yhmo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@mergify mergify Bot added the ci-passed label Jun 1, 2026
@sre-ci-robot sre-ci-robot merged commit 216c84e into milvus-io:master Jun 1, 2026
6 checks passed
sre-ci-robot pushed a commit that referenced this pull request Jun 1, 2026
…ions, returning null (#1878) (#1881)

* Fix RpcUtils.retry() silently swallowing timeout and max-retry exceptions, returning null



* fix InterruptedException



---------

Signed-off-by: eye-gu <734164350@qq.com>
Co-authored-by: eye-gu <734164350@qq.com>
sre-ci-robot pushed a commit that referenced this pull request Jun 1, 2026
…ions, returning null (#1878) (#1880)

* Fix RpcUtils.retry() silently swallowing timeout and max-retry exceptions, returning null



* fix InterruptedException



---------

Signed-off-by: eye-gu <734164350@qq.com>
Co-authored-by: eye-gu <734164350@qq.com>
Development

Successfully merging this pull request may close these issues.

RpcUtils.retry() silently swallows timeout and max-retry exceptions, causing it to return null

4 participants