Fix RpcUtils.retry() silently swallowing timeout and max-retry exceptions, returning null#1878
Conversation
|
Welcome @eye-gu! It looks like this is your first PR to milvus-io/milvus-sdk-java 🎉 |
…ions, returning null Signed-off-by: eye-gu <734164350@qq.com>
6ba2aa0 to
9e74de9
Compare
There was a problem hiding this comment.
Pull request overview
Fixes issue #1877 where RpcUtils.retry() silently swallowed MilvusClientException thrown for timeout and max-retry exit paths, causing the method to return null instead of surfacing the failure. Also adds early-exit logic that detects when the next backoff sleep would push the total elapsed time past maxRetryTimeoutMs, avoiding unnecessary waits.
Changes:
- Change
timeoutCheckerfromCallable<Boolean>toFunction<Long, Boolean>so the caller no longer needs atry/catch (Exception ignored)wrapper that hid the timeout exception. - Remove the outer
try/catch (Exception ignored)around thek >= maxRetryTimesbranch so the terminalMilvusClientExceptionactually propagates; onlyThread.sleepis now wrapped. - Predict whether the next backoff sleep will exceed
maxRetryTimeoutMsand throw a TIMEOUT exception early, plus newRpcUtilsTestcovering timeout, max-retry, and slow-call scenarios.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| sdk-core/src/main/java/io/milvus/v2/utils/RpcUtils.java | Rework retry-loop exception handling so timeout/max-retry exceptions propagate; add pre-sleep timeout prediction. |
| sdk-core/src/test/java/io/milvus/v2/utils/RpcUtilsTest.java | New unit tests covering early backoff exit, max retry exhaustion, and slow-call timeout. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Signed-off-by: eye-gu <734164350@qq.com>
|
@eye-gu |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: eye-gu, yhmo The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
close #1877
原来的代码:timeoutChecker 是 Callable,call() 声明了受检异常,调用方被迫 try-catch 包裹,catch 块 ignored 无差别吞掉了所有异常——包括超时时抛出的 MilvusClientException.
修改为 Function<Long, Boolean>,通过 timeoutChecker.apply(System.currentTimeMillis()) 调用,无需捕获受检异常,超时异常能正确向上传播
原来的代码:达到最大重试次数后抛出的 MilvusClientException 被外层 try { ... throw ... } catch (Exception ignored) {} 吞掉,导致循环正常结束并 return null。
修改为只对sleep进行catch
原来的代码:退避间隔直接 sleep,没有做任何预判。当指数退避乘数较大时,最后一次 sleep 可能使累计耗时远超 maxRetryTimeoutMs.减少无意义的等待