fix(core): spawn blocking ops onto worker pool to avoid stack overflow #7371
Merged
Xuanwo merged 2 commits into apache:main on Apr 10, 2026
blocking::Operator previously called Handle::block_on(), which polls the entire async state machine on the calling thread's stack. For backends with deep async call chains (e.g. HF/XET uploads going through retry layers, bridge_async, upload commits, and CAS streams), this can exceed the default 1 MB thread stack on Linux, causing SIGSEGV in the Java binding, where JVM threads use this default.
Replace direct block_on() calls with spawn() + block_on(JoinHandle) for the main I/O operations (stat, read, write, copy, rename, delete, list, create_dir). The async future now runs on tokio worker threads (which have adequate stack space) while the calling thread only waits on a lightweight JoinHandle.
Closes apache#7367
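The hand-off described above can be illustrated without tokio. The following is a minimal sketch using only std threads, not the code from this PR: `run_on_worker` and `deep_chain` are hypothetical names, the recursion merely stands in for a deep backend call chain, and the 8 MiB stack size is an arbitrary choice for the example. `thread::Builder::spawn` plays the role of `Handle::spawn`, and joining the thread plays the role of `block_on(JoinHandle)`.

```rust
use std::hint::black_box;
use std::thread;

// Hypothetical stand-in for a deep backend call chain: every frame keeps a
// 1 KiB buffer, so in an unoptimized build stack use grows with `depth`.
fn deep_chain(depth: usize) -> usize {
    let frame = black_box([1u8; 1024]);
    if depth == 0 {
        return frame[0] as usize;
    }
    // Not a tail call: the addition happens after the recursive call returns.
    deep_chain(depth - 1) + frame[0] as usize
}

// Run `work` on a dedicated thread with an explicit stack size. The caller
// only blocks on the join handle, so the deep frames never land on its stack.
fn run_on_worker<T, F>(stack_bytes: usize, work: F) -> T
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(work)
        .expect("failed to spawn worker thread")
        .join()
        .expect("worker thread panicked")
}

fn main() {
    // The worker gets 8 MiB of stack regardless of the caller's own limit.
    let frames = run_on_worker(8 * 1024 * 1024, || deep_chain(1_000));
    println!("frames visited: {frames}");
}
```

The point of the sketch is only where the stack is consumed: the caller's frame count stays constant no matter how deep `work` recurses, which is the property the PR relies on when a JVM thread with a small stack calls into the blocking API.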
Summary
- Fixes the SIGSEGV in BlockingWriteTest on Linux x86_64 when using the HF/XET backend
Root cause
blocking::Operator called Handle::block_on(), which polls the entire async state machine on the calling thread's stack. For backends with deep async call chains, such as HF/XET uploads that go through retry layers, bridge_async, upload commits, and CAS streams, this exceeds the default 1 MB thread stack size on Linux, hitting a guard page → SIGSEGV (SEGV_ACCERR).
Why macOS wasn't affected: macOS default thread stack is 8 MB.
Why pure Rust tests passed: Rust's default thread stack is 8 MB.
Why the JVM crashed without an hs_err file: The overflow happened on the JVM's own thread before its crash handler could activate.
Workaround
Increase the Java thread stack size with the -Xss JVM option.
Fix
Replace direct Handle::block_on(future) calls with Handle::block_on(Handle::spawn(future)) for the main I/O operations (stat, read, write, copy, rename, delete, list, create_dir).
The async future now runs on tokio worker threads (which have adequate stack space, typically 2-8 MB) while the calling thread only waits on a lightweight JoinHandle. This keeps the calling thread's stack usage minimal regardless of how deep the backend's async state machine goes.
Operations that return streaming handles (reader, writer, lister, deleter) still use direct block_on since they only set up lightweight state; the actual I/O happens through separate blocking wrapper types that already manage their own block_on calls.
Verification
Tested in an x86_64 Docker container (Rosetta) with:
- a 1 MB JVM thread stack (-Xss1m)
- BlockingWriteTest: all 3 tests pass without any -Xss workaround
Test plan
- BlockingWriteTest passes on Linux x86_64 via Docker (previously SIGSEGV)
- cargo check -p opendal-core --features blocking compiles cleanly
Closes #7367