Description
✅ I checked the Altinity Stable Builds lifecycle table, and the Altinity Stable Build version I'm using is still supported.
Type of problem
Bug report - something's broken
Describe the situation
The ClickHouse server crashes with SIGABRT on debug and sanitizer builds, or returns STD_EXCEPTION (code 1001) on release builds, when executing a query that combines a correlated subquery inside EXISTS with a QUALIFY clause while allow_experimental_correlated_subqueries is enabled.
The crash originates from a std::out_of_range exception thrown by a bounds-checked vector access in DB::CrossJoinResult::next() during query pipeline execution.
This issue:
- Causes SIGABRT (server crash) on debug and sanitizer builds (msan, tsan, asan, ubsan)
- Returns error code 1001 (STD_EXCEPTION) on release builds; the server stays up
- Is a pre-existing upstream bug, present in the upstream ClickHouse CI since at least July 2024 (version 24.7), with 51+ occurrences across versions 24.7, 24.9, 24.10, 24.11, 25.3, 25.8, 26.1, and 26.2
- Was detected in Altinity Antalya 25.8.16 via the AST fuzzer (amd_msan) CI job
- Has no existing upstream GitHub issue tracking it, as far as we could find
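The difference between build types listed above can be illustrated with a minimal C++ sketch. It is not ClickHouse code: `checked_read`, `execute_sketch`, and the `abort_on_internal_error` flag are hypothetical names, and the flow is only an approximation of what the stack traces in this report suggest (a bounds-checked vector access throws, and the top-level handler either aborts or converts the exception into code 1001).

```cpp
#include <cstddef>
#include <cstdlib>
#include <exception>
#include <iostream>
#include <stdexcept>
#include <vector>

// Hypothetical constant mirroring the STD_EXCEPTION code quoted in this report.
constexpr int STD_EXCEPTION_CODE = 1001;

// Bounds-checked read: std::vector::at throws std::out_of_range on a bad index.
int checked_read(const std::vector<int> & columns, std::size_t index)
{
    return columns.at(index);
}

// Hypothetical stand-in for a top-level exception handler.
// abort_on_internal_error = true models debug/sanitizer builds,
// abort_on_internal_error = false models release builds.
int execute_sketch(bool abort_on_internal_error)
{
    const std::vector<int> columns{10, 20, 30};
    try
    {
        // Index one past the end, so at() throws instead of reading garbage.
        (void)checked_read(columns, columns.size());
        return 0;
    }
    catch (const std::exception & e)
    {
        // The what() text is implementation-specific; libc++ reports "vector".
        std::cerr << "std::exception caught: " << e.what() << '\n';
        if (abort_on_internal_error)
            std::abort(); // raises SIGABRT, as seen in the CI run
        return STD_EXCEPTION_CODE; // the error reaches the client, process survives
    }
}
```

Calling `execute_sketch(false)` returns 1001 and leaves the process running, while `execute_sketch(true)` terminates it with SIGABRT. Either way the underlying defect is the same out-of-range access; only the handling differs.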
How to reproduce the behavior
Environment
- Version: 25.8.16.20001.altinityantalya (also reproducible on upstream 24.7+)
- Build type: Any (crashes on debug/sanitizer, returns error on release)
Steps
- Create a simple test table:
CREATE TABLE test (i1 Int64, i2 Int64) ENGINE = MergeTree ORDER BY i1;
INSERT INTO test VALUES (1, 2), (3, 4), (5, 6);
- Enable the required experimental settings:
SET allow_experimental_correlated_subqueries = 1;
SET allow_experimental_analyzer = 1;
- Run the crashing query:
SELECT 1 FROM test AS t1 WHERE exists((
SELECT 76, 8, (t2.i2 = t1.i1) AND 13, *
FROM test AS t2 WHERE materialize(36) AND 13
)) QUALIFY 3;
Expected behavior
The query should either return a valid result or a user-facing error (e.g., NOT_IMPLEMENTED, SYNTAX_ERROR). An internal error such as STD_EXCEPTION (code 1001), which debug and sanitizer builds treat as a logical error, should never be reachable by any user query.
Actual behavior
On release builds
The server returns an error but stays up:
Received exception from server (version 25.8.16):
Code: 1001. DB::Exception: Received from localhost:9000. DB::Exception: std::out_of_range: vector. (STD_EXCEPTION)
On debug/sanitizer builds (CI)
The server crashes with SIGABRT:
2026.02.23 18:53:01.967492 [ 2567016 ] {3c1ed47d-62a2-4023-a2be-435d1b246812} <Fatal> : Logical error: 'std::exception. Code: 1001, type: std::out_of_range, e.what() = vector (version 25.8.16.20001.altinityantalya (altinity build)), Stack trace:
0. ./contrib/llvm-project/libcxx/include/__exception/exception.h:113: std::logic_error::logic_error(char const*) @ 0x000000005a22e123
1. std::out_of_range::out_of_range[abi:ne190107](char const*) @ 0x0000000009fa974e
2. std::__throw_out_of_range[abi:ne190107](char const*) @ 0x0000000009fa969c
3. ./contrib/llvm-project/libcxx/include/vector:1001: _GLOBAL__sub_I_StorageSystemDisks.cpp @ 0x000000002d3643d0
4. ./contrib/llvm-project/libcxx/include/vector:1458: DB::CrossJoinResult::next()::$_1::operator()(std::vector<COW<DB::IColumn>::immutable_ptr<DB::IColumn>, std::allocator<COW<DB::IColumn>::immutable_ptr<DB::IColumn>>> const&) const @ 0x000000003baf1f8e
5. ./ci/tmp/build/./src/Interpreters/HashJoin/HashJoin.cpp:989: DB::CrossJoinResult::next() @ 0x000000003baedee5
6. ./ci/tmp/build/./src/Processors/Transforms/JoiningTransform.cpp:224: DB::JoiningTransform::readExecute(DB::Chunk&) @ 0x000000004b69d058
7. ./ci/tmp/build/./src/Processors/Transforms/JoiningTransform.cpp:205: DB::JoiningTransform::transform(DB::Chunk&) @ 0x000000004b69a1b1
8. ./ci/tmp/build/./src/Processors/Transforms/JoiningTransform.cpp:136: DB::JoiningTransform::work() @ 0x000000004b698af6
9. ./ci/tmp/build/./src/Processors/Executors/ExecutionThreadContext.cpp:53: DB::ExecutionThreadContext::executeTask() @ 0x000000004ac8acac
10. ./ci/tmp/build/./src/Processors/Executors/PipelineExecutor.cpp:351: DB::PipelineExecutor::executeStepImpl(unsigned long, DB::IAcquiredSlot*, std::atomic<bool>*) @ 0x000000004ac4bb9c
11. ./ci/tmp/build/./src/Processors/Executors/PipelineExecutor.cpp:279: void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<DB::PipelineExecutor::spawnThreads(std::shared_ptr<DB::IAcquiredSlot>)::$_0, void ()>>(std::__function::__policy_storage const*) @ 0x000000004ac5087e
12. ./contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x00000000253bbc10
13. ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*>(void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*&&)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*&&)::'lambda'()::operator()() @ 0x00000000253d00b4
14. ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:149: void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<false, true>::ThreadFromGlobalPoolImpl<void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*>(void (ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool::*&&)(), ThreadPoolImpl<ThreadFromGlobalPoolImpl<false, true>>::ThreadFromThreadPool*&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x00000000253cfe5f
15. ./contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x00000000253b4227
16. ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: void* std::__thread_proxy[abi:ne190107]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x00000000253c8264
17. start_thread @ 0x000000000009caa4
18. clone3 @ 0x0000000000129c6c
'.
2026.02.23 18:53:01.979782 [ 2567016 ] {3c1ed47d-62a2-4023-a2be-435d1b246812} <Fatal> : Stack trace (when copying this message, always include the lines below):
0. ./ci/tmp/build/./src/Common/StackTrace.cpp:389: StackTrace::StackTrace() @ 0x0000000025170827
1. ./ci/tmp/build/./src/Common/Exception.cpp:56: DB::abortOnFailedAssertion(String const&) @ 0x0000000024f5f8ad
2. ./ci/tmp/build/./src/Common/Exception.cpp:581: DB::getCurrentExceptionMessageAndPattern(bool, bool, bool) @ 0x0000000024f7615a
3. ./ci/tmp/build/./src/Common/Exception.cpp:520: DB::ExecutionStatus::fromCurrentException(String const&, bool) @ 0x0000000024f7939d
4. ./ci/tmp/build/./src/Processors/Executors/PipelineExecutor.cpp:153: DB::PipelineExecutor::execute(unsigned long, bool) @ 0x000000004ac494fd
5. ./ci/tmp/build/./src/Processors/Executors/PullingAsyncPipelineExecutor.cpp:76: void std::__function::__policy_invoker<void ()>::__call_impl[abi:ne190107]<std::__function::__default_alloc_func<ThreadFromGlobalPoolImpl<true, true>::ThreadFromGlobalPoolImpl<DB::PullingAsyncPipelineExecutor::pull(DB::Chunk&, unsigned long)::$_0>(DB::PullingAsyncPipelineExecutor::pull(DB::Chunk&, unsigned long)::$_0&&)::'lambda'(), void ()>>(std::__function::__policy_storage const*) @ 0x000000004aca76e6
6. ./contrib/llvm-project/libcxx/include/__functional/function.h:716: ? @ 0x00000000253b4227
7. ./contrib/llvm-project/libcxx/include/__type_traits/invoke.h:117: void* std::__thread_proxy[abi:ne190107]<std::tuple<std::unique_ptr<std::__thread_struct, std::default_delete<std::__thread_struct>>, void (ThreadPoolImpl<std::thread>::ThreadFromThreadPool::*)(), ThreadPoolImpl<std::thread>::ThreadFromThreadPool*>>(void*) @ 0x00000000253c8264
8. start_thread @ 0x000000000009caa4
9. clone3 @ 0x0000000000129c6c
Changed settings at crash time:
allow_experimental_correlated_subqueries = true
allow_experimental_analyzer = true
correlated_subqueries_substitute_equivalent_expressions = false
Upstream CI history
| Month | Upstream CI hits |
|---|---|
| 2024-07 | 1 |
| 2024-09 | 4 |
| 2024-10 | 5 |
| 2024-11 | 1 |
| 2025-03 | 36 (spike during PR ClickHouse#75942 "Join to Subquery") |
| 2025-07 | 1 |
| 2025-12 | 1 |
| 2026-02 | 2 |
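As a cross-check, the monthly counts in the table can be summed and compared with the "51+ occurrences" figure quoted in the summary. The sketch below only restates the table's numbers; the names `monthly_hits`, `total_hits`, and `spike_share_percent` are illustrative.

```cpp
#include <array>
#include <numeric>

// Monthly hit counts copied from the table above, in order:
// 2024-07, 2024-09, 2024-10, 2024-11, 2025-03, 2025-07, 2025-12, 2026-02.
constexpr std::array<int, 8> monthly_hits{1, 4, 5, 1, 36, 1, 1, 2};

// Total number of upstream CI hits across all listed months.
int total_hits()
{
    return std::accumulate(monthly_hits.begin(), monthly_hits.end(), 0);
}

// Share of all hits that fall into the 2025-03 spike (index 4), in percent.
double spike_share_percent()
{
    return 100.0 * monthly_hits[4] / total_hits();
}
```

The total comes to 51, and the 2025-03 spike alone accounts for a little over 70% of all recorded hits.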
Additional context
CI failure
- Job: AST fuzzer (amd_msan)
- Branch: antalya-25.8
- Commit: 027f87165d1035cbd110054e534db3f0e7bb09b9
- CI report: ci_run_report.html
- Logs: json.html
Not a consistent failure
Out of 67 AST fuzzer runs on the antalya-25.8 branch, only 1 hit this crash (1.5% rate). All other fuzzer build types (debug, tsan, asan, ubsan) passed on the same commit.
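The quoted rate can be reproduced with a small calculation. The second function below goes one step further and estimates the chance of seeing the crash at least once in a batch of fuzzer runs; that estimate assumes every run fails independently with the same observed rate, which is an assumption and not something measured here. Both function names are illustrative.

```cpp
#include <cmath>

// Observed failure rate, in percent, for `failures` crashes out of `runs` fuzzer runs.
double failure_rate_percent(int failures, int runs)
{
    return 100.0 * failures / runs;
}

// Probability of at least one crash in `n` further runs, ASSUMING each run
// fails independently with the same observed rate (an assumption, not a measurement).
double prob_at_least_one(int failures, int runs, int n)
{
    const double p = static_cast<double>(failures) / runs;
    return 1.0 - std::pow(1.0 - p, n);
}
```

With 1 failure in 67 runs the rate is about 1.49%, which rounds to the 1.5% stated above. Under the independence assumption, another 67 runs would hit the crash at least once with a probability of roughly 63%, so a clean fuzzer pass on a single commit says little about whether the bug is present.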
Not caused by a backport
The bug first appeared upstream in July 2024 (version 24.7.1.1620) on PR ClickHouse#66346, predating the 25.8 branch. It is inherent to the experimental correlated subqueries feature in the shared codebase.