query: enforce result caps as server config instead of rewriting SQL#125
Open
BorisTyshkevich wants to merge 1 commit into
Open
query: enforce result caps as server config instead of rewriting SQL#125BorisTyshkevich wants to merge 1 commit into
BorisTyshkevich wants to merge 1 commit into
Conversation
…124) execute_query used to append " LIMIT N" when callers passed the limit argument. That broke DESCRIBE/SHOW/EXISTS (rewrites them to invalid SQL, since IsSelectQuery returns true for those) and solved the wrong problem — the real concern is DoS and model context exhaustion, which is an operator concern, not a per-call argument. Drop the limit argument. Add two server-config keys: clickhouse.max_result_rows (default 500, <0 disables) clickhouse.max_result_bytes (default 50000, <0 disables) The deprecated clickhouse.limit is kept as a silent alias for max_result_rows with a one-time startup warning. Caps apply to SELECT-like queries only (execute_query, dynamic read tools, OpenAPI variants) — write_query and dynamic write tools are unaffected. They are enforced in two layers: ClickHouse session settings pushed per-query (max_result_rows = configured+1, max_result_bytes, result_overflow_mode='break') so the engine stops early, plus a hard cap in executeSelect's row loop that guarantees an exact row count, bounds MCP-side memory, and works even when the CH user profile rejects settings changes. Discovery clients keep calling ExecuteQuery unchanged so they are not capped. When a cap fires, QueryResult.Truncated carries reason/limit/returned counts. MCP tool responses append a second TextContent block with human-readable guidance — "narrow your WHERE / aggregate / avoid SELECT *", not "use a smaller LIMIT", since a model that asked for 5000 rows and got 500 wants more, not less. OpenAPI responses set X-MCP-Truncated alongside the same body field. Closes #124 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #124.
Summary
limitargument fromexecute_queryand the OpenAPI?limit=query parameter — the rewrite-the-SQL approach brokeDESCRIBE/SHOW/EXISTSand solved the wrong problem.clickhouse.max_result_rows(default 500) andclickhouse.max_result_bytes(default 50000).0= use default, negative = disable (defer to ClickHouse user profile).clickhouse.limitas a silent alias formax_result_rows; log a one-time deprecation warning at startup when it's set.execute_query(MCP + OpenAPI) and read-mode dynamic tools (MCP + OpenAPI).write_queryand write-mode dynamic tools are unaffected. Discovery clients (getDiscoveryClient) keep calling plainExecuteQuery, so internal introspection queries stay uncapped.max_result_rows = configured+1,max_result_bytes,result_overflow_mode='break') pushed per-query viaclickhouse.Context(WithSettings)so the engine stops early; plus a hard cap insideexecuteSelect'srows.Next()loop for exact row counts, MCP-side OOM protection, and resilience toreadonly=1CH user profiles.QueryResult.TruncatedJSON field withreason/limit/returned_rows/returned_bytes_approx.TextContentblock recommending narrowing the query (tighterWHERE, aggregation, key-range pagination, narrowerSELECTlist — not "smaller LIMIT", since a model that wanted 5000 rows and got 500 wants more).X-MCP-Truncated: max_result_rows(ormax_result_bytes).docs/tools.mdwith the new keys, semantics, and signaling contract.Test plan
go test -count=1 ./...passes against embedded ClickHouse — all packages green.DESCRIBE/DESC/SHOW CREATE TABLE/EXISTS TABLE/EXPLAIN SELECTround-trip with caps configured (previously rejected as syntax errors).max_result_rows: 1, unboundedSELECTreturns 1 row,Truncated.Reason = "max_result_rows", second content block present.max_result_bytes: 64againstSELECT number, repeat('x', 100) ...yieldsTruncated.Reason = "max_result_bytes", notice mentionsSELECT *.-1): unboundedSELECT ... LIMIT 1000returns 1000 rows, no truncation, no notice block.write_query(INSERT/CREATE) unaffected — no settings injected, no truncation flag.X-MCP-Truncatedheader set + bodytruncatedfield populated.scripts/build-mcp-image.sh, deploy tootel, runtest-mcp-connectorsubagent (claude.ai only, cleanup always). Pass criteria:whoamireturns email;execute_query "SELECT number FROM system.numbers LIMIT 10000"returns the default 500 rows + truncation notice;execute_query "DESCRIBE system.tables"succeeds (regression).🤖 Generated with Claude Code