4 changes: 2 additions & 2 deletions docs/guides/best-practices/sparse-primary-indexes.md
@@ -390,7 +390,7 @@

On a self-managed ClickHouse cluster we can use the <a href="https://clickhouse.com/docs/sql-reference/table-functions/file/" target="_blank">file table function</a> for inspecting the content of the primary index of our example table.

-For that we first need to copy the primary index file into the <a href="https://clickhouse.com/docs/operations/server-configuration-parameters/settings/#server_configuration_parameters-user_files_path" target="_blank">user_files_path</a> of a node from the running cluster:
+For that we first need to copy the primary index file into the <a href="https://clickhouse.com/docs/operations/server-configuration-parameters/settings/#user_files_path" target="_blank">user_files_path</a> of a node from the running cluster:
<ul>
<li>Step 1: Get part-path that contains the primary index file</li>
`
@@ -504,7 +504,7 @@

The output for the ClickHouse client is now showing that instead of doing a full table scan, only 8.19 thousand rows were streamed into ClickHouse.

-If <a href="https://clickhouse.com/docs/operations/server-configuration-parameters/settings/#server_configuration_parameters-logger" target="_blank">trace logging</a> is enabled then the ClickHouse server log file shows that ClickHouse was running a <a href="https://github.com/ClickHouse/ClickHouse/blob/22.3/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp#L1452" target="_blank">binary search</a> over the 1083 UserID index marks, in order to identify granules that possibly can contain rows with a UserID column value of `749927693`. This requires 19 steps with an average time complexity of `O(log2 n)`:
+If <a href="https://clickhouse.com/docs/operations/server-configuration-parameters/settings/#logger" target="_blank">trace logging</a> is enabled then the ClickHouse server log file shows that ClickHouse was running a <a href="https://github.com/ClickHouse/ClickHouse/blob/22.3/src/Storages/MergeTree/MergeTreeDataSelectExecutor.cpp#L1452" target="_blank">binary search</a> over the 1083 UserID index marks, in order to identify granules that possibly can contain rows with a UserID column value of `749927693`. This requires 19 steps with an average time complexity of `O(log2 n)`:

Check notice on line 507 in docs/guides/best-practices/sparse-primary-indexes.md (GitHub Actions / vale, ClickHouse.Wordy): Remove 'in order' and leave 'to'.
```response
...Executor): Key condition: (column 0 in [749927693, 749927693])
# highlight-next-line
2 changes: 1 addition & 1 deletion docs/guides/sre/tls/configuring-tls.md
@@ -304,7 +304,7 @@ The settings below are configured in the ClickHouse server `config.xml`
</openSSL>
```

-For more information, visit https://clickhouse.com/docs/operations/server-configuration-parameters/settings/#server_configuration_parameters-openssl
+For more information, visit https://clickhouse.com/docs/operations/server-configuration-parameters/settings/#openssl

7. Configure TLS for gRPC on every node:
```xml
2 changes: 1 addition & 1 deletion docs/integrations/language-clients/rust.md
@@ -170,7 +170,7 @@ insert.end().await?;

* If `end()` isn't called, the `INSERT` is aborted.
* Rows are being sent progressively as a stream to spread the network load.
-* ClickHouse inserts batches atomically only if all rows fit in the same partition and their number is less [`max_insert_block_size`](https://clickhouse.tech/docs/operations/settings/settings/#settings-max_insert_block_size).
+* ClickHouse inserts batches atomically only if all rows fit in the same partition and their number is less [`max_insert_block_size`](https://clickhouse.tech/docs/operations/settings/settings/#max_insert_block_size).

### Async insert (server-side batching) {#async-insert-server-side-batching}

10 changes: 5 additions & 5 deletions knowledgebase/Insert_select_settings_tuning.mdx
@@ -19,28 +19,28 @@ How can I solve this?
Below are some of the settings to tune to avoid this error, this is expert level tuning of ClickHouse and these values should be set only after understanding the specifications of the ClickHouse cloud service or on-prem cluster where these will be used, so do not take these values as "one size fits all".


-[max_insert_block_size](https://clickhouse.com/docs/operations/settings/settings#settings-max_insert_block_size) = `100_000_000` (default `1_048_576`)
+[max_insert_block_size](https://clickhouse.com/docs/operations/settings/settings#max_insert_block_size) = `100_000_000` (default `1_048_576`)

Increase from ~1M to 100M would allow larger blocks to form

Note: This setting only applies when the server forms the blocks. i.e. INSERT via the HTTP interface, and not for clickhouse-client


-[min_insert_block_size_rows](https://clickhouse.com/docs/operations/settings/settings#min-insert-block-size-rows) = `100_000_000` (default `1_048_576`)
+[min_insert_block_size_rows](https://clickhouse.com/docs/operations/settings/settings#min_insert_block_size_rows) = `100_000_000` (default `1_048_576`)

Increase from ~1M to 100M would allow larger blocks to form.


-[min_insert_block_size_bytes](https://clickhouse.com/docs/operations/settings/settings#min-insert-block-size-bytes) = `500_000_000` (default `268_435_456`)
+[min_insert_block_size_bytes](https://clickhouse.com/docs/operations/settings/settings#min_insert_block_size_bytes) = `500_000_000` (default `268_435_456`)

Increase from 268.44 MB to 500 MB would allow larger blocks to form.


-[parts_to_delay_insert](https://clickhouse.com/docs/operations/settings/merge-tree-settings#parts-to-delay-insert) = `500` (default `150`)
+[parts_to_delay_insert](https://clickhouse.com/docs/operations/settings/merge-tree-settings#parts_to_delay_insert) = `500` (default `150`)

Increasing this so that INSERTs are not artificially slowed down when the number of active parts in a single partition is reached.


-[parts_to_throw_insert](https://clickhouse.com/docs/operations/settings/merge-tree-settings#parts-to-throw-insert) = `1500` (default `3000`)
+[parts_to_throw_insert](https://clickhouse.com/docs/operations/settings/merge-tree-settings#parts_to_throw_insert) = `1500` (default `3000`)

Increasing this would generally affect query performance to the table, but this would be fine for data migration.
2 changes: 1 addition & 1 deletion knowledgebase/async_vs_optimize_read_in_order.mdx
@@ -18,7 +18,7 @@ import optimize_read from "@site/static/images/knowledgebase/optimize_read.png";

The new setting allow_asynchronous_read_from_io_pool_for_merge_tree allows the number of reading threads (streams) to be higher than the number of threads in the rest of the query execution pipeline.

-Normally the [max_threads](https://clickhouse.com/docs/operations/settings/settings/#settings-max_threads) setting [controls](https://clickhouse.com/company/events/query-performance-introspection) the number of parallel reading threads and parallel query processing threads:
+Normally the [max_threads](https://clickhouse.com/docs/operations/settings/settings/#max_threads) setting [controls](https://clickhouse.com/company/events/query-performance-introspection) the number of parallel reading threads and parallel query processing threads:

<Image img={sync_read} size="md" alt="Synchronous data reading diagram" />

2 changes: 1 addition & 1 deletion knowledgebase/compare_resultsets.mdx
@@ -37,7 +37,7 @@ WITH
) AS q2_resultset_hash
SELECT equals(q1_resultset_hash,q2_resultset_hash) as Q1_equals_Q2
```
-The example uses a [CTE](https://clickhouse.com/docs/sql-reference/statements/select/with) to calculate sums of the [cityHash](https://clickhouse.com/docs/sql-reference/functions/hash-functions#cityhash64) value of each row in these two queries and will return `1` if the two resultsets are identical.
+The example uses a [CTE](https://clickhouse.com/docs/sql-reference/statements/select/with) to calculate sums of the [cityHash](https://clickhouse.com/docs/sql-reference/functions/hash-functions#cityHash64) value of each row in these two queries and will return `1` if the two resultsets are identical.


Using some integers sequence data and some pretty formatting:
@@ -52,7 +52,7 @@ The response looks like:
```

:::note
-If you do not have a `system.query_log` table, then you likely do not have query logging enabled. View the details of the [`query_log` setting](https://clickhouse.com/docs/operations/server-configuration-parameters/settings#server_configuration_parameters-query-log) for details on how to enable it.
+If you do not have a `system.query_log` table, then you likely do not have query logging enabled. View the details of the [`query_log` setting](https://clickhouse.com/docs/operations/server-configuration-parameters/settings#query_log) for details on how to enable it.
:::

If you do not have a cluster, use can just query your one `system.query_log` table directly:
2 changes: 1 addition & 1 deletion knowledgebase/how-to-increase-thread-pool-size.mdx
@@ -31,4 +31,4 @@ You can also free up resources if your server has a lot of idle threads - using
<max_thread_pool_free_size>2000</max_thread_pool_free_size>
```

-Check out the [docs](https://clickhouse.com/docs/operations/server-configuration-parameters/settings#max-thread-pool-size) for more details on the settings above and other settings that affect the Global Thread pool.
+Check out the [docs](https://clickhouse.com/docs/operations/server-configuration-parameters/settings#max_thread_pool_size) for more details on the settings above and other settings that affect the Global Thread pool.
@@ -174,7 +174,7 @@ ARRAY JOIN features
```

It can be fixed by casting `multipolygon.properties.coordinates` to `Array(Array(Array(Tuple(Float64,Float64))))`.
-To do so, we can use the function [arrayMap(func,arr1,...)](https://clickhouse.com/docs/sql-reference/functions/array-functions#arraymapfunc-arr1-).
+To do so, we can use the function [arrayMap(func,arr1,...)](https://clickhouse.com/docs/sql-reference/functions/array-functions#arrayMap).

```sql
SELECT distinct