Merged
2 changes: 1 addition & 1 deletion docs/chdb/configuration/index.md
@@ -64,7 +64,7 @@ config.set_log_format("verbose") # More details
config.enable_debug() # Sets DEBUG level + verbose format
```

-See [Logging](logging.md) for details.
+See [Logging](../debugging/logging.md) for details.

### Cache Configuration {#cache}

6 changes: 3 additions & 3 deletions docs/cloud/features/09_AI_ML/langfuse.md
@@ -74,7 +74,7 @@

### Observability {#observability}

-[Observability](/docs/observability/overview) is essential for understanding and debugging LLM applications. Unlike traditional software, LLM applications involve complex and non-deterministic interactions that can be challenging to monitor and debug. Langfuse provides comprehensive tracing capabilities that help you understand exactly what's happening in your application.
+[Observability](https://langfuse.com/docs/observability/overview) is essential for understanding and debugging LLM applications. Unlike traditional software, LLM applications involve complex, non-deterministic interactions that can be challenging to monitor and debug. Langfuse provides comprehensive tracing capabilities that help you understand exactly what's happening in your application.

_📹 Want to learn more? [**Watch end-to-end walkthrough**](https://langfuse.com/watch-demo?tab=observability) of Langfuse Observability and how to integrate it with your application._

@@ -124,7 +124,7 @@
</Tabs>
### Prompt management {#prompt-management}

-[Prompt Management](/docs/prompt-management/overview) is critical in building effective LLM applications. Langfuse provides tools to help you manage, version, and optimize your prompts throughout the development lifecycle.
+[Prompt Management](https://langfuse.com/docs/prompt-management/overview) is critical in building effective LLM applications. Langfuse provides tools to help you manage, version, and optimize your prompts throughout the development lifecycle.

_📹 Want to learn more? [**Watch end-to-end walkthrough**](https://langfuse.com/watch-demo?tab=prompt) of Langfuse Prompt Management and how to integrate it with your application._

@@ -182,7 +182,7 @@

### Evaluation & datasets {#evaluation}

-[Evaluation](/docs/evaluation/overview) is crucial for ensuring the quality and reliability of your LLM applications. Langfuse provides flexible evaluation tools that adapt to your specific needs, whether you're testing in development or monitoring production performance.
+[Evaluation](https://langfuse.com/docs/evaluation/overview) is crucial for ensuring the quality and reliability of your LLM applications. Langfuse provides flexible evaluation tools that adapt to your specific needs, whether you're testing in development or monitoring production performance.

_📹 Want to learn more? [**Watch end-to-end walkthrough**](https://langfuse.com/watch-demo?tab=evaluation) of Langfuse Evaluation and how to use it to improve your LLM application._

1 change: 0 additions & 1 deletion docs/cloud/guides/index.md
@@ -38,7 +38,6 @@ keywords: ['cloud guides', 'documentation', 'how-to', 'cloud features', 'tutoria
| [Connect to ClickHouse](/cloud/reference/byoc/connect) | Connect to your BYOC ClickHouse services via public, private, or PrivateLink endpoints |
| [Connecting ClickHouse Cloud to Azure Blob Storage](/cloud/data-sources/secure-azure) | This article demonstrates how ClickHouse Cloud customers can access their Azure data securely |
| [Console audit log](/cloud/security/audit-logging/console-audit-log) | This page describes how you can review the cloud audit log |
-| [Customized Setup](/cloud/reference/byoc/onboarding/customization) | Deploy ClickHouse on your own cloud infrastructure with a customized setup |
| [Data encryption](/cloud/security/cmek) | Learn more about data encryption in ClickHouse Cloud |
| [Data masking in ClickHouse](/cloud/guides/data-masking) | A guide to data masking in ClickHouse |
| [Database audit log](/cloud/security/audit-logging/database-audit-log) | This page describes how you can review the database audit log |
@@ -21,7 +21,7 @@ With standard onboarding, you simply provide a dedicated AWS account/GCP project

Customers are strongly recommended to prepare a **dedicated** AWS account or GCP project for hosting the ClickHouse BYOC deployment to ensure better isolation in terms of permissions and resources. ClickHouse will deploy a dedicated set of cloud resources (VPC, Kubernetes cluster, IAM roles, S3 buckets, etc.) in your account.

-If you need a more customized setup (for example, deploying into an existing VPC), refer to the [Customized Onboarding](/cloud/reference/byoc/onboarding/customization) documentation.
+If you need a more customized setup (for example, deploying into an existing VPC), refer to the [Customized Onboarding](/cloud/reference/byoc/onboarding/customization-aws) documentation.

## Request access {#request-access}

@@ -76,4 +76,4 @@ When available, this feature will allow you to:
* Maintain full control over role permissions and trust relationships
:::

-For information about the IAM roles that ClickHouse Cloud creates by default, see the [BYOC Privilege Reference](/cloud/reference/byoc/reference/priviledge).
+For information about the IAM roles that ClickHouse Cloud creates by default, see the [BYOC Privilege Reference](/cloud/reference/byoc/reference/privilege).
@@ -16,7 +16,7 @@ If you prefer to use an existing VPC to deploy ClickHouse BYOC instead of having

### Configure your existing VPC {#configure-existing-vpc}

-1. Allocate at least 1 private subnet in a [region supported by ClickHouse BYOC](/cloud/reference/byoc/supported-regions) for the ClickHouse Kubernetes (GKE) cluster. Ensure the subnet has a minimum CIDR range of `/24` (e.g., 10.0.0.0/24) to provide sufficient IP addresses for GKE cluster nodes.
+1. Allocate at least 1 private subnet in a [region supported by ClickHouse BYOC](/cloud/reference/supported-regions) for the ClickHouse Kubernetes (GKE) cluster. Ensure the subnet has a minimum CIDR range of `/24` (e.g., 10.0.0.0/24) to provide sufficient IP addresses for GKE cluster nodes.
2. Within the private subnet, allocate at least 1 secondary IPv4 range that will be used for GKE cluster pods. The secondary range should be at least `/23` to provide sufficient IP addresses for GKE cluster pods.
3. Enable **Private Google Access** on the subnet. This allows GKE nodes to reach Google APIs and services without requiring external IP addresses.

@@ -28,7 +28,7 @@ FROM clusterAllReplicas('default',system.crash_log)

ClickHouse utilizes pre-created roles to enable system functions. This section assumes the customer is using AWS with CloudTrail and has access to the CloudTrail logs.

-If an incident may be the result of a compromised role, review activities in CloudTrail and CloudWatch related to the ClickHouse IAM roles and actions. Refer to the [CloudFormation](/cloud/reference/byoc/reference/priviledge#cloudformation-iam-roles) stack or Terraform module provided as part of setup for a list of IAM roles.
+If an incident may be the result of a compromised role, review activities in CloudTrail and CloudWatch related to the ClickHouse IAM roles and actions. Refer to the [CloudFormation](/cloud/reference/byoc/reference/privilege#cloudformation-iam-roles) stack or Terraform module provided as part of setup for a list of IAM roles.

## Unauthorized access to EKS cluster {#unauthorized-access-eks-cluster}

4 changes: 2 additions & 2 deletions docs/cloud/managed-postgres/benchmarks.md
@@ -163,7 +163,7 @@ The performance advantage comes from the fundamental architectural difference:
| **Network hops** | Zero (local device) | Every disk operation requires network round trip |
| **Performance scaling** | Scales linearly with concurrency | Limited by provisioned IOPS |

-For more details on the performance benefits of NVMe storage, see [NVMe-powered performance](/cloud/managed-postgres/overview#nvme-performance).
+For more details on the performance benefits of NVMe storage, see [NVMe-powered performance](/cloud/managed-postgres#nvme-performance).

## Cost-effectiveness {#cost-effectiveness}

@@ -184,5 +184,5 @@ The complete benchmark data, configurations, and detailed metrics are available

- [PeerDB: Comparing Postgres managed services](https://blog.peerdb.io/comparing-postgres-managed-services-aws-azure-gcp-and-supabase)
- [pgbench documentation](https://www.postgresql.org/docs/current/pgbench.html)
-- [Managed Postgres overview](/cloud/managed-postgres/overview)
+- [Managed Postgres overview](/cloud/managed-postgres)
- [Scaling your Postgres instance](/cloud/managed-postgres/scaling)
1 change: 0 additions & 1 deletion docs/cloud/managed-postgres/scaling.md
@@ -140,4 +140,3 @@ Automatic storage scaling is on the roadmap for Managed Postgres. This feature w
- [Settings and configuration](/cloud/managed-postgres/settings)
- [Read replicas](/cloud/managed-postgres/read-replicas)
- [High availability](/cloud/managed-postgres/high-availability)
-- [Performance benchmarks](/cloud/managed-postgres/benchmarks)
2 changes: 1 addition & 1 deletion docs/cloud/reference/03_billing/05_payment-thresholds.md
@@ -17,7 +17,7 @@ If you are a pay as you go customer and your amount due in a billing period for

:::tip
This default payment threshold amount can be adjusted below $10,000.
-If you wish to do so, [contact support](support@clickhouse.com).
+If you wish to do so, [contact support](mailto:support@clickhouse.com).
:::

A failed charge will result in the suspension of your services after a 14 day grace period.
2 changes: 1 addition & 1 deletion docs/dictionary/index.md
@@ -323,7 +323,7 @@ The `LAYOUT` clause controls the internal data structure for the dictionary. A n
We have specified a `LIFETIME` for the dictionary of `MIN 600 MAX 900`. LIFETIME is the update interval for the dictionary, with the values here causing a periodic reload at a random interval between 600 and 900s. This random interval is necessary in order to distribute the load on the dictionary source when updating on a large number of servers. During updates, the old version of a dictionary can still be queried, with only the initial load blocking queries. Note that setting `(LIFETIME(0))` prevents dictionaries from updating.
Dictionaries can be forcibly reloaded using the `SYSTEM RELOAD DICTIONARY` command.

-For database sources such as ClickHouse and Postgres, you can set up a query that will update the dictionaries only if they really changed (the response of the query determines this), rather than at a periodic interval. Further details can be found [here](/sql-reference/statements/create/dictionary/lifetime#refreshing-dictionary-data-using-lifetime).
+For database sources such as ClickHouse and Postgres, you can set up a query that will update the dictionaries only if they really changed (the response of the query determines this), rather than at a periodic interval. Further details can be found [here](/sql-reference/statements/create/dictionary/lifetime).

### Other dictionary types {#other-dictionary-types}

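As a side note to the `LIFETIME` discussion in the hunk above, here is a minimal, hedged sketch of the clauses involved; the dictionary name, source table, and columns are hypothetical:

```sql
-- Minimal sketch (hypothetical names): a dictionary that reloads at a
-- random interval between 600 and 900 seconds, as described above.
CREATE DICTIONARY id_value_dictionary
(
    id UInt64,
    value String
)
PRIMARY KEY id
SOURCE(CLICKHOUSE(TABLE 'source_table'))
LAYOUT(FLAT())
LIFETIME(MIN 600 MAX 900);

-- Force an immediate reload outside the LIFETIME schedule:
SYSTEM RELOAD DICTIONARY id_value_dictionary;
```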
2 changes: 1 addition & 1 deletion docs/getting-started/example-datasets/cell-towers.md
@@ -166,7 +166,7 @@ SELECT mcc, count() FROM cell_towers GROUP BY mcc ORDER BY count() DESC LIMIT 10

Based on the above query and the [MCC list](https://en.wikipedia.org/wiki/Mobile_country_code), the countries with the most cell towers are: the USA, Germany, and Russia.

-You may want to create a [Dictionary](../../sql-reference/statements/create/dictionary/index.md) in ClickHouse to decode these values.
+You may want to create a [Dictionary](/sql-reference/statements/create/dictionary) in ClickHouse to decode these values.

## Use case: incorporate geo data {#use-case}

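To illustrate the decoding suggestion in the hunk above, a hedged sketch of such a dictionary; the lookup table, dictionary name, and columns are hypothetical:

```sql
-- Hypothetical lookup dictionary mapping MCC to a country name.
-- FLAT layout expects a UInt64 key; LIFETIME(0) disables reloads
-- since the mapping is static.
CREATE DICTIONARY mcc_names
(
    mcc UInt64,
    country String
)
PRIMARY KEY mcc
SOURCE(CLICKHOUSE(TABLE 'mcc_lookup'))
LAYOUT(FLAT())
LIFETIME(0);

-- Decode MCC values at query time:
SELECT dictGet('mcc_names', 'country', toUInt64(mcc)) AS country, count()
FROM cell_towers
GROUP BY country
ORDER BY count() DESC
LIMIT 10;
```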
4 changes: 2 additions & 2 deletions docs/guides/best-practices/skipping-indexes-examples.md
@@ -130,7 +130,7 @@ EXPLAIN indexes = 1
SELECT count() FROM logs WHERE msg LIKE '%timeout%';
```

-[This guide](/use-cases/observability/schema-design#bloom-filters-for-text-search) shows practical examples and when to use token vs ngram.
+[This guide](/use-cases/observability/schema-design#text-index-for-full-text-search) shows practical examples and when to use token vs ngram.

**Parameter optimization helpers:**

@@ -171,7 +171,7 @@
SELECT count() FROM logs WHERE hasToken(lower(msg), 'exception');
```

-See observability examples and guidance on token vs ngram [here](/use-cases/observability/schema-design#bloom-filters-for-text-search).
+See observability examples and guidance on token vs ngram [here](/use-cases/observability/schema-design#text-index-for-full-text-search).

## Add indexes during CREATE TABLE (multiple examples) {#add-indexes-during-create-table-multiple-examples}

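As a companion to the token-filter guidance in the hunks above, a hedged sketch of adding such an index to the `logs` table; the index name and parameters are illustrative, not tuned recommendations:

```sql
-- Token bloom filter over lowercased message text, matching queries like
-- hasToken(lower(msg), 'exception'). Parameters (30720-byte filter,
-- 3 hash functions, seed 0) are illustrative values only.
ALTER TABLE logs
    ADD INDEX msg_tokens lower(msg) TYPE tokenbf_v1(30720, 3, 0) GRANULARITY 4;

-- Build the index for parts that existed before the ALTER:
ALTER TABLE logs MATERIALIZE INDEX msg_tokens;
```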
@@ -373,8 +373,8 @@ The connector exposes the following additional metrics on top of Flink's existin
## Advanced and recommended usage {#advanced-and-recommended-usage}

- For optimal performance, ensure your DataStream element type is **not** a Generic type - see [here for Flink's type distinction](https://nightlies.apache.org/flink/flink-docs-release-2.2/docs/dev/datastream/fault-tolerance/serialization/types_serialization/#flinks-typeinformation-class). Non-generic elements will avoid the serialization overhead incurred by Kryo and improve throughput to ClickHouse.
-- We recommend setting `maxBatchSize` to at least 1000 and ideally between 10,000 to 100,000. See [this guide on bulk inserts](https://clickhouse.com/docs/optimize/bulk-inserts) for more information.
-- To do OLTP-style deduplication or upsert to ClickHouse, refer to [this documentation page](https://clickhouse.com/docs/guides/developer/deduplication#options-for-deduplication). _Note: this is not to be confused with batch deduplication that happens on retries, detailed [below](#duplicate_batches)._
+- We recommend setting `maxBatchSize` to at least 1000 and ideally between 10,000 to 100,000. See [this guide on bulk inserts](/optimize/bulk-inserts) for more information.
+- To do OLTP-style deduplication or upsert to ClickHouse, refer to [this documentation page](/guides/developer/deduplication#options-for-deduplication). _Note: this is not to be confused with batch deduplication that happens on retries._

## Troubleshooting {#troubleshooting}

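For the deduplication pointer in the hunk above, one common ClickHouse-side approach is a `ReplacingMergeTree` target table. A hedged sketch with hypothetical names follows; see the linked deduplication guide for the full set of options:

```sql
-- Rows sharing the ORDER BY key collapse to the highest `version`
-- during background merges; table and column names are hypothetical.
CREATE TABLE events
(
    id      UInt64,
    payload String,
    version UInt64
)
ENGINE = ReplacingMergeTree(version)
ORDER BY id;

-- FINAL forces deduplicated results at query time:
SELECT * FROM events FINAL WHERE id = 42;
```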
6 changes: 3 additions & 3 deletions docs/integrations/data-ingestion/clickpipes/kafka/index.md
@@ -15,9 +15,9 @@ integration:
<!--AUTOGENERATED_START-->
| Page | Description |
|-----|-----|
-| [Creating your first Kafka ClickPipe](/integrations/clickpipes/kafka/create-your-first-kafka-clickpipe) | Step-by-step guide to creating your first Kafka ClickPipe. |
-| [Schema registries for Kafka ClickPipe](/integrations/clickpipes/kafka/schema-registries) | How to integrate for ClickPipes with a schema registry for schema management |
| [Reference](/integrations/clickpipes/kafka/reference) | Details supported formats, sources, delivery semantics, authentication and experimental features supported by Kafka ClickPipes |
-| [Best practices](/integrations/clickpipes/kafka/best-practices) | Details best practices to follow when working with Kafka ClickPipes |
+| [Schema registries for Kafka ClickPipe](/integrations/clickpipes/kafka/schema-registries) | How to integrate for ClickPipes with a schema registry for schema management |
+| [Creating your first Kafka ClickPipe](/integrations/clickpipes/kafka/create-your-first-kafka-clickpipe) | Step-by-step guide to creating your first Kafka ClickPipe. |
| [Kafka ClickPipes FAQ](/integrations/clickpipes/kafka/faq) | Frequently asked questions about ClickPipes for Kafka |
+| [Best practices](/integrations/clickpipes/kafka/best-practices) | Details best practices to follow when working with Kafka ClickPipes |
<!--AUTOGENERATED_END-->
@@ -26,7 +26,7 @@ Yes. Your source Postgres and destination ClickHouse have independent retention.

### How can I enrich data as it flows from Postgres to ClickHouse? {#data-enrichment}

-Use [materialized views](/materialized-view) on top of your CDC destination tables. Materialized views in ClickHouse act as insert triggers, so each row replicated from Postgres can be transformed, joined with lookup tables, or enriched with additional columns before being written to a final target table.
+Use [materialized views](/materialized-views) on top of your CDC destination tables. Materialized views in ClickHouse act as insert triggers, so each row replicated from Postgres can be transformed, joined with lookup tables, or enriched with additional columns before being written to a final target table.

### Can I replicate from multiple Postgres instances into one or more ClickHouse services? {#multi-region-multi-source}

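To make the enrichment flow described in the hunk above concrete, a hedged sketch; all table and column names are hypothetical:

```sql
-- Final target table for enriched rows (hypothetical schema).
CREATE TABLE orders_enriched
(
    order_id     UInt64,
    customer_id  UInt64,
    country_name String,
    amount       Decimal(18, 2)
)
ENGINE = MergeTree
ORDER BY order_id;

-- Insert trigger: fires for each batch replicated into orders_raw,
-- joining a lookup table before writing to the target.
CREATE MATERIALIZED VIEW orders_enrich_mv TO orders_enriched AS
SELECT
    o.order_id,
    o.customer_id,
    c.country_name,
    o.amount
FROM orders_raw AS o
LEFT JOIN customers AS c ON o.customer_id = c.customer_id;
```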
2 changes: 1 addition & 1 deletion docs/integrations/data-ingestion/etl-tools/dbt/guides.md
@@ -996,7 +996,7 @@

## Using seeds {#using-seeds}

-dbt provides the ability to load data from CSV files. This capability isn't suited to loading large exports of a database and is more designed for small files typically used for code tables and [dictionaries](../../../../sql-reference/dictionaries/index.md), e.g. mapping country codes to country names. For a simple example, we generate and then upload a list of genre codes using the seed functionality.
+dbt provides the ability to load data from CSV files. This capability isn't suited to loading large exports of a database and is more designed for small files typically used for code tables and [dictionaries](/dictionary), e.g. mapping country codes to country names. For a simple example, we generate and then upload a list of genre codes using the seed functionality.

1. We generate a list of genre codes from our existing dataset. From the dbt directory, use the `clickhouse-client` to create a file `seeds/genre_codes.csv`:

2 changes: 1 addition & 1 deletion docs/integrations/data-ingestion/etl-tools/dbt/index.md
@@ -31,8 +31,8 @@ dbt is compatible with ClickHouse through a [ClickHouse-supported adapter](https
|-----|-----|
| [Features and Configurations](/integrations/dbt/features-and-configurations) | Description of the features and general configurations available |
| [Materializations](/integrations/dbt/materializations) | Materializations available and their configurations |
-| [Guides](/integrations/dbt/guides) | Guides for using dbt with ClickHouse |
| [Materialization: materialized_view](/integrations/dbt/materialization-materialized-view) | Specific documentation for the materialized_view materialization |
+| [Guides](/integrations/dbt/guides) | Guides for using dbt with ClickHouse |
<!--AUTOGENERATED_END-->

## Supported features {#supported-features}
@@ -14,7 +14,7 @@

<ClickHouseSupportedBadge/>

-A `materialized_view` materialization should be a `SELECT` from an existing (source) table. Unlike PostgreSQL, a ClickHouse materialized view is not "static" (and has no corresponding REFRESH operation). Instead, it acts as an **insert trigger**, inserting new rows into a target table by applying the defined `SELECT` transformation on rows inserted into the source table. See the [ClickHouse materialized view documentation](/materialized-view) for more details on how materialized views work in ClickHouse.
+A `materialized_view` materialization should be a `SELECT` from an existing (source) table. Unlike PostgreSQL, a ClickHouse materialized view is not "static" (and has no corresponding REFRESH operation). Instead, it acts as an **insert trigger**, inserting new rows into a target table by applying the defined `SELECT` transformation on rows inserted into the source table. See the [ClickHouse materialized view documentation](/docs/materialized-views) for more details on how materialized views work in ClickHouse.

:::note
For general materialization concepts and shared configurations (engine, order_by, partition_by, etc.), see the [Materializations](/integrations/dbt/materializations) page.
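A hedged sketch of what such a model file can look like; the model name, source, engine, and keys below are hypothetical, and the shared configs are covered on the Materializations page:

```sql
-- models/daily_totals.sql (hypothetical) — a dbt-clickhouse model using
-- the materialized_view materialization described above.
{{ config(
    materialized='materialized_view',
    engine='MergeTree()',
    order_by='event_date'
) }}

SELECT
    toDate(event_time) AS event_date,
    count()            AS events
FROM {{ source('app', 'events') }}
GROUP BY event_date
```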
@@ -335,7 +335,7 @@ parameters of the model config.

The `materialized_view` materialization creates a ClickHouse [materialized view](/sql-reference/statements/create/view#materialized-view) that acts as an insert trigger, automatically transforming and inserting new rows from a source table into a target table. This is one of the most powerful materializations available in dbt-clickhouse.

-Due to its depth, this materialization has its own dedicated page. **[Go to the Materialized Views guide](/integrations/dbt/materialized-views)** for the full documentation
+Due to its depth, this materialization has its own dedicated page. **[Go to the Materialized Views guide](/integrations/dbt/materialization-materialized-view)** for the full documentation

## Materialization: dictionary (experimental) {#materialization-dictionary}
