6 changes: 2 additions & 4 deletions docs/integrations/data-ingestion/clickpipes/postgres/faq.md
@@ -193,7 +193,7 @@

### How should I scope my publications when setting up replication? {#how-should-i-scope-my-publications-when-setting-up-replication}

You can let ClickPipes manage your publications (requires additional permissions) or create them yourself. With ClickPipes-managed publications, we automatically handle table additions and removals as you edit the pipe. If self-managing, carefully scope your publications to include only the tables you need to replicate; including unnecessary tables slows down Postgres WAL decoding.
Ensure that all tables you plan to replicate are added to the publication before adding them to the pipe. Avoid using `FOR ALL TABLES` unless you intend to replicate every table in the database. Otherwise, Postgres will decode and send WAL changes for all tables, including those not in the pipe, which increases load on the source database and reduces replication efficiency.
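
As a minimal sketch of a scoped, self-managed publication: the publication name `clickpipes` matches the setup guides, while the table names (`public.orders`, `public.customers`, `public.invoices`) are hypothetical placeholders for the tables you plan to replicate.

```sql
-- Hypothetical table names; replace them with your own tables.
CREATE PUBLICATION clickpipes FOR TABLE public.orders, public.customers;

-- Later, add a new table to the publication before adding it to the pipe.
ALTER PUBLICATION clickpipes ADD TABLE public.invoices;

-- Confirm which tables the publication currently covers.
SELECT schemaname, tablename
FROM pg_publication_tables
WHERE pubname = 'clickpipes';
```

Listing tables explicitly keeps Postgres from decoding and sending changes for tables that aren't part of the pipe.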

Make sure every table you include in your publication has either a primary key or `REPLICA IDENTITY FULL`. If you have tables without a primary key, creating a publication for all tables will cause `DELETE` and `UPDATE` operations to fail on those tables.
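
One way to spot problem tables before they cause failures is to query the system catalogs. The sketch below assumes the publication is named `clickpipes`; it lists published tables that have neither a primary key nor `REPLICA IDENTITY FULL`.

```sql
-- Published tables with no primary key and no REPLICA IDENTITY FULL.
-- Tables that use REPLICA IDENTITY USING INDEX without a primary key
-- are also listed, so review the results before changing anything.
SELECT pt.schemaname, pt.tablename
FROM pg_publication_tables pt
JOIN pg_namespace n ON n.nspname = pt.schemaname
JOIN pg_class c ON c.relnamespace = n.oid AND c.relname = pt.tablename
WHERE pt.pubname = 'clickpipes'
  AND c.relreplident <> 'f'        -- 'f' means REPLICA IDENTITY FULL
  AND NOT EXISTS (
        SELECT 1
        FROM pg_index i
        WHERE i.indrelid = c.oid
          AND i.indisprimary       -- the table has a primary key
      );
```

An empty result means every table in the publication satisfies one of the two conditions.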

@@ -226,9 +226,7 @@
```

:::tip
If you're creating a publication manually instead of letting ClickPipes manage it, we don't recommend creating a publication `FOR ALL TABLES`. Doing so leads to more traffic from Postgres to ClickPipes (changes are sent for tables that aren't in the pipe) and reduces overall efficiency.

For manually created publications, please add any tables you want to the publication before adding them to the pipe.
Alternatively, ClickPipes can manage publications on your behalf, automatically handling table additions and removals as you modify the pipe. This requires granting the database user both the `CREATE` permission on the database and ownership of the source tables, which elevates the user beyond read-only access. We recommend managing publications manually to minimize the permission surface of the database user.

:::

:::warning
@@ -103,6 +103,14 @@ Connect to your AlloyDB instance as an admin user and execute the following comm

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

## Configure network access {#configure-network-access}

:::note
@@ -111,6 +111,14 @@ Connect to your Aurora PostgreSQL writer instance as an admin user and execute t

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

## Configure network access {#configure-network-access}

### IP-based access control {#ip-based-access-control}
@@ -80,6 +80,14 @@ Connect to your Azure Flexible Server Postgres through the admin user and run th

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

5. Set `wal_sender_timeout` to 0 for `clickpipes_user`

```sql
@@ -72,6 +72,14 @@ Connect to your Crunchy Bridge Postgres through the `postgres` user and run the

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

## Safe list ClickPipes IPs {#safe-list-clickpipes-ips}

Safelist [ClickPipes IPs](../../index.md#list-of-static-ips) by adding the Firewall Rules in Crunchy Bridge.
@@ -99,6 +99,14 @@ Connect to your Postgres instance as an admin user and execute the following com

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

## Enabling connections in pg_hba.conf to the ClickPipes User {#enabling-connections-in-pg_hbaconf-to-the-clickpipes-user}

If you're self-hosting Postgres, you need to allow connections to the ClickPipes user from the ClickPipes IP addresses by following the steps below. If you're using a managed service, you can do the same by following the provider's documentation.
@@ -90,6 +90,14 @@ Connect to your Cloud SQL Postgres through the admin user and run the below comm

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

[//]: # (TODO Add SSH Tunneling)

## Add ClickPipes IPs to Firewall {#add-clickpipes-ips-to-firewall}
@@ -65,6 +65,14 @@ Connect to your Neon instance as an admin user and execute the following command

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

## Enable logical replication {#enable-logical-replication}
In Neon, you can enable logical replication through the UI. This is necessary for ClickPipes's CDC to replicate data.
Head over to the **Settings** tab and then to the **Logical Replication** section.
@@ -92,6 +92,14 @@ Connect to your PlanetScale Postgres instance using the default `postgres.<...>`

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

## Caveats {#caveats}
1. To connect to PlanetScale Postgres, append the current branch to the username created above. For example, if the created user is named `clickpipes_user`, the user provided during ClickPipe creation must be `clickpipes_user`.`branch`, where `branch` is the "id" of the current PlanetScale Postgres [branch](https://planetscale.com/docs/postgres/branching). To find it quickly, look at the username of the `postgres` user you used to create the user earlier: the part after the period is the branch id.
2. Don't use the `PSBouncer` port (currently `6432`) for CDC pipes connecting to PlanetScale Postgres; use the normal port `5432` instead. Either port may be used for initial-load-only pipes.
@@ -111,6 +111,14 @@ Connect to your RDS Postgres instance as an admin user and execute the following

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

## Configure network access {#configure-network-access}

### IP-based access control {#ip-based-access-control}
@@ -68,6 +68,14 @@ Connect to your Supabase instance as an admin user and execute the following com

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

## Increase `max_slot_wal_keep_size` {#increase-max_slot_wal_keep_size}

:::warning
@@ -107,6 +107,14 @@ If you'd like to only perform a one-time load of your data (`Initial Load Only`)

The `clickpipes` publication will contain the set of change events generated from the specified tables, and will later be used to ingest the replication stream.

:::warning
Avoid using `FOR ALL TABLES` unless you intend to replicate every table. Including unnecessary tables increases WAL traffic from Postgres to ClickPipes and reduces overall replication efficiency.
:::

:::note
ClickPipes can automatically create and manage the publication on your behalf. However, this requires granting the ClickPipes user both table ownership and the `CREATE` permission on the database. If you prefer read-only access for the ClickPipes user, we recommend creating and managing the publication manually.
:::

After these steps, you should be able to proceed with [creating a ClickPipe](../index.md).

## Configure network access {#configure-network-access}
@@ -47,7 +47,9 @@ You can set the `REPLICA IDENTITY` to `FULL` using the following SQL command:
ALTER TABLE your_table_name REPLICA IDENTITY FULL;
```

Refer to [this blog post](https://xata.io/blog/replica-identity-full-performance) for performance considerations when setting `REPLICA IDENTITY FULL`.
:::warning
Setting `REPLICA IDENTITY FULL` resolves the data correctness issue but can significantly increase WAL volume and degrade performance on the source database, as PostgreSQL must write the entire old row for every UPDATE and DELETE. Evaluate the trade-offs carefully, especially for high-throughput tables with large TOAST columns. Refer to [this blog post](https://xata.io/blog/replica-identity-full-performance) for more details on performance considerations when setting `REPLICA IDENTITY FULL`.
:::
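
Before and after changing the setting, it can help to check what a table currently uses. A minimal sketch, reusing the placeholder `your_table_name` from the command above:

```sql
-- Show the current replica identity of a table.
SELECT c.oid::regclass AS table_name,
       CASE c.relreplident
         WHEN 'd' THEN 'DEFAULT'
         WHEN 'n' THEN 'NOTHING'
         WHEN 'f' THEN 'FULL'
         WHEN 'i' THEN 'USING INDEX'
       END AS replica_identity
FROM pg_class c
WHERE c.oid = 'your_table_name'::regclass;

-- If the extra WAL volume turns out to be too costly,
-- the setting can be reverted to the default.
ALTER TABLE your_table_name REPLICA IDENTITY DEFAULT;
```

Reverting to `DEFAULT` brings back the behavior described in the next section, so weigh it against the correctness concern that motivated `FULL` in the first place.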

## Replication behavior when REPLICA IDENTITY FULL isn't set {#replication-behavior-when-replica-identity-full-is-not-set}
