@peachdawnleach
Contributor

Fixes DOC-15407

Added info from Confluence on how to manually apply dlq rows

Added section on how to manually resolve entries in the dlq
typo

github-actions bot commented Dec 2, 2025


netlify bot commented Dec 2, 2025

Deploy Preview for cockroachdb-interactivetutorials-docs canceled.

Name Link
🔨 Latest commit ae3ab5a
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-interactivetutorials-docs/deploys/693877635ef52b0008fcdc73


netlify bot commented Dec 2, 2025

Deploy Preview for cockroachdb-api-docs canceled.

Name Link
🔨 Latest commit ae3ab5a
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-api-docs/deploys/69387763fb10730009a5b6f6


netlify bot commented Dec 2, 2025

Netlify Preview

Name Link
🔨 Latest commit ae3ab5a
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-docs/deploys/69387763a81bde00089cf6e8
😎 Deploy Preview https://deploy-preview-21440--cockroachdb-docs.netlify.app

@peachdawnleach
Contributor Author

106677386757203 | 2025-04-25 25:32:28.435439+00 | {"created_at": "2025-04-25:35:00.499499", "payload": "blahblahblah=", "my_id": 207}
~~~

1. Check the value of the row and the replicated time:

nit: like in the kb, it might be clearer to the user if you state the purpose of the next 3 steps before showing them, e.g.: Check if the value of the row on source and destination are the same. If they are, the DLQ’d row can be removed:

@peachdawnleach force-pushed the 20251125-doc-15407-manually-apply-dlq-rows branch from ff5e88f to de022e2 on December 8, 2025 15:33

netlify bot commented Dec 8, 2025

Deploy Preview for cockroachdb-docs-noble-test failed.

Name Link
🔨 Latest commit de022e2
🔍 Latest deploy log https://app.netlify.com/projects/cockroachdb-docs-noble-test/deploys/6936efd0428436000877335d

…om:cockroachdb/docs into 20251125-doc-15407-manually-apply-dlq-rows
@peachdawnleach force-pushed the 20251125-doc-15407-manually-apply-dlq-rows branch from de022e2 to 4315294 on December 8, 2025 16:24
Contributor

@jhlodin left a comment

Left some feedback to help clarify the process and what objects we're talking about. Should be specific when referring to "source" and "destination" whether we're talking about a table or a cluster.

)
~~~

#### Resolve rows in the DLQ
Contributor

Suggested change
- #### Resolve rows in the DLQ
+ #### Manage rows in the DLQ

Contributor

Feels better as a verb for the header, since it's not clear from the previous section that anything is in an "unresolved" state.


#### Resolve rows in the DLQ

LDR does not pause when writes are sent to the DLQ. You must manage the DLQ manually by examining each entry in the DLQ and either manually reinserting the row or deleting the entry from the DLQ. If you have multiple DLQ entries, resolve them in order from most recent to least recent.
Contributor

Suggested change
- LDR does not pause when writes are sent to the DLQ. You must manage the DLQ manually by examining each entry in the DLQ and either manually reinserting the row or deleting the entry from the DLQ. If you have multiple DLQ entries, resolve them in order from most recent to least recent.
+ LDR does not pause when writes are sent to the DLQ. You must manage the DLQ manually by examining each entry in the DLQ and either reinserting the row or deleting the entry from the DLQ. If you have multiple DLQ entries, resolve them in order from most recent to least recent.

Contributor

2x "manually"


LDR does not pause when writes are sent to the DLQ. You must manage the DLQ manually by examining each entry in the DLQ and either manually reinserting the row or deleting the entry from the DLQ. If you have multiple DLQ entries, resolve them in order from most recent to least recent.

To resolve a row in the DLQ:
Contributor

Suggested change
- To resolve a row in the DLQ:
+ To resolve an entry in the DLQ:

Contributor

Section uses "row" and "entry" interchangeably; should stick to one.


To resolve a row in the DLQ:

1. On the destination, find the primary key value in the `incoming_row` column.
Contributor

Suggested change
- 1. On the destination, find the primary key value in the `incoming_row` column.
+ 1. On the destination cluster's DLQ table, find the primary key value in the `incoming_row` column.

106677386757203 | 2025-04-25 25:32:28.435439+00 | {"created_at": "2025-04-25:35:00.499499", "payload": "blahblahblah=", "my_id": 207}
~~~

1. Determine whether the value of the row matches on the source and the destination:
Contributor

Suggested change
- 1. Determine whether the value of the row matches on the source and the destination:
+ 1. Determine whether the value of the row in the DLQ matches the values on the source and destination tables respectively:


1. Determine whether the value of the row matches on the source and the destination:

1. Check the value of the row and the replicated time:
Contributor

Clarify that this is run on the DLQ table on the destination cluster

Contributor

Nit: The parent step reads "source and destination" in that order but these are ordered destination then source. Suggest making those consistent

SELECT replicated_time FROM [SHOW LOGICAL REPLICATION JOBS];
~~~

1. On the source, check the value of the row as of the replicated time:
Contributor

source cluster

DELETE FROM crdb_replication.dlq_271_foo WHERE id = 106677386757203;
~~~

1. If the row's value on the destination is different from its value on the source, but the row's value on the source equals its value in the DLQ, update the row on the destination to have the same value as on the source:
Contributor

This is the same state as the following step, sounds like one of them is meant to say "but the row's value on the source is also different from its value in the DLQ" or similar.


{% include_cached copy-clipboard.html %}
~~~ sql
SELECT * FROM foo WHERE my_id = 207;
Contributor

We should be more clear throughout about whether these operations are taking place on the source table, the destination table, or the DLQ for the table. Can we use fully-qualified table names here with corresponding DB names in order to make it more clear? Or are the fully-qualified names going to be identical regardless of which cluster it's on?

Contributor Author

I'm not sure about this, since I would think the two clusters would be as identical as possible. @msbutler do you have an opinion?


It would be fine to use fully qualified names in the example queries, i.e. sourceDB.Foo and destDB.Foo.


(aside: you are allowed to run LDR within one cluster from one db to another db)
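
For illustration, the fully qualified form discussed in this thread might look like the sketch below. The database names `source_db` and `dest_db` and the `public` schema are hypothetical, the timestamp is a placeholder to be replaced with the job's `replicated_time`, and the sketch assumes the DLQ table sits under the `crdb_replication` schema of the destination database.

~~~ sql
-- Destination cluster: look up the DLQ entry (dest_db is a hypothetical name).
SELECT id, incoming_row
FROM dest_db.crdb_replication.dlq_271_foo
WHERE id = 106677386757203;

-- Destination cluster: current value of the row in the destination table.
SELECT * FROM dest_db.public.foo WHERE my_id = 207;

-- Source cluster: value of the row as of the replicated time.
-- '<replicated_time>' is a placeholder, not a literal value.
SELECT * FROM source_db.public.foo AS OF SYSTEM TIME '<replicated_time>' WHERE my_id = 207;
~~~

Qualifying each table with its database makes it explicit which of the three objects (source table, destination table, DLQ table) a query reads, which also covers the case where LDR runs between two databases in one cluster.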

Contributor

Same comments as above

@peachdawnleach
Contributor Author

@msbutler working on clarifying some things in here based on docs review, and wondering: in a DLQ entry, is "payload" the value of the row that it's trying to insert? And is that formatted like "column=value"? Or just "value"? Or some other way? Thanks!


msbutler commented Dec 9, 2025

@peachdawnleach ah, good question. In this toy example table, incoming_row contains the value for the pk identified by the column my_id, as well as the columns created_at and payload. I updated my kb runbook with this info. So, incoming_row contains the values of the columns of the DLQ’d row.
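
As a sketch of what that means in practice, the individual column values could be read out of `incoming_row` with JSON operators. This assumes the column is stored as JSONB, which is an assumption to verify against the DLQ table's schema; the table name and `id` are taken from the examples in this PR.

~~~ sql
-- Run on the destination cluster. Assumes incoming_row is a JSONB column.
SELECT
    incoming_row->>'my_id'      AS my_id,
    incoming_row->>'created_at' AS created_at,
    incoming_row->>'payload'    AS payload
FROM crdb_replication.dlq_271_foo
WHERE id = 106677386757203;
~~~

Reading values by key rather than by position means the order of keys inside `incoming_row` does not matter when building a later statement with an explicit column list.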

@peachdawnleach
Contributor Author

@msbutler Awesome, that makes sense. So in that case, why is it that when performing the UPSERT in a later step, the values are in a different order than they are in the DLQ row? Is it just that the pk always comes first?

@peachdawnleach
Contributor Author

@msbutler Also, this PR could generally use a re-review when you have a chance - a lot ended up changing here

@peachdawnleach requested review from msbutler and removed request for alicia-l2 on December 9, 2025 20:15