doc/design: add design doc for incremental, occ-based read-then-write#34977

Open
aljoscha wants to merge 1 commit into MaterializeInc:main from aljoscha:design-incremental-occ-read-then-write

Conversation

@aljoscha
Contributor


@bkirwi bkirwi left a comment


Love this direction!


- If the timestamp is still valid: the write is committed at exactly that
timestamp, and the oracle is advanced past it. Any concurrent OCC loops that
were targeting the same timestamp will fail and retry.
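The commit step quoted above can be sketched as a compare-and-advance against the timestamp oracle. This is a hypothetical illustration: the `Oracle` type and `try_commit` method are made-up names, not Materialize's actual API.

```rust
// Hypothetical sketch of the OCC commit step described above. `Oracle`
// and `try_commit` are illustrative names, not Materialize's actual API.

#[derive(Debug, PartialEq)]
enum CommitResult {
    /// The write landed at exactly the chosen timestamp.
    Committed { at: u64 },
    /// The oracle already moved past the timestamp: the OCC loop retries.
    Conflict,
}

struct Oracle {
    /// Lower bound for the next valid write timestamp.
    next_write_ts: u64,
}

impl Oracle {
    /// Commit at `ts` if it is still valid, advancing the oracle past it
    /// so that concurrent loops targeting the same timestamp fail and retry.
    fn try_commit(&mut self, ts: u64) -> CommitResult {
        if ts >= self.next_write_ts {
            self.next_write_ts = ts + 1;
            CommitResult::Committed { at: ts }
        } else {
            CommitResult::Conflict
        }
    }
}

fn main() {
    let mut oracle = Oracle { next_write_ts: 10 };
    // The first committer at ts=10 wins and advances the oracle to 11.
    assert_eq!(oracle.try_commit(10), CommitResult::Committed { at: 10 });
    // A concurrent OCC loop targeting ts=10 now conflicts and must retry.
    assert_eq!(oracle.try_commit(10), CommitResult::Conflict);
}
```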
Contributor


Do we have to worry about how table commit advances the logical frontier for all tables? At first blush it seems like you might get spurious conflicts. (Where, e.g., my write to table A advanced the frontier for table B, so my write attempt to B fails even though B hasn't changed.)

In theory the txn mechanism has enough metadata in its shard to know that there are not actually any changes in B between my read time and the new frontier, so it would be safe to advance the write timestamp further. But I suppose we're going to find that out via the subscribe pretty quickly anyway...
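To make the concern concrete, here is a toy model (all names hypothetical) contrasting the single shared write frontier with the per-table metadata the txn shard could in principle consult:

```rust
// Toy model of the spurious-conflict scenario: one logical write
// frontier shared by all tables, as with txn-wal table commits.
// All names here are illustrative, not Materialize's actual API.

use std::collections::HashMap;

struct Tables {
    /// Shared frontier, advanced by a commit to *any* table.
    global_frontier: u64,
    /// Per-table timestamp of the last real change.
    last_change: HashMap<&'static str, u64>,
}

impl Tables {
    fn commit(&mut self, table: &'static str, ts: u64) {
        self.last_change.insert(table, ts);
        self.global_frontier = self.global_frontier.max(ts + 1);
    }

    /// OCC validation against the shared frontier: any commit anywhere
    /// invalidates a pending write at `ts`, even on an untouched table.
    fn write_still_valid(&self, ts: u64) -> bool {
        ts >= self.global_frontier
    }

    /// What per-table metadata would say: has `table` actually changed
    /// at or after `read_ts`?
    fn changed_since(&self, table: &str, read_ts: u64) -> bool {
        self.last_change.get(table).map_or(false, |&t| t >= read_ts)
    }
}

fn main() {
    let mut tables = Tables { global_frontier: 10, last_change: HashMap::new() };
    // A pending OCC write targets table "B" at ts=10; meanwhile a commit
    // to table "A" at ts=10 advances the shared frontier to 11.
    tables.commit("A", 10);
    // Against the shared frontier, the pending write to "B" now conflicts...
    assert!(!tables.write_still_valid(10));
    // ...even though "B" itself has not changed since the read: spurious.
    assert!(!tables.changed_since("B", 10));
}
```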

Contributor Author


Yeah, any table write will clobber your write attempt. And as it stands in the prototype, you would learn through the subscribe that nothing changed and retry. Not ideal, but also ... 🤷

This removes a significant amount of complexity and uncertainty from the
codebase.

## Alternative: distributed locking service
Contributor


An alternative I'd like to see discussed is having the OCC loop run in clusterd as a dataflow export. At first glance, that seems attractive because it avoids having to stash all the data read in environmentd (which might OOM or abort the read because its response is too large). I imagine it might not work because table writes have to be performed by envd, possibly for some txn-wal reason?

Contributor

@ggevay ggevay Feb 16, 2026


Well, I think we eventually want to make multiple envds able to submit read-then-writes concurrently, in which case whatever multiple envds can cooperate on, a clusterd should be able to do as well, right? E.g., clusterds should also be able to access the timestamp oracle.

Contributor

@ggevay ggevay left a comment


LGTM, like it!

(As discussed offline, this will also unblock the removal of the old peek sequencing code, because the current ReadThenWrite code calls the old peek sequencing from the Coordinator.)


The goal is not to make writes faster, but to not regress significantly.
Benchmarking a PoC-level implementation of the OCC approach against `main` for
`UPDATE t SET x = x + 1` shows the following:
Contributor

@ggevay ggevay Feb 16, 2026


Peeks can be faster than subscribes in various situations:

- If there is an index, then a fast-path peek can avoid building a dataflow.
- We can do monotonic TopKs and min/max.

Anyhow, if it turns out that this matters for some users, then a follow-up optimization could be to first try a peek-and-write-at-the-same-timestamp instead of a subscribe, and fall back to the subscribe-based approach only if the first write doesn't complete fast enough and fails.
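The suggested fallback could be structured roughly like this; `try_peek_write` and `subscribe_loop` are hypothetical stand-ins for the two paths, not real Materialize functions:

```rust
// Hypothetical control flow for the suggested optimization: first try an
// optimistic peek-and-write at a single timestamp, and only fall back to
// the subscribe-based OCC loop if that attempt fails.

/// Optimistic fast path: peek and write at the same timestamp. Fails if
/// the timestamp was invalidated (or, in a real system, if the peek did
/// not complete fast enough).
fn try_peek_write(ts: u64, conflicted: bool) -> Result<u64, &'static str> {
    if conflicted {
        Err("write timestamp no longer valid")
    } else {
        Ok(ts)
    }
}

/// General path: in the real design this re-reads via a subscribe and
/// retries the OCC commit until a timestamp sticks; modeled here as
/// simply succeeding at the next timestamp.
fn subscribe_loop(ts: u64) -> u64 {
    ts + 1
}

fn read_then_write(ts: u64, conflicted: bool) -> u64 {
    match try_peek_write(ts, conflicted) {
        Ok(committed_at) => committed_at,
        Err(_) => subscribe_loop(ts),
    }
}

fn main() {
    // No conflict: the peek-based fast path commits at the chosen timestamp.
    assert_eq!(read_then_write(10, false), 10);
    // Conflict: fall back to the subscribe-based loop.
    assert_eq!(read_then_write(10, true), 11);
}
```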

3. After committing at timestamp T, the oracle advances past T, so additional
writes at T would fail anyway. We fail them early to avoid unnecessary work.

### Comparison with the old approach
Contributor


Maybe somewhat of a corner case, but a slight concern is what happens when the subscribe can't keep up with input changes, and thus the operation can never complete. With the old approach, the operation would usually either complete after a while or OOM the cluster, making some noise. In contrast, the new approach can silently get into a state where it can never finish, e.g., if there is some window function or cross join that keeps rewriting a significant portion of the subscribe's output at every input change. Maybe we could try to detect if the subscribe just keeps falling behind, and show a notice to the user, or even entirely fail the operation.
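One possible shape for such a detection, entirely hypothetical: count how many consecutive OCC retries saw the subscribe's frontier still trailing the candidate write timestamp, and surface a notice (or fail the operation) past a threshold.

```rust
// Hypothetical heuristic for detecting a subscribe that keeps falling
// behind: after too many consecutive retries in which the subscribe's
// frontier still trails the candidate write timestamp, stop retrying
// silently and surface the problem to the user. Names are illustrative.

struct LagDetector {
    consecutive_behind: u32,
    max_consecutive: u32,
}

impl LagDetector {
    fn new(max_consecutive: u32) -> Self {
        LagDetector { consecutive_behind: 0, max_consecutive }
    }

    /// Record one OCC retry. Returns `false` once the loop should stop
    /// retrying (and, e.g., show a notice or fail the operation).
    fn should_retry(&mut self, subscribe_frontier: u64, write_ts: u64) -> bool {
        if subscribe_frontier < write_ts {
            self.consecutive_behind += 1;
        } else {
            self.consecutive_behind = 0;
        }
        self.consecutive_behind < self.max_consecutive
    }
}

fn main() {
    let mut detector = LagDetector::new(3);
    assert!(detector.should_retry(5, 10)); // behind once: keep going
    assert!(detector.should_retry(6, 10)); // behind twice: keep going
    assert!(!detector.should_retry(7, 10)); // third time: give up, notify
}
```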

@ggevay
Contributor

ggevay commented Feb 16, 2026

One more thought: the current ReadThenWrite code has the limitation that even the read part is not allowed to refer to sources, only to tables. (See AdapterError::InvalidTableMutationSelection.) This is because we can't put a lock on sources, so source contents might change by the time we complete the read and then choose a write timestamp. However, with the new approach, lifting this limitation might be possible? Edit: or maybe even trivial, because the read timestamp will simply be the same as the write timestamp, so the sources won't move forward between them.


4 participants