Skip to content

eventstore: optimize event store SST file filtering#4984

Open
lidezhu wants to merge 15 commits into
masterfrom
ldz/optimize-event-store0506
Open

eventstore: optimize event store SST file filtering#4984
lidezhu wants to merge 15 commits into
masterfrom
ldz/optimize-event-store0506

Conversation

@lidezhu
Copy link
Copy Markdown
Collaborator

@lidezhu lidezhu commented May 2, 2026

What problem does this PR solve?

Issue Number: close #4939

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Questions

Will it cause performance regression or break compatibility?
Do you need to update user documentation, design documentation or monitoring documentation?

Release note

Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.

If you don't think this PR needs a release note then fill it with `None`.

Summary by CodeRabbit

Performance

  • Event store queries filtering by commit timestamp now execute more efficiently by reducing unnecessary data scanning and I/O operations.

Monitoring

  • Added new observability metrics to track event store scan optimization decisions and the volume of data processed during queries.

Review Change Stack

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 2, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 214dfb98-5d8e-43c8-b48b-5e1e724cf678

📥 Commits

Reviewing files that changed from the base of the PR and between d932144 and 311ea0f.

📒 Files selected for processing (5)
  • logservice/eventstore/event_store.go
  • logservice/eventstore/pebble.go
  • logservice/eventstore/pebble_test.go
  • logservice/eventstore/table_properties.go
  • pkg/metrics/event_store.go

📝 Walkthrough

Walkthrough

This PR implements SST-level filtering by transaction commit timestamp for TiCDC's log service event store. The changes add Pebble table property collection and filtering to reduce unnecessary SST file scans during iterator operations, with Prometheus metrics for observability.

Changes

SST Commit-TS Filtering Optimization

Layer / File(s) Summary
Table Property Schema
logservice/eventstore/table_properties.go
Defines constants for per-SST min/max commit timestamp properties and collector metadata.
Commit-TS Collector
logservice/eventstore/table_properties.go
Implements eventStoreTxnCommitTsCollector to extract and track commit-ts ranges and logical bytes from keys during SST creation.
SST Filter Predicate
logservice/eventstore/table_properties.go
Implements newEventStoreSSTFileFilter to decide SST scanning based on overlap between stored min/max commit-ts and query range bounds.
Metrics Support
logservice/eventstore/table_properties.go
Records filter decisions and affected logical bytes; safely parses uint64 property values with fallback handling.
Pebble Configuration
logservice/eventstore/pebble.go
Registers newEventStoreTxnCommitTsCollector with Pebble's table property collectors.
Iterator Integration
logservice/eventstore/event_store.go
GetIterator computes lowerTs bound and applies table filter via newEventStoreSSTFileFilter to reduce SST files scanned.
Prometheus Metrics
pkg/metrics/event_store.go
Defines EventStoreSSTFileFilterCount and EventStoreSSTFileFilterLogicalBytes counter vectors with decision label.
Test & Imports
logservice/eventstore/pebble_test.go
Adds TestEventStoreTxnCommitTsCollector covering collector lifecycle, property emission, and filter behavior with edge cases.

Sequence Diagram

sequenceDiagram
  participant Client
  participant GetIterator as event_store.GetIterator
  participant Filter as newEventStoreSSTFileFilter
  participant Pebble
  participant Collector as EventStoreTxnCommitTsCollector
  
  Client->>GetIterator: GetIterator(dataRange)
  GetIterator->>GetIterator: computeLowerTs(LastScannedTxnStartTs)
  GetIterator->>Filter: newEventStoreSSTFileFilter(lowerTs, upperTs)
  GetIterator->>Pebble: NewIter(LowerBound, UpperBound, TableFilter)
  
  Pebble->>Pebble: Evaluate TableFilter for each SST
  Pebble->>Collector: During compaction: Add(key, value)
  Collector->>Collector: extractCommitTs(key)
  Collector->>Collector: updateMinMaxRange()
  Pebble->>Collector: Finish(userProps)
  Collector->>Pebble: Write min/max/logicalBytes to props
  
  Pebble->>Filter: CanFilter(SST properties)
  Filter->>Filter: Parse min/maxTs from properties
  Filter->>Filter: Check [minTs, maxTs] overlap [lowerTs, upperTs]
  Filter->>Pebble: true (scan) or false (skip)
  
  Pebble-->>Client: Iterator over qualifying SSTs
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • pingcap/ticdc#4985: Modifies the same GetIterator function to return (EventIterator, error) and handle iterator errors — would interact with this PR's filter changes.
  • pingcap/ticdc#4953: Modifies iterator bounds and key scan semantics in GetIterator — directly related to query range filtering logic.

Suggested labels

lgtm, approved, size/XL

Suggested reviewers

  • wk989898
  • hongyunyan
  • 3AceShowHand

Poem

🐰 A filter blooms in Pebble's store,
Commit-ts bounds to skip before,
No need to scan the distant shelves,
When timestamps tell us "not ourselves,"
SST files now choose their fate! 🌟

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning The PR description lacks critical implementation details. While it links issue #4939, the 'What is changed and how it works?' section is empty, no clear explanation of the optimization approach is provided, and no release notes are included. Fill in the 'What is changed and how it works?' section with details about SST table property collection, commit timestamp filtering, and metric recording. Add a release note describing the performance improvement.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title 'eventstore: optimize event store SST file filtering' clearly and concisely summarizes the main change of adding SST file filtering optimization to the event store.
Linked Issues check ✅ Passed The code changes meet the primary objective of issue #4939 by implementing PebbleDB optimization through SST file filtering based on transaction commit timestamp ranges, reducing unnecessary file scans.
Out of Scope Changes check ✅ Passed All changes are directly related to the SST file filtering optimization objective: table property collection, filtering logic, iterator integration, metrics, and unit tests are all in scope.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch ldz/optimize-event-store0506

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.1)

Command failed


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@lidezhu lidezhu changed the title eventstore: add event store sst file filtering eventstore: optimize event store SST file filtering May 2, 2026
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request implements SST file filtering for the event store by introducing a custom Pebble table property collector to track transaction commit timestamps, optimizing read performance. Feedback suggests handling potential errors from db.NewIter to prevent panics and optimizing metrics collection within the filter closure by pre-fetching Prometheus counters to reduce overhead.

Comment thread logservice/eventstore/event_store.go Outdated
Comment on lines 928 to 935
iter, _ := db.NewIter(&pebble.IterOptions{
LowerBound: start,
UpperBound: end,
TableFilter: newEventStoreSSTFileFilter(
lowerTs,
dataRange.CommitTsEnd,
),
})
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The error returned by db.NewIter is ignored. If an error occurs (e.g., due to invalid bounds if CommitTsStart > CommitTsEnd), iter will be nil, which will cause a panic when iter.First() is called on line 939. It is recommended to handle the error and return early to avoid a potential crash.

Suggested change
iter, _ := db.NewIter(&pebble.IterOptions{
LowerBound: start,
UpperBound: end,
TableFilter: newEventStoreSSTFileFilter(
lowerTs,
dataRange.CommitTsEnd,
),
})
iter, err := db.NewIter(&pebble.IterOptions{
LowerBound: start,
UpperBound: end,
TableFilter: newEventStoreSSTFileFilter(
lowerTs,
dataRange.CommitTsEnd,
),
})
if err != nil {
log.Error("failed to create pebble iterator",
zap.Stringer("dispatcherID", dispatcherID),
zap.Error(err))
return nil
}

Comment thread logservice/eventstore/table_properties.go
Base automatically changed from ldz/optimize-event-store0429 to master May 7, 2026 00:38
@ti-chi-bot ti-chi-bot Bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 7, 2026
@ti-chi-bot ti-chi-bot Bot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels May 9, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented May 9, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign charlescheung96 for approval. For more information see the Code Review Process.
Please ensure that each of them provides their approval before proceeding.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note Denotes a PR that will be considered when it comes time to generate release notes. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Optimize performance by adjusting PebbleDB options

1 participant