Skip to content

tso: prevent dual writers when upgrading from PD service mode#10473

Open
YuhaoZhang00 wants to merge 4 commits intotikv:masterfrom
YuhaoZhang00:tso/prevent-dual-writers-pd-service-mode
Open

tso: prevent dual writers when upgrading from PD service mode#10473
YuhaoZhang00 wants to merge 4 commits intotikv:masterfrom
YuhaoZhang00:tso/prevent-dual-writers-pd-service-mode

Conversation

@YuhaoZhang00
Copy link
Copy Markdown
Contributor

@YuhaoZhang00 YuhaoZhang00 commented Mar 20, 2026

What problem does this PR solve?

Issue Number: ref #10329

When upgrading from PD service mode (isKeyspaceGroupEnabled=false) to API service mode with a TSO microservice, there is a window where both the PD leader and the TSO microservice could write to the same TSO timestamp key, causing dual writers and potential timestamp regression.

What is changed and how does it work?

In PD service mode, SaveTimestamp now atomically checks for an existing TSO microservice primary before writing the timestamp window. If a TSO primary is detected, the write is rejected, causing the allocator to reset and resign PD leadership. This allows a PD in API service mode to take over and proxy TSO requests, preventing dual writers during rolling upgrades.

Also remove TestMixedTSODeployment which tested seamless coexistence of PD service mode and TSO microservice — a scenario not supported by the official upgrade procedure (PD restart is required).

Key changes:

  • Add checkTSOPrimary parameter to SaveTimestamp that atomically loads the TSO primary election key (/ms/{cluster_id}/tso/00000/primary) within the same etcd transaction before writing the timestamp window. When a TSO primary is detected, the write is rejected.
  • checkTSOPrimary is only true in PD service mode (!isKeyspaceGroupEnabled). In API service mode, PD uses the existing checkTSOService stop/start mechanism instead.
  • Remove TestMixedTSODeployment - seamless coexistence of PD service mode and TSO microservice is not a supported upgrade path.

Check List

Tests

  • Unit test
  • Integration test

Code changes

None.

Side effects

None.

Related changes

None.

Release note

None.

Summary by CodeRabbit

  • Bug Fixes

    • Prevents dual timestamp writers by adding an optional primary-election check that blocks timestamp writes when a TSO microservice primary is present.
    • Logging variable naming corrected.
  • Tests

    • Added tests covering the primary-election guard (success/failure cases).
    • Removed an obsolete mixed-deployment integration test and added a new integration verifying dual-writer prevention.
  • Other

    • Behavior is now configurable to enable/disable the primary check.

In PD service mode (isKeyspaceGroupEnabled=false), SaveTimestamp now
atomically checks for an existing TSO microservice primary before
writing the timestamp window. If a TSO primary is detected, the write
is rejected, causing the allocator to reset and resign PD leadership.
This allows a PD in API service mode to take over and proxy TSO
requests, preventing dual writers during rolling upgrades.

Also remove TestMixedTSODeployment which tested seamless coexistence
of PD service mode and TSO microservice — a scenario not supported
by the official upgrade procedure (PD restart is required).

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
@ti-chi-bot ti-chi-bot Bot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/needs-triage-completed dco-signoff: yes Indicates the PR's author has signed the dco. contribution This PR is from a community contributor. needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Mar 20, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot Bot commented Mar 20, 2026

Hi @YuhaoZhang00. Thanks for your PR.

I'm waiting for a tikv member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@ti-chi-bot ti-chi-bot Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Mar 20, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 20, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7bb649b1-8805-4a1f-a66d-f4dbb16d4b12

📥 Commits

Reviewing files that changed from the base of the PR and between f00fdf2 and c67277a.

📒 Files selected for processing (1)
  • pkg/storage/endpoint/tso.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/storage/endpoint/tso.go

📝 Walkthrough

Walkthrough

Adds a checkTSOPrimary boolean propagated from Server -> Allocator -> timestampOracle -> Storage; when enabled, storage reads the TSO microservice primary election key and rejects timestamp writes if that key is set. Tests updated/added and an integration test replaced; a logging variable was renamed.

Changes

Cohort / File(s) Summary
TSO storage interface & impl
pkg/storage/endpoint/tso.go
Added checkTSOPrimary bool to TSOStorage.SaveTimestamp and (*StorageEndpoint).SaveTimestamp; when true, transactionally read TSO primary election key and reject writes if non-empty. Renamed local logging var logFildslogFields.
Allocator / oracle wiring
pkg/tso/allocator.go, pkg/tso/tso.go, pkg/tso/keyspace_group_manager.go
NewAllocator gains checkTSOPrimary param; timestampOracle adds checkTSOPrimary field and forwards it to storage via saveTimestamp. Keyspace group manager calls NewAllocator(..., false).
Server init
server/server.go
Server now passes !s.isKeyspaceGroupEnabled into tso.NewAllocator to toggle TSO-primary checking based on keyspace-group mode.
Unit tests
pkg/storage/storage_tso_test.go
Updated SaveTimestamp calls to include new boolean arg (false in existing calls). Added TestSaveTimestampCheckTSOPrimary covering behavior with/without TSO primary election key.
Integration tests
tests/integrations/tso/client_test.go
Removed TestMixedTSODeployment; added TestDualWriterPrevention that verifies PD yields TSO when a TSO microservice starts.

Sequence Diagram(s)

sequenceDiagram
    participant Server
    participant Allocator
    participant TimestampOracle
    participant StorageEndpoint
    participant Database

    Server->>Allocator: NewAllocator(..., checkTSOPrimary)
    Allocator->>TimestampOracle: initialize(checkTSOPrimary)

    Note over TimestampOracle: prepare timestamp
    TimestampOracle->>StorageEndpoint: SaveTimestamp(ts, checkTSOPrimary)

    alt checkTSOPrimary == true
        StorageEndpoint->>Database: TX read TSO primary election key
        Database-->>StorageEndpoint: key value
        alt key non-empty
            StorageEndpoint-->>TimestampOracle: return error (reject write)
        else key empty
            StorageEndpoint->>Database: TX save timestamp
            Database-->>StorageEndpoint: success
            StorageEndpoint-->>TimestampOracle: success
        end
    else checkTSOPrimary == false
        StorageEndpoint->>Database: TX save timestamp
        Database-->>StorageEndpoint: success
        StorageEndpoint-->>TimestampOracle: success
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

size/XL

Suggested reviewers

  • JmPotato
  • lhy1024
  • rleungx

Poem

🐰 I hop from server down to store,
A little flag I carry more—
If another primary holds the key,
I tuck my pen and wait politely.
Safe hops keep timestamps true and sure.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically summarizes the main change: preventing dual writers during TSO service mode upgrades, which matches the core objective of the changeset.
Description check ✅ Passed The description includes issue reference, comprehensive explanation of the problem and solution, key technical changes, test coverage details, and proper release note section, aligning well with the repository template.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@YuhaoZhang00
Copy link
Copy Markdown
Contributor Author

/cc @JmPotato PTAL

@ti-chi-bot ti-chi-bot Bot requested a review from JmPotato March 20, 2026 09:01
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot Bot commented Mar 20, 2026

@YuhaoZhang00: GitHub didn't allow me to request PR reviews from the following users: PTAL.

Note that only tikv members and repo collaborators can review this PR, and authors cannot review their own PRs.

Details

In response to this:

/cc @JmPotato PTAL

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/storage/endpoint/tso.go`:
- Around line 113-115: The error string returned in the conditional checking
tsoPrimary (the return in the if tsoPrimary != "" branch that calls
errors.Errorf) uses capitalized "PD" and a trailing punctuation; change the
message to be all lowercase with no trailing punctuation (e.g., use "pd
microservice primary exists, pd in pd service mode must yield" or a concise
lowercase variant) so it follows the repository's error-string convention.

In `@server/server.go`:
- Around line 511-514: The current approach lets the PD-service-mode etcd leader
re-acquire PD leadership after allocator yields because allocatorUpdater calls
Reset(true) (which only resigns PD leadership) while leaderLoop will re-campaign
from the current etcd leader; update the yield path so that when s.tsoAllocator
(created in s.tsoAllocator) detects an existing TSO primary and allocatorUpdater
calls Reset(true), you also either (a) prevent leaderLoop from re-campaigning
for that member by setting/clearing a "yielded" flag on the server (e.g., on s)
that leaderLoop checks before starting a new campaign, or (b) perform an etcd
leadership transfer of s.member to another peer before calling Reset(true) so
the yielding node cannot immediately win; implement the chosen approach in the
allocatorUpdater/Reset invocation and ensure leaderLoop observes the yielded
flag (or verify transfer completion) to stop immediate re-campaigns.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: dcc0a32a-3ca6-4e86-b375-6cc94420454b

📥 Commits

Reviewing files that changed from the base of the PR and between 19fed4b and 291f1dc.

📒 Files selected for processing (8)
  • pkg/storage/endpoint/tso.go
  • pkg/storage/storage_tso_test.go
  • pkg/tso/allocator.go
  • pkg/tso/keyspace_group_manager.go
  • pkg/tso/tso.go
  • pkg/utils/keypath/absolute_key_path.go
  • server/server.go
  • tests/integrations/tso/client_test.go
💤 Files with no reviewable changes (1)
  • tests/integrations/tso/client_test.go

Comment thread pkg/storage/endpoint/tso.go Outdated
Comment on lines +113 to +115
if tsoPrimary != "" {
return errors.Errorf("tso microservice primary exists, PD in PD service mode must yield")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Keep the new error string lowercase.

Line 114 introduces capitalized PD, which breaks the repo's error-string convention.

As per coding guidelines, "Error strings must be lowercase with no trailing punctuation".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/storage/endpoint/tso.go` around lines 113 - 115, The error string
returned in the conditional checking tsoPrimary (the return in the if tsoPrimary
!= "" branch that calls errors.Errorf) uses capitalized "PD" and a trailing
punctuation; change the message to be all lowercase with no trailing punctuation
(e.g., use "pd microservice primary exists, pd in pd service mode must yield" or
a concise lowercase variant) so it follows the repository's error-string
convention.

Comment thread server/server.go
Comment thread pkg/utils/keypath/absolute_key_path.go Outdated
}

// TSODefaultPrimaryPath returns the primary election path for the default keyspace group's TSO service.
func TSODefaultPrimaryPath() string {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe resue ElectionPath rather than adding a new one?

Comment thread pkg/storage/endpoint/tso.go Outdated
if err != nil {
return err
}
if tsoPrimary != "" {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Prefer using len cause it also checks the nil case.

Suggested change
if tsoPrimary != "" {
if len(tsoPrimary) == 0 {

Comment thread tests/integrations/tso/client_test.go
Reuse ElectionPath instead of adding a redundant TSODefaultPrimaryPath
function, and use len() check for string emptiness per repo convention.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
Add TestDualWriterPrevention to verify that a PD leader in PD service
mode yields its TSO allocator when a TSO microservice primary is
detected, replacing the removed TestMixedTSODeployment which asserted
the old (unsafe) dual-writer coexistence behavior.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
@YuhaoZhang00 YuhaoZhang00 requested a review from JmPotato March 29, 2026 14:26
@lhy1024
Copy link
Copy Markdown
Contributor

lhy1024 commented Apr 1, 2026

/ok-to-test

@ti-chi-bot ti-chi-bot Bot added ok-to-test Indicates a PR is ready to be tested. and removed needs-ok-to-test Indicates a PR created by contributors and need ORG member send '/ok-to-test' to start testing. labels Apr 1, 2026
@YuhaoZhang00
Copy link
Copy Markdown
Contributor Author

/retest

2 similar comments
@YuhaoZhang00
Copy link
Copy Markdown
Contributor Author

/retest

@YuhaoZhang00
Copy link
Copy Markdown
Contributor Author

/retest

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 81.48148% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.94%. Comparing base (033398b) to head (c67277a).
⚠️ Report is 29 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10473      +/-   ##
==========================================
+ Coverage   78.86%   78.94%   +0.08%     
==========================================
  Files         529      532       +3     
  Lines       71102    71871     +769     
==========================================
+ Hits        56072    56740     +668     
- Misses      11014    11106      +92     
- Partials     4016     4025       +9     
Flag Coverage Δ
unittests 78.94% <81.48%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Comment thread pkg/storage/endpoint/tso.go
groupID uint32,
ts time.Time,
leadership *election.Leadership,
checkTSOPrimary bool,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that, to pass this variable in, we've gone through many layers and modified several structures to carry additional fields.

Is there a more elegant or direct way for the storageEndpoint function to detect this variable, rather than passing it all the way through the call stack?

Copy link
Copy Markdown
Contributor Author

@YuhaoZhang00 YuhaoZhang00 Apr 3, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about the layering. I explored alternatives but think the current approach is the best trade-off here.

  1. The check must live inside SaveTimestamp's etcd transaction for atomicity guarantee. Plus, SaveTimestamp already does the same kind of thing: leadership checking
  2. The decision to enable it can only come from server.go. That's the only place that knows whether we're in PD service mode (!isKeyspaceGroupEnabled)
  3. Wiring it differently doesn't really help. For example, storing the flag on StorageEndpoint avoids threading it through the call stack, but StorageEndpoint is a generic base shared by all storage backends, and putting a TSO-specific config there is a different kind of layering issue.

One alternative I considered is to a callback hook on StorageEndpoint: Add a preSaveTimestampCheck func(txn kv.Txn) error on StorageEndpoint. server.go installs a closure that checks for the TSO primary; SaveTimestamp calls it inside the transaction if non-nil.
- Pro: endpoint/tso.go has zero knowledge of TSO primaries.
- Con: Adds indirection. Also requires either a type assertion or changing NewStorageWithEtcdBackend's return type to expose the setter, which breaks encapsulation in a different way.

Signed-off-by: Yuhao Zhang <yhzhang00@outlook.com>
@YuhaoZhang00 YuhaoZhang00 requested a review from JmPotato April 3, 2026 14:37
// writing to avoid dual writers during a rolling upgrade.
if checkTSOPrimary {
tsoPrimaryPath := keypath.ElectionPath(&keypath.MsParam{ServiceName: constant.TSOServiceName})
tsoPrimary, err := txn.Load(tsoPrimaryPath)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the performance impact of this operation?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 1 single SaveTimestamp() method call, this query adds 1 extra single-key read in the same etcd transaction that already does the leadership check and timestamp-window persistence. So the incremental cost here is just one more key lookup before commit.

SaveTimestamp() only runs roughly once per TSOSaveInterval (5s) in steady state. This is not on the GetTS hot path.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the first save time needs to check the TSO primary. Can the latter ignore this if the first save is successful?

@YuhaoZhang00 YuhaoZhang00 requested a review from liyishuai April 7, 2026 05:52
@ti-chi-bot ti-chi-bot Bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved and removed do-not-merge/needs-triage-completed labels Apr 8, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot Bot commented Apr 8, 2026

@liyishuai: adding LGTM is restricted to approvers and reviewers in OWNERS files.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@YuhaoZhang00
Copy link
Copy Markdown
Contributor Author

/hold

@ti-chi-bot ti-chi-bot Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 10, 2026
@ti-chi-bot ti-chi-bot Bot added the lgtm label Apr 10, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot Bot commented Apr 10, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bufferflies, JmPotato, liyishuai

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [JmPotato,bufferflies]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Apr 10, 2026
@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot Bot commented Apr 10, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-04-08 03:35:34.524610118 +0000 UTC m=+927339.729970174: ☑️ agreed by JmPotato.
  • 2026-04-10 08:13:04.12328338 +0000 UTC m=+1116789.328643427: ☑️ agreed by bufferflies.

@YuhaoZhang00
Copy link
Copy Markdown
Contributor Author

/retest

@ti-chi-bot
Copy link
Copy Markdown
Contributor

ti-chi-bot Bot commented Apr 10, 2026

@YuhaoZhang00: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-error-log-review c67277a link true /test pull-error-log-review

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved contribution This PR is from a community contributor. dco-signoff: yes Indicates the PR's author has signed the dco. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm ok-to-test Indicates a PR is ready to be tested. release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants