Skip to content

Fix/sortition stacks tip burn view#6881

Merged
jcnelson merged 18 commits intostacks-network:release/3.3.0.0.6from
jcnelson:fix/sortition-stacks-tip-burn-view
Mar 9, 2026
Merged

Fix/sortition stacks tip burn view#6881
jcnelson merged 18 commits intostacks-network:release/3.3.0.0.6from
jcnelson:fix/sortition-stacks-tip-burn-view

Conversation

@jcnelson
Copy link
Copy Markdown
Member

This fixes part of the chain stall from the night of 7/8 Feb 2026. This patch alters the way the sortition DB memoizes the canonical Stacks chain tip by pairing each Stacks chain tip with the sortition of its burn view instead of whatever the canonical sortition tip happened to be.

The reason for this change is because it ensures that the canonical Stacks tip can be found even if a Bitcoin fork is processed before all of the preceding tenure's Stacks blocks can be processed. For example, suppose there are four sortitions: A, B, B', and C'. B and B' both descend from A, and C' descends from B'.

Now, suppose that the Stacks blocks with burn view A -- A1 through A6 -- are processed. Once A4 is processed, the stacks_chain_tips table would have assigned sortition A to A4. That is, the canonical Stacks tip at the canonical sortition tip is A4.

But now suppose that in-between when A4 and A5 are processed, the Stacks node processes sortition B. In the old code, the stacks_chain_tips table would have assigned sortition B to A6. That is, the canonical Stacks tip as of canonical sortition B would be A6. In this patch, there would be no entry in stacks_chain_tips for B; instead, the row for sortition A would be updated to be A6 (this does not change the behavior of the node, thus far).

But now, suppose no Stacks blocks are mined in the B' tenure, and the node processes sortition C'. In the old code, when the node processed B', its stacks_chain_tips row would report A4 as the canonical tip, because that's what its parent sortition A had (the new code does not add a row at all for B'). Now, C' is the canonical sortition history, and as with B', the node would incorrectly report that A4 was the canonical Stacks tip on the canonical sortition tip C' (since that was the tip for B', the parent of C'). In the new code, no row would be added to stacks_chain_tips for C'.

The old code is what partially led to the chain stall -- despite the node having processed A1 through A6, the node would incorrectly report A4 as being the Stacks chain tip. The miner code used the buggy code paths to deduce what to build atop, but that has been fixed separately. In the new code, the code paths have been updated so that a sortition is not assigned a canonical Stacks tip unless the corresponding Stacks block reported that sortition as its burn view. This way, the code will correctly report A6 as the canonical Stacks tip for sortition tips B, B', and C'

…ip to the sortition of its burn view. Add unit test coverage to make sure that it works under sortition forks.
@codecov
Copy link
Copy Markdown

codecov Bot commented Feb 10, 2026

Codecov Report

❌ Patch coverage is 75.07599% with 82 lines in your changes missing coverage. Please review.
✅ Project coverage is 77.26%. Comparing base (26aa717) to head (22819de).
⚠️ Report is 20 commits behind head on release/3.3.0.0.6.

Files with missing lines Patch % Lines
stackslib/src/chainstate/burn/db/sortdb.rs 75.07% 81 Missing ⚠️
stackslib/src/chainstate/coordinator/mod.rs 0.00% 1 Missing ⚠️
Additional details and impacted files
@@                  Coverage Diff                   @@
##           release/3.3.0.0.6    #6881       +/-   ##
======================================================
+ Coverage              60.93%   77.26%   +16.32%     
======================================================
  Files                    412      412               
  Lines                 218665   218959      +294     
  Branches                 338      338               
======================================================
+ Hits                  133247   169179    +35932     
+ Misses                 85418    49780    -35638     
Files with missing lines Coverage Δ
stacks-node/src/nakamoto_node/miner.rs 83.46% <100.00%> (+0.71%) ⬆️
stackslib/src/chainstate/nakamoto/mod.rs 83.23% <100.00%> (+5.35%) ⬆️
stackslib/src/chainstate/stacks/db/blocks.rs 79.06% <100.00%> (+25.06%) ⬆️
stackslib/src/net/mod.rs 77.42% <ø> (+5.80%) ⬆️
stackslib/src/chainstate/coordinator/mod.rs 76.90% <0.00%> (+2.48%) ⬆️
stackslib/src/chainstate/burn/db/sortdb.rs 78.63% <75.07%> (+26.51%) ⬆️

... and 296 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 339d353...22819de. Read the comment docs.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@jcnelson
Copy link
Copy Markdown
Member Author

Putting this in draft until I can find out why all these integration tests failed (ETA tomorrow).

@jcnelson jcnelson marked this pull request as draft February 10, 2026 15:57
Comment thread stackslib/src/chainstate/burn/db/sortdb.rs
Copy link
Copy Markdown
Contributor

@aaronb-stacks aaronb-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the table needs to track the consensus hash of the stacks blocks (i.e., the consensus hash of their election block). The burn view consensus hash only needs to be used to determine which sortition_id to replace or insert an entry.

Comment thread stackslib/src/chainstate/burn/db/sortdb.rs Outdated
Comment thread stackslib/src/chainstate/burn/db/sortdb.rs
@jcnelson
Copy link
Copy Markdown
Member Author

I think the table needs to track the consensus hash of the stacks blocks (i.e., the consensus hash of their election block). The burn view consensus hash only needs to be used to determine which sortition_id to replace or insert an entry.

Yes, you're right.

@jcnelson jcnelson marked this pull request as ready for review February 11, 2026 22:04
Copy link
Copy Markdown
Contributor

@brice-stacks brice-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this generally looks good, but I believe there is still something wrong. With this fix in place, I'd expect tests::signer::v0::reorg::btc_fork_on_midtenure_accept to fail, since it is asserting on the incorrect memoized block. This test seems to be flaky now, sometimes passing and sometimes failing on the assertion at line 5081. I'm currently not sure whether it is a test problem or an implementation problem.

@jcnelson
Copy link
Copy Markdown
Member Author

I think this generally looks good, but I believe there is still something wrong. With this fix in place, I'd expect tests::signer::v0::reorg::btc_fork_on_midtenure_accept to fail, since it is asserting on the incorrect memoized block. This test seems to be flaky now, sometimes passing and sometimes failing on the assertion at line 5081. I'm currently not sure whether it is a test problem or an implementation problem.

Yes, I agree. I will try and fix this as soon as I am able.

@jcnelson
Copy link
Copy Markdown
Member Author

@brice-stacks I looked into this. The test will need some expansion. Right now, it passes because the tenure produced in the last common ancestor of the Bitcoin forks has only one block in it in the first place.

@jcnelson
Copy link
Copy Markdown
Member Author

The reason this test passes here is because it's racy -- the forked block gets processed before the Bitcoin block. I've locally modified it with a dumb sleep() to reproduce the failure (since it had thus far never failed for me). I now have it failing locally, and can work from there.

@aaronb-stacks aaronb-stacks self-assigned this Mar 2, 2026
jcnelson and others added 3 commits March 2, 2026 10:44
* add tracking global for most-recent/currently broadcasting block for test miner
@aaronb-stacks
Copy link
Copy Markdown
Contributor

aaronb-stacks commented Mar 2, 2026

Okay -- this should now be ready for re-reviews (though I'm now not a valid reviewer).

The scenario the test was trying to trigger was the following:

  1. Miner produces a mid-tenure block, does not broadcast it, and stalls.
  2. While stalled, the next bitcoin block is produced.
  3. Miner unstalls, broadcasts the stalled block.
  4. Miner broadcasts an additional block to tenure-extend over the new bitcoin block.

Then, a bitcoin fork occurs, and with the previous buggy behavior, we assert that:

The canonical stacks tip == the tip at the time of the stall.

The race condition in the test was between (1) and (2) -- if the next bitcoin block is produced before the miner produces a mid-tenure block, then what happens is as follows:

  1. While stalled, the miner produces a tenure-extend block.
  2. Miner unstalls, broadcasts the stalled tenure-extend block.

When the bitcoin fork occurs, the stalled tenure-extend block is correctly not included in the canonical stacks tip. Which means this assertion still passes:

The canonical stacks tip == the tip at the time of the stall.

I avoided this race-condition by adding a tracking global to the test miner: this variable stores the latest block broadcasted or currently being broadcasted. Then, after we apply the stall to the miner, we wait until the current block being broadcasted is a mid-tenure block. Then, we produce a bitcoin block only after we've confirmed that the miner stalled while producing the mid-tenure block.

@jcnelson jcnelson requested a review from aaronb-stacks March 2, 2026 20:09
@jcnelson jcnelson requested a review from brice-stacks March 2, 2026 20:09
@jcnelson
Copy link
Copy Markdown
Member Author

jcnelson commented Mar 2, 2026

I can't approve either

@jcnelson jcnelson requested a review from jacinta-stacks March 2, 2026 20:18
@jcnelson
Copy link
Copy Markdown
Member Author

jcnelson commented Mar 2, 2026

cc @jacinta-stacks @brice-stacks

Comment thread stackslib/src/chainstate/burn/db/sortdb.rs Outdated
jacinta-stacks
jacinta-stacks previously approved these changes Mar 3, 2026
Copy link
Copy Markdown
Contributor

@jacinta-stacks jacinta-stacks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just the one nit and should add a changelog entry.

@jcnelson
Copy link
Copy Markdown
Member Author

jcnelson commented Mar 3, 2026

cc @brice-stacks review please 🙏

@jcnelson jcnelson requested a review from jacinta-stacks March 3, 2026 19:55
jacinta-stacks
jacinta-stacks previously approved these changes Mar 3, 2026
@wileyj wileyj changed the base branch from develop to release/3.3.0.0.6 March 4, 2026 19:46
@wileyj wileyj dismissed jacinta-stacks’s stale review March 4, 2026 19:46

The base branch was changed.

@jcnelson jcnelson merged commit 3719583 into stacks-network:release/3.3.0.0.6 Mar 9, 2026
1 of 2 checks passed
@github-actions
Copy link
Copy Markdown

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions Bot locked as resolved and limited conversation to collaborators Mar 17, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants