20 changes: 18 additions & 2 deletions test/functional/feature_governance_cl.py
@@ -142,16 +142,32 @@ def run_test(self):
         self.bump_mocktime(156)
         self.generate(self.nodes[0], 1, sync_fun=lambda: self.sync_blocks(self.nodes[0:5]))
 
+        self.log.info("Wait for governance module to catch up with block updates")
+        tip_height = self.nodes[0].getblockcount()
+        expected_msg = f'CGovernanceManager::UpdatedBlockTip -- nCachedBlockHeight: {tip_height}'
+
+        def governance_tip_updated(node):
+            with open(node.debug_log_path, encoding='utf-8') as dl:
+                seek_pos = node.debug_log_bytes() - 100 * 1024  # read the last 100 KiB only
+                dl.seek(seek_pos if seek_pos > 0 else 0)
+                debug_log_part = dl.read()
+            return expected_msg in debug_log_part
+
+        for node in self.nodes[0:5]:
+            self.wait_until(lambda node=node: governance_tip_updated(node))
+
         self.log.info("Bump mocktime to trigger governance cleanup")
         for delta, expected in (
-            (5 * 60, ['UpdateCachesAndClean -- Governance Objects:']),  # mark old triggers for deletion
+            (5 * 60, ['CleanAndRemoveTriggers -- Removing trigger object']),  # mark old triggers for deletion
             (10 * 60, ['UpdateCachesAndClean -- Governance Objects: 0']),  # deletion after delay
         ):
             self.mocktime += delta
-            for node in self.nodes:
+            for node in self.nodes[0:5]:
                 with node.assert_debug_log(expected_msgs=expected):
                     node.setmocktime(self.mocktime)
                     node.mockscheduler(delta)
+            self.nodes[5].setmocktime(self.mocktime)
+            self.nodes[5].mockscheduler(delta)
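For reference, governance_tip_updated reads only the tail of debug.log so the substring check stays cheap as the log grows. The same pattern can be sketched standalone roughly as follows; log_contains_tail, wait_for_log, and their parameters are hypothetical illustrations, not test-framework APIs:

import os
import time

def log_contains_tail(log_path, needle, tail_bytes=100 * 1024):
    # Seek to at most tail_bytes before EOF so large logs stay cheap to scan;
    # errors='replace' guards against decoding from inside a multi-byte character.
    with open(log_path, encoding='utf-8', errors='replace') as dl:
        dl.seek(max(os.path.getsize(log_path) - tail_bytes, 0))
        return needle in dl.read()

def wait_for_log(log_path, needle, timeout=60.0, poll=0.5):
    # Poll until needle appears in the log tail or the timeout expires.
    deadline = time.time() + timeout
    while time.time() < deadline:
        if log_contains_tail(log_path, needle):
            return
        time.sleep(poll)
    raise AssertionError(f'"{needle}" not found in tail of {log_path} within {timeout}s')

In the test itself the polling loop is what self.wait_until provides, so only the tail-reading predicate has to be written by hand.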
Comment on lines 159 to +170

@coderabbitai coderabbitai bot Dec 9, 2025


⚠️ Potential issue | 🟠 Major

5‑minute expectation on "CleanAndRemoveTriggers -- Removing trigger object" is too strict and appears flaky.

The new assertion:

(5 * 60, ['CleanAndRemoveTriggers -- Removing trigger object']),  # mark old triggers for deletion

assumes that the first cleanup pass after the additional block + 5‑minute mocktime jump will always log "Removing trigger object". The combined log in the earlier review, however, shows a valid run where at this point CleanAndRemoveTriggers logs:

  • CSuperblock::IsExpired -- nBlockHeight = 260, nExpirationBlock = 280
  • CleanAndRemoveTriggers -- NOT marked for removal

and never logs "CleanAndRemoveTriggers -- Removing trigger object" in that pass, while UpdateCachesAndClean -- Governance Objects: 1 ... is present.

Your current changes (including the new UpdatedBlockTip wait at lines 145‑158) don’t change the underlying C++ logic that decides whether a trigger is “marked for removal” or “NOT marked for removal”; they only affect scheduling/catch‑up. So the scenario from the log remains possible, and this assertion will still intermittently fail even when governance behavior is correct.

The node‑5 handling here looks good: you still run setmocktime/mockscheduler on the isolated node but exclude it from assert_debug_log, which matches the intent and the earlier learning that scheduler‑based cleanup must still run on isolated nodes.

To make the test robust while keeping your new synchronization and node‑range changes, I’d recommend relaxing the 5‑minute expectation back to a more generic cleanup log (as before), and rely on the 10‑minute "Governance Objects: 0" plus the final RPC checks to assert actual removal:

-        for delta, expected in (
-            (5 * 60, ['CleanAndRemoveTriggers -- Removing trigger object']),  # mark old triggers for deletion
-            (10 * 60, ['UpdateCachesAndClean -- Governance Objects: 0']),  # deletion after delay
-        ):
+        for delta, expected in (
+            # First run: ensure governance cleanup runs; we only care that UpdateCachesAndClean fired.
+            (5 * 60, ['UpdateCachesAndClean -- Governance Objects:']),  # mark old triggers for deletion
+            (10 * 60, ['UpdateCachesAndClean -- Governance Objects: 0']),  # deletion after delay
+        ):

This keeps the log‑based signal that cleanup ran during the first pass, avoids depending on the precise “Removing trigger object” branch, and should eliminate the intermittent failures you’re seeing.
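For context, assert_debug_log is a context manager that snapshots the current log position, runs its body, and then requires every expected message to appear in the output written after that point, so a cleanup pass that never logs the expected line fails the test outright rather than merely warning. A simplified sketch of that contract, assuming the node.debug_log_path and node.debug_log_bytes() attributes used earlier in the diff (the real framework helper does more, e.g. it polls with a timeout):

from contextlib import contextmanager

@contextmanager
def assert_debug_log_sketch(node, expected_msgs):
    # Remember where the log ends now, run the body, then require every
    # expected message to appear in the output produced after that point.
    start = node.debug_log_bytes()
    yield
    with open(node.debug_log_path, encoding='utf-8', errors='replace') as dl:
        dl.seek(start)
        new_output = dl.read()
    for msg in expected_msgs:
        if msg not in new_output:
            raise AssertionError(f'expected "{msg}" in debug log, found:\n{new_output}')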

Author
@UdjinM6 UdjinM6 Dec 9, 2025


Depending on the "Removing trigger object" log line is intentional. Having the "NOT marked for removal" line in the logs is a sign of the sync issue (the scheduler running before the block height is updated in the governance module) that 27ac9f2 is designed to solve.
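In other words, whether a trigger gets removed hinges on the block height the governance module has cached at the moment the scheduler fires. A toy model of that dependency; is_expired is a deliberate simplification of CSuperblock::IsExpired, the 260/280 heights come from the quoted log, and the caught-up height is purely illustrative:

def is_expired(cached_block_height, expiration_block):
    # Toy stand-in for CSuperblock::IsExpired: treat the trigger as expired
    # only once the governance module's cached height reaches its expiration.
    return cached_block_height >= expiration_block

# Scheduler fires before UpdatedBlockTip has been processed: the stale cached
# height makes the cleanup pass log "NOT marked for removal".
assert not is_expired(260, 280)

# After waiting for "UpdatedBlockTip -- nCachedBlockHeight: <tip>", cleanup
# sees the caught-up height and logs "Removing trigger object" instead.
assert is_expired(280, 280)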

@coderabbitai coderabbitai bot

@UdjinM6, thank you for the clarification! I now understand the intent. The synchronization step you added (waiting for UpdatedBlockTip at the current tip height) ensures that the governance module has processed the latest block before the scheduler-based cleanup runs. With that synchronization in place, the strict check for "Removing trigger object" is the correct verification that the fix works as intended. If "NOT marked for removal" were to appear despite the synchronization, that would indeed indicate a problem that the test should catch.

The approach makes sense: fix the race condition with proper synchronization, then verify the fix with a strict assertion. My concern was misplaced because I didn't account for the fact that the new governance_tip_updated wait should eliminate the timing issue.

🐰💡


✏️ Learnings added
Learnt from: UdjinM6
Repo: dashpay/dash PR: 7051
File: test/functional/feature_governance_cl.py:159-170
Timestamp: 2025-12-09T20:40:38.414Z
Learning: In feature_governance_cl.py, before running governance cleanup with mockscheduler, the test must wait for the governance module to catch up with the latest block height by checking for the "CGovernanceManager::UpdatedBlockTip -- nCachedBlockHeight" log message. This synchronization ensures that subsequent cleanup operations see the correct block height and produce deterministic results (e.g., "CleanAndRemoveTriggers -- Removing trigger object" instead of "NOT marked for removal").

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: UdjinM6
Repo: dashpay/dash PR: 6926
File: test/functional/feature_governance_cl.py:0-0
Timestamp: 2025-10-28T08:54:00.392Z
Learning: In Dash tests, the scheduler (mockscheduler) operates independently of network state. Isolated nodes should still run scheduler-based cleanup processes like governance cleanup, even if they have different state due to network isolation.

Learnt from: CR
Repo: dashpay/dash PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:41:22.457Z
Learning: Applies to src/governance/**/*.{cpp,h} : Governance implementation must support governance objects (proposals, triggers, superblock management) and on-chain voting with tallying

Learnt from: kwvg
Repo: dashpay/dash PR: 6761
File: src/chainlock/signing.cpp:247-250
Timestamp: 2025-07-29T14:32:48.369Z
Learning: In PR #6761, kwvg acknowledged a null pointer check issue in ChainLockSigner::Cleanup() method but deferred it to follow-up, consistent with the pattern of avoiding scope creep in refactoring PRs.

Learnt from: kwvg
Repo: dashpay/dash PR: 6718
File: test/functional/test_framework/test_framework.py:2102-2102
Timestamp: 2025-06-09T16:43:20.996Z
Learning: In the test framework consolidation PR (#6718), user kwvg prefers to limit functional changes to those directly related to MasternodeInfo, avoiding scope creep even for minor improvements like error handling consistency.

Learnt from: UdjinM6
Repo: dashpay/dash PR: 6786
File: ci/test/04_install.sh:99-101
Timestamp: 2025-08-01T07:46:37.840Z
Learning: In backport PRs like #6786, UdjinM6 prefers to defer non-critical fixes (such as shell command expansion issues) to separate commits/PRs to maintain focus on the primary backport objectives, consistent with the project's pattern of avoiding scope creep.


         # Confirm in RPC
         for node in self.nodes: