
Fix that WALBuffer waits for flush instead of file-roll#17628

Open
jt2594838 wants to merge 4 commits into master from fix_wal_wait

Conversation

@jt2594838
Contributor

Summary

  • Bug: WALNode.ReqIterator used waitForFlush (signaled on buffer sync) to wait for new WAL data, but data only becomes readable after a WAL file roll, not after a buffer flush. As a result, the iterator woke up prematurely (on flush) and then either looped, timed out, or missed data entirely.
  • Fix: Introduced a dedicated rollLogWriterCondition in WALBuffer that is signaled only when rollLogWriter creates a new WAL file. Renamed waitForFlush to waitForRollFile across IWALBuffer/WALBuffer/WALNode to reflect the corrected semantics.
  • Additional fix: In waitForNextReady(time, unit), changed timeout || !hasNext() to timeout && !hasNext() — previously a successful (non-timeout) wake-up could still throw TimeoutException if hasNext() happened to return false during a race.
  • Added tryToCollectInsertNodeAndBumpIndex.run() before updating file index so the iterator processes entries from the current file before advancing.
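
The flush-versus-roll distinction described above can be illustrated with a minimal sketch. It is not the actual WALBuffer code: the class `RollSignalSketch`, the roll counter, the `currentRollCount()` helper, and the extra `seenRollCount` parameter are all hypothetical and exist only to make the example deterministic. Only the names `rollLogWriterCondition`, `rollLogWriter`, and `waitForRollFile` come from the PR description.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for WALBuffer; only the signaling shape follows the PR description. */
class RollSignalSketch {
  private final ReentrantLock lock = new ReentrantLock();
  // Signaled on buffer sync; readers of rolled files must NOT wait on this one.
  private final Condition flushCondition = lock.newCondition();
  // Signaled only when a new WAL file is created.
  private final Condition rollLogWriterCondition = lock.newCondition();
  // Assumption: a counter lets a waiter tell a real roll from a spurious wake-up.
  private long rollCount = 0;

  long currentRollCount() {
    lock.lock();
    try {
      return rollCount;
    } finally {
      lock.unlock();
    }
  }

  /** Simulates a buffer flush: wakes flush waiters only. */
  void flush() {
    lock.lock();
    try {
      flushCondition.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /** Simulates a file roll: bumps the counter and wakes roll waiters. */
  void rollLogWriter() {
    lock.lock();
    try {
      rollCount++;
      rollLogWriterCondition.signalAll();
    } finally {
      lock.unlock();
    }
  }

  /** Returns true if a roll newer than seenRollCount happened before the timeout. */
  boolean waitForRollFile(long seenRollCount, long time, TimeUnit unit) {
    long nanos = unit.toNanos(time);
    lock.lock();
    try {
      while (rollCount == seenRollCount) {
        if (nanos <= 0L) {
          return false; // timed out without a roll
        }
        nanos = rollLogWriterCondition.awaitNanos(nanos);
      }
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for the caller
      return false;
    } finally {
      lock.unlock();
    }
  }
}
```

In this sketch a call to `flush()` never releases a thread blocked in `waitForRollFile`, because the two conditions are separate objects on the same lock; only `rollLogWriter()` does.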

Changed files

  • IWALBuffer.java: renamed interface methods waitForFlush → waitForRollFile
  • WALBuffer.java: added rollLogWriterCondition, signal it in rollLogWriter(), use it in the renamed waitForRollFile methods
  • WALNode.java: updated all call sites, fixed timeout || !hasNext() → timeout && !hasNext(), reordered collect call
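
The effect of changing the timeout check from `||` to `&&` can be seen by comparing the two predicates directly. The sketch below is illustrative only: `TimeoutCheckSketch` and its method names are hypothetical, and each method returns whether a TimeoutException would be thrown rather than throwing one.

```java
/** Hypothetical helper comparing the old and new timeout predicates from the PR description. */
final class TimeoutCheckSketch {
  private TimeoutCheckSketch() {}

  /** Old condition: throws on timeout OR whenever hasNext() is false. */
  static boolean throwsBeforeFix(boolean timeout, boolean hasNext) {
    return timeout || !hasNext;
  }

  /** New condition: throws only when the wait timed out AND there is still nothing to read. */
  static boolean throwsAfterFix(boolean timeout, boolean hasNext) {
    return timeout && !hasNext;
  }
}
```

The two predicates differ in two of the four input combinations: a non-timeout wake-up with `hasNext()` false (the race described in the summary), and a timeout where `hasNext()` is nevertheless true. In both cases the old check would throw and the new one would not.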

Test plan

  • Added WALNodeWaitForRollFileTest with 5 tests covering:
    • Timeout when no data is available
    • Flush without roll does not wake the iterator
    • Explicit roll wakes the iterator
    • Concurrent roll wakes a waiting iterator
    • Auto-roll on WAL file size threshold
    • Auto-triggered roll after timeout expiry

Comment thread .idea/icon.png Outdated
Contributor


don't remove

@sonarqubecloud

sonarqubecloud Bot commented May 9, 2026

Quality Gate failed

Failed conditions
B Maintainability Rating on New Code (required ≥ A)


@codecov

codecov Bot commented May 9, 2026

Codecov Report

❌ Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 40.24%. Comparing base (d4be5c8) to head (44d3b74).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...storageengine/dataregion/wal/buffer/WALBuffer.java | 83.33% | 1 Missing ⚠️ |
| .../db/storageengine/dataregion/wal/node/WALNode.java | 75.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@             Coverage Diff              @@
##             master   #17628      +/-   ##
============================================
+ Coverage     40.23%   40.24%   +0.01%     
  Complexity     2554     2554              
============================================
  Files          5177     5177              
  Lines        348880   348884       +4     
  Branches      44624    44624              
============================================
+ Hits         140363   140400      +37     
+ Misses       208517   208484      -33     

