try to fix clean up deadlock in logpoller #463

Merged
ilija42 merged 2 commits into develop from fix/logpoller-close-replay-deadlock
May 7, 2026

Tofel commented May 7, 2026

Made with AI and explained as:

  That said, the fix is mechanically sufficient: recvReplayComplete now also selects on lp.stopCh, so when Close() runs close(lp.stopCh) and then wg.Wait(), every recvReplayComplete goroutine, whether spawned before or after close(stopCh), observes the closed channel and exits. The previous "send must arrive before goroutine starts receiving" timing dependence is gone.

Hoping to solve deadlocks like this one, which are endemic in the chainlink repo:

    FAIL    github.com/smartcontractkit/chainlink/v2/core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/logprovider    900.816s

    === Failed
    === FAIL: core/services/llo/telem TestSample (0.00s)
        sampling_test.go:140:
                Error Trace:    /home/runner/_work/chainlink/chainlink/core/services/llo/telem/sampling_test.go:140
                Error:          Should be false
                Test:           TestSample

    === FAIL: core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/logprovider  (0.00s)
    panic: test timed out after 15m0s
        running tests:
            TestIntegration_LogEventProvider_UpdateConfig (14m53s)

or

     === FAIL: core/services/ocr2/plugins/ocr2keeper/evmregistry/v21/logprovider  (0.00s)
     panic: test timed out after 15m0s
        running tests:
                TestIntegration_LogEventProvider_Backfill (14m52s)

     goroutine 93719 [running]:
     testing.(*M).startAlarm.func1()
        /opt/hostedtoolcache/go/1.26.2/x64/src/testing/testing.go:2802 +0x34b
     created by time.goFunc
        /opt/hostedtoolcache/go/1.26.2/x64/src/time/sleep.go:215 +0x2d

Chainlink PR: smartcontractkit/chainlink#22337
So far, 10 successful re-runs.

Copilot AI review requested due to automatic review settings May 7, 2026 09:55
Tofel requested a review from a team as a code owner May 7, 2026 09:55

github-actions Bot commented May 7, 2026

👋 Tofel, thanks for creating this pull request!

To help reviewers, please consider creating future PRs as drafts first. This allows you to self-review and make any final changes before notifying the team.

Once you're ready, you can mark it as "Ready for review" to request feedback. Thanks!

github-actions Bot commented May 7, 2026

📊 API Diff Results

No changes detected for module github.com/smartcontractkit/chainlink-evm

Copilot AI left a comment

Pull request overview

This PR updates the LogPoller replay-drain goroutine to avoid shutdown deadlocks by allowing it to exit when the service is stopped, even if no replay completion value is ever sent.

Changes:

  • Update recvReplayComplete to select on both lp.replayComplete and lp.stopCh so Close() cannot block indefinitely on wg.Wait().
  • Add inline documentation explaining the shutdown timing/race that could previously lead to a stuck wait.

Comment thread pkg/logpoller/log_poller.go

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

ilija42 merged commit 46e6a39 into develop May 7, 2026
38 checks passed
ilija42 deleted the fix/logpoller-close-replay-deadlock branch May 7, 2026 17:12