Conversation

@tosynthegeek
Contributor

This PR attempts to improve retry handling in the bitcoind RPC synchronization loop from #587 by replacing the existing linear backoff with a proper exponential backoff strategy. We could also add a maximum delay to prevent excessively long waits, but I am not sure whether it would really help.
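
A minimal sketch of the exponential-backoff idea being proposed (the function shape and names here are illustrative, not the exact code in the PR; a 2-second starting interval and 300-second cap are assumed based on the commit message later in the thread):

```rust
use std::time::Duration;

// Illustrative values; the real constants live in src/chain/mod.rs.
const CHAIN_POLLING_INTERVAL_SECS: u64 = 2;
const MAX_BACKOFF_DELAY_SECS: u64 = 300;

async fn sync_loop<F, E>(mut sync_once: F)
where
    F: FnMut() -> Result<(), E>,
    E: std::fmt::Debug,
{
    let mut backoff_secs = CHAIN_POLLING_INTERVAL_SECS;
    loop {
        match sync_once() {
            Ok(()) => {
                // Success: reset the backoff and poll at the regular interval.
                backoff_secs = CHAIN_POLLING_INTERVAL_SECS;
                tokio::time::sleep(Duration::from_secs(CHAIN_POLLING_INTERVAL_SECS)).await;
            },
            Err(e) => {
                eprintln!("Failed to synchronize chain listeners: {:?}", e);
                tokio::time::sleep(Duration::from_secs(backoff_secs)).await;
                // Double the delay after each consecutive failure, up to the cap.
                backoff_secs = (backoff_secs * 2).min(MAX_BACKOFF_DELAY_SECS);
            },
        }
    }
}
```

The key difference from the previous linear approach is that the delay doubles on every consecutive failure instead of growing by a fixed step.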

@ldk-reviews-bot

ldk-reviews-bot commented Jun 26, 2025

👋 Thanks for assigning @tnull as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

Collaborator

@tnull tnull left a comment

Thanks!

CI is unhappy though, please check that the code compiles before pushing.

src/chain/mod.rs Outdated
log_error!(logger, "Failed to synchronize chain listeners: {:?}", e);
tokio::time::sleep(Duration::from_secs(CHAIN_POLLING_INTERVAL_SECS))
.await;
backoff_delay_multiplier += 1;
Collaborator

I don't think we should let the backoff delay grow infinitely. Please include a sane upper bound.

Contributor Author

Got it, I'm adding a max backoff delay of 300 secs for this.
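
A hedged sketch of one way such a cap could be applied (the helper name and the 2-second base value are assumptions for illustration; only the 300-second cap comes from this thread):

```rust
use std::time::Duration;

const CHAIN_POLLING_INTERVAL_SECS: u64 = 2; // assumed starting delay
const MAX_BACKOFF_DELAY_SECS: u64 = 300; // upper bound requested in review

/// Exponential delay for the given retry attempt, clamped to the maximum.
fn backoff_delay(attempt: u32) -> Duration {
    let secs = CHAIN_POLLING_INTERVAL_SECS
        .saturating_mul(1u64 << attempt.min(16))
        .min(MAX_BACKOFF_DELAY_SECS);
    Duration::from_secs(secs)
}
```

Clamping with `min` keeps the delay from growing without bound, which addresses the reviewer's concern above.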

src/chain/mod.rs Outdated
} else {
log_error!(
logger,
"Failed to synchronize chain listeners: {:?}, e"
Collaborator

This doesn't compile. Also, as discussed elsewhere, we at least need to error out. But technically, we should keep retrying for now, as otherwise we'd need to add a panic here, since permanently being unable to connect to the chain source is an unrecoverable error.

Contributor Author

@tosynthegeek tosynthegeek Jun 28, 2025

This doesn't compile. Also, as discussed elsewhere, we at least need to error out. But technically, we should keep retrying for now, as otherwise we'd need to add a panic here, since permanently being unable to connect to the chain source is an unrecoverable error.

@tnull As you suggested, since panicking isn't ideal and we can't error out, I stuck with retrying, but with the 300-sec delay. Exponential backoff from CHAIN_POLLING_INTERVAL_SECS was only introduced for transient errors.
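
A sketch of the transient/persistent split described here (the `is_transient` classifier, the error type, and the helper function are hypothetical stand-ins, not the PR's actual API):

```rust
use std::time::Duration;

const MAX_BACKOFF_DELAY_SECS: u64 = 300;

// Hypothetical classification; the real code distinguishes error kinds
// returned by the RPC client rather than using a helper like this.
fn is_transient(_e: &std::io::Error) -> bool {
    true
}

async fn wait_before_retry(e: &std::io::Error, backoff_secs: &mut u64) {
    let delay = if is_transient(e) {
        // Transient errors: exponential backoff from CHAIN_POLLING_INTERVAL_SECS.
        let current = *backoff_secs;
        *backoff_secs = (*backoff_secs * 2).min(MAX_BACKOFF_DELAY_SECS);
        current
    } else {
        // Persistent errors: keep retrying, but always at the maximum delay.
        MAX_BACKOFF_DELAY_SECS
    };
    tokio::time::sleep(Duration::from_secs(delay)).await;
}
```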

@tosynthegeek
Contributor Author

Thanks!

CI is unhappy though, please check that the code compiles before pushing.

Yeah, just saw that, will check it out now.

Collaborator

@tnull tnull left a comment

Please also make sure to always provide rationale for the change in the commit message, and format it properly (the heading shouldn't be longer than one line). Feel free to refer to https://cbea.ms/git-commit/ on how to write good commit messages.

Previously, chain synchronization failures would retry immediately
without any delay, which could lead to tight retry loops and high
CPU usage during failures.

This change introduces exponential backoff for transient
errors, starting at 2 seconds and doubling each time up to a maximum
of 300 seconds. Persistent errors also now delay retries by the
maximum backoff duration to prevent rapid loops while maintaining
eventual recovery.

Fixes lightningdevkit#587
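
As a quick check of the behaviour the commit message describes, the delay sequence for consecutive transient failures would be 2, 4, 8, 16, 32, 64, 128, 256, 300, 300, ... seconds. A small self-contained sketch verifying that progression (illustrative only):

```rust
fn main() {
    // Start at 2s, double on each consecutive failure, cap at 300s.
    let (mut delay, cap) = (2u64, 300u64);
    let mut seq = Vec::new();
    for _ in 0..10 {
        seq.push(delay);
        delay = (delay * 2).min(cap);
    }
    assert_eq!(seq, vec![2, 4, 8, 16, 32, 64, 128, 256, 300, 300]);
}
```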
Collaborator

@tnull tnull left a comment

Thanks!

@tnull tnull merged commit 775e0db into lightningdevkit:main Jun 30, 2025
20 of 24 checks passed