Locked splice bug fixes during channel reestablishment#4624
Conversation
In most cases, we end up sending our `splice_locked` either implicitly during reestablishment via `ChannelReestablish::my_current_funding_locked`, or explicitly after reestablishment. However, we did not consider that it's possible for the node to be notified of the splice confirmation after connecting to our peer but prior to reestablishing our channel. In such cases, we must allow the `splice_locked` to be sent, since the `my_current_funding_locked` sent in the `channel_reestablish` to the counterparty still points to the pre-splice txid. Found by the `chanmon_consistency` fuzz target.
Upon channel reestablishment, we free our holding cells to send any pending updates to our peer. If we happened to implicitly lock a pending splice during reestablishment, we want to make sure any updates we send after the fact are considering the new channel state (post-splice), even if the update was queued while the splice was still pending. Therefore, we must always handle the inferred `splice_locked` first. Found by the `chanmon_consistency` fuzz target.
|
👋 Thanks for assigning @jkczyz as a reviewer! |
|
I've thoroughly reviewed the entire PR diff, including both production code changes and all test code. Here is my assessment: No issues found. Both fixes are correct:
The |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4624 +/- ##
==========================================
- Coverage 86.59% 86.59% -0.01%
==========================================
Files 159 159
Lines 110420 110426 +6
Branches 110420 110426 +6
==========================================
+ Hits 95619 95624 +5
+ Misses 12267 12265 -2
- Partials 2534 2537 +3
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Doesn't the spec require that we have exchanged channel_reestablish before sending any other messages?
IIUC, this would let us enqueue splice_locked before receiving channel_reestablish from our peer. Maybe in practice they should have already have sent it, but we may not have processed it yet.
| // A reestablish may infer a missed `splice_locked`; apply it before freeing holding | ||
| // cells so we don't generate commitment updates against stale splice state. | ||
| let holding_cell_res = { | ||
| let per_peer_state = self.per_peer_state.read().unwrap(); | ||
| let peer_state_mutex = per_peer_state.get(counterparty_node_id).ok_or_else(|| { | ||
| MsgHandleErrInternal::unreachable_no_such_peer(counterparty_node_id, msg.channel_id) | ||
| })?; | ||
| let mut peer_state_lock = peer_state_mutex.lock().unwrap(); | ||
| self.check_free_peer_holding_cells(&mut *peer_state_lock) | ||
| }; | ||
| self.handle_holding_cell_free_result(holding_cell_res); |
There was a problem hiding this comment.
Should we refactor the guts of internal_splice_locked into a separate function so we don't take the peer state lock three times?
This PR includes fixes for two bugs/edge cases when handling a locked splice during reestablishment. These were found by the
chanmon_consistencyfuzz target.