Demonstrate failure handling with fallback TX broadcast#1164
Demonstrate failure handling with fallback TX broadcast#1164zealsham wants to merge 1 commit intopayjoin:masterfrom
Conversation
Coverage Report for CI Build 24482436745Warning No base build found for commit Coverage: 84.48%Details
Uncovered Changes
Coverage RegressionsRequires a base build to compare against. How to fix this → Coverage Stats
💛 - Coveralls |
9597ed0 to
b25f2a6
Compare
|
@arminsabouri i implemented error downcast method for v2 in this PR, the downside of this method is having to repeat the same block of code for the the different state of the senders session. I only did it for the "pooling proposal" phase of the sender session, i'll have to do it for the rest of the phases but just want to know your input on this before i head that direction. i'm also simutaneously working on the "session outcome" based solution with the draft PR from you . |
3d44e43 to
44354f1
Compare
59fd8f2 to
1c4175e
Compare
1c4175e to
069fa34
Compare
| self.process_pj_response(psbt)?; | ||
| Ok(()) | ||
| } | ||
| Err(e) => { |
There was a problem hiding this comment.
I'm not sure we should be falling back on any Error. We should be doing something similar to what you did below and only fallback on a FatalError -- an irrecoverable protocol deviation.
There was a problem hiding this comment.
which of the error types/variant is safe for fallback broadcast .
these two ResponseError::WellKnown() | ResponseError::Unrecognized are the ideal ones IMO. however from https://github.com/bitcoin/bips/blob/master/bip-0078.mediawiki#user-content-Receivers_well_known_errors , i don't think every variant needs fallback broadcasted
There was a problem hiding this comment.
i don't think every variant needs fallback broadcasted
Why not? You could argue unavailable doesn't require a fallback to be broadcasted.
There was a problem hiding this comment.
Still think we should this @zealsham . Lets double check this against the spec
| Err(e) => { | ||
| // Check if it's a PersistedError with an API error | ||
| if let Some(persisted_error) = e.downcast_ref::<PersistedError< | ||
| EncapsulationError, |
There was a problem hiding this comment.
Why downcast specifically to a EncapsulationError? Are there not other Fatal error types?
Regardless, if we want to go this route, we should explore the changes neccecary in the API to make this more ergonomic. and idiomatic
e.is_fatal() or e.should_fallback().
Alternatively we could explore integrating this change into the typestate it self. I.e the error statemachine scrutinizes the error type and decides what to do. Leaving the developer entirely out of it.
If we want to developer to decide how to react then we should explore including Outcome in close() #1181
I would like to talk through the options a bit more before commiting to this route. Let me know your thoughts @zealsham
cc @spacebear21
There was a problem hiding this comment.
i think the developer should ultimately decide what to do if a fatal error occurs. from the BIP
At any point, either party may choose to broadcast the fallback transaction described by the Original PSBT instead of proceeding
I believe some implementers might want the sender doing that or the receiver doing that, exposing outcome in close() handles that properly, automatically broadcasting fallback TX doesn't seem like what most devs will want.
Also i'll have to go through #1181 again , as i've lost most of the context.
| self.process_pj_response(proposal)?; | ||
| return Ok(()); | ||
| Ok(()) | ||
| } |
There was a problem hiding this comment.
Can we just have a another match here for a failure case? Instead of scrutinizing the error types above
|
@zealsham take a look at lightningdevkit/ldk-node#746. |
48b8064 to
154a49b
Compare
| } | ||
| _ => return Err(anyhow!("Unexpected sender state")), | ||
| SendSession::Closed(outcome) => match outcome { | ||
| SenderSessionOutcome::Failure => { |
There was a problem hiding this comment.
could Failure and Cancel be collapsed with different log messages?
There was a problem hiding this comment.
Thanks for catching this .
"Payjoin session was canceled, broadcasting fallback" sounds appropirate for the cancel outcome ?
arminsabouri
left a comment
There was a problem hiding this comment.
Approach ACK. This is moving in the right direction! We should think through how we respond the unknown response errors from the reciever (arguably that could be its own PR if you want).
| self.process_pj_response(psbt)?; | ||
| Ok(()) | ||
| } | ||
| Err(e) => { |
There was a problem hiding this comment.
Still think we should this @zealsham . Lets double check this against the spec
| let receiver_pubkey = pj_param.receiver_pubkey(); | ||
| let sender_state = | ||
| self.db.get_send_session_ids()?.into_iter().find_map(|session_id| { | ||
| let (sender_state, persister, fallback_tx) = |
There was a problem hiding this comment.
Do we have to pass fallback_tx around the app. IIRC with the persister + session id you can always replay the session and obtain the fallback via the session history.
There was a problem hiding this comment.
Still think we should this @zealsham . Lets double check this against the spec
Yes, This handles V1 sender flow, How much of a reward it is depends on how much V1 payjoin transaction are occuring. Since payjoin-cli is a reference example, we should demonstrate all cases even if a bulk of payjoin-TX would be v2
There was a problem hiding this comment.
I think you answered a question from a different thread. Would still appreaciate an answer on this one
There was a problem hiding this comment.
ohh, my mistake @arminsabouri . The fallback is gotten from replay_event just once. i could avoid that in the None branch but that would require me running the replay_session_event again. so as a trade off i kept the fallback in hand to avoid another redundant replay_event_ call.
There was a problem hiding this comment.
Replaying is not a computationally heavy task (there are bounded number of events emitted from the API) and if it reduces code complexity I think this is worth it. To some extent we dont even need a persister object either -- thats a seperate problem.
Something like #1197 would also help in this situtation. i.e the fallback is always available on the top most sender struct. Sender.fallback_tx -> bitcoin::Transaction . For now I think its fine to just replay the events when we need to fallback.
154a49b to
a011f7a
Compare
| let receiver_pubkey = pj_param.receiver_pubkey(); | ||
| let sender_state = | ||
| self.db.get_send_session_ids()?.into_iter().find_map(|session_id| { | ||
| let (sender_state, persister, fallback_tx) = |
There was a problem hiding this comment.
I think you answered a question from a different thread. Would still appreaciate an answer on this one
| } | ||
|
|
||
| #[tokio::test(flavor = "multi_thread", worker_threads = 4)] | ||
| async fn process_sender_session_broadcasts_fallback_on_failure() { |
There was a problem hiding this comment.
| async fn process_sender_session_broadcasts_fallback_on_failure() { | |
| async fn sender_processes_fallback_on_failure() { |
| "closed without success" | ||
| } else { | ||
| "canceled" | ||
| }; |
There was a problem hiding this comment.
This label seems unnecessary. Is this log read anywhere?
There was a problem hiding this comment.
The label isn't logged , just printed. it should be logged, the label basically tells why the fallback tx was broadcasted, it exist to seperate cancel scenario from failure scenario
There was a problem hiding this comment.
Ok in that case I would just remove "success" from the closed message.Also as a developer I wouldn't really know the difference between closed / cancled from this message. A better word may be "failed". Or event better mention why it failed. e.g display the error message.
| tokio::spawn(async move { | ||
| let mut stdout = tokio::io::stdout(); | ||
| while let Some(line) = | ||
| lines.next_line().await.expect("Failed to read line from stdout") | ||
| { | ||
| stdout | ||
| .write_all(format!("{line}\n").as_bytes()) | ||
| .await | ||
| .expect("Failed to write to stdout"); | ||
| if line.contains("Fallback transaction broadcasted") { | ||
| let _ = tx.send(true).await; | ||
| break; | ||
| } | ||
| } |
There was a problem hiding this comment.
Nit: Noticing some duplicate code. The test above has pretty much the same structure. Can we have a common method? Marked as nit. We can postpone for a follow up PR.
There was a problem hiding this comment.
okay, i'll do the DRY in followup PR
| .arg("--port") | ||
| .arg("0") | ||
| .arg("--pj-endpoint") | ||
| .arg("https://dfdkfkdj.codm") |
There was a problem hiding this comment.
The other tests set this to let pj_endpoint = "https://localhost";
There was a problem hiding this comment.
I used a random domain there, i'll set to localhost as anyone can just register that domain in theory
|
|
||
| #[cfg(feature = "v1")] | ||
| #[tokio::test(flavor = "multi_thread", worker_threads = 4)] | ||
| async fn send_receive_payjoin_v1_fallback_transaction_handling() -> Result<(), BoxError> { |
There was a problem hiding this comment.
| async fn send_receive_payjoin_v1_fallback_transaction_handling() -> Result<(), BoxError> { | |
| async fn payjoin_v1_fallback_transaction_handling() -> Result<(), BoxError> { |
| }) | ||
| .await?; | ||
|
|
||
| assert!(payjoin_sent, "Payjoin send was not detected"); |
There was a problem hiding this comment.
How does the sender decide to actually fallback ?
There was a problem hiding this comment.
so two error type triggers fallback for v1, the reciever pj-ednpoint being an unreachable host triggers the fallback on the sender part
If the http call succeeds but the receiver returns an error response or an invalid psbt ctx.process_response returns Err, and the same fallback broadcast happens.
a011f7a to
e15bbe5
Compare
As a refrence implementation , payjoin-cli needs to demonstrate proper fallback transaction handling in the case of payjon failure. From what i see , both sender and receiver can broadcast the fall back transaction , This pr starts by handling the sender side of things and would eventually progress to the receiver . this addresses payjoin#1156 This is mostly sender fallback handling. Broadcast on Sessionoutcome
e15bbe5 to
745a1f4
Compare
As a refrence implementation , payjoin-cli needs to demonstrate proper fallback transaction handling in the case of payjon failure. From what i see , both sender and receiver can broadcast the fall back transaction , This pr starts by handling the sender side of things and would eventually progress to the receiver .
This is still a work in progress
this addresses #1156 .
Pull Request Checklist
Please confirm the following before requesting review:
AI
in the body of this PR.