Skip to content

Demonstrate failure handling with fallback TX broadcast#1164

Open
zealsham wants to merge 1 commit intopayjoin:masterfrom
zealsham:handle-fallback-tx
Open

Demonstrate failure handling with fallback TX broadcast#1164
zealsham wants to merge 1 commit intopayjoin:masterfrom
zealsham:handle-fallback-tx

Conversation

@zealsham
Copy link
Copy Markdown
Collaborator

@zealsham zealsham commented Oct 16, 2025

As a refrence implementation , payjoin-cli needs to demonstrate proper fallback transaction handling in the case of payjon failure. From what i see , both sender and receiver can broadcast the fall back transaction , This pr starts by handling the sender side of things and would eventually progress to the receiver .

This is still a work in progress

this addresses #1156 .

  • handle v1 sender broadcasting fallback
  • handle v2 sender broadcasting fallback
  • handle v2 receiver broadcasting fallback
  • check that transaction is broadcastable before broadasting
Pull Request Checklist

Please confirm the following before requesting review:

@coveralls
Copy link
Copy Markdown
Collaborator

coveralls commented Oct 16, 2025

Coverage Report for CI Build 24482436745

Warning

No base build found for commit 592f914 on master.
Coverage changes can't be calculated without a base build.
If a base build is processing, this comment will update automatically when it completes.

Coverage: 84.48%

Details

  • Patch coverage: 12 uncovered changes across 1 file (76 of 88 lines covered, 86.36%).

Uncovered Changes

File Changed Covered %
payjoin-cli/src/app/v2/mod.rs 88 76 86.36%

Coverage Regressions

Requires a base build to compare against. How to fix this →


Coverage Stats

Coverage Status
Relevant Lines: 12835
Covered Lines: 10843
Line Coverage: 84.48%
Coverage Strength: 411.48 hits per line

💛 - Coveralls

@zealsham zealsham marked this pull request as draft October 16, 2025 11:59
@arminsabouri arminsabouri mentioned this pull request Oct 28, 2025
2 tasks
@zealsham
Copy link
Copy Markdown
Collaborator Author

zealsham commented Nov 6, 2025

@arminsabouri i implemented error downcast method for v2 in this PR, the downside of this method is having to repeat the same block of code for the the different state of the senders session. I only did it for the "pooling proposal" phase of the sender session, i'll have to do it for the rest of the phases but just want to know your input on this before i head that direction.

i'm also simutaneously working on the "session outcome" based solution with the draft PR from you .

@zealsham zealsham marked this pull request as ready for review January 22, 2026 12:16
@zealsham zealsham changed the title WIP:Demonstrate failure handling with fallback TX broadcast Demonstrate failure handling with fallback TX broadcast Jan 22, 2026
self.process_pj_response(psbt)?;
Ok(())
}
Err(e) => {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should be falling back on any Error. We should be doing something similar to what you did below and only fallback on a FatalError -- an irrecoverable protocol deviation.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which of the error types/variant is safe for fallback broadcast .

these two ResponseError::WellKnown() | ResponseError::Unrecognized are the ideal ones IMO. however from https://github.com/bitcoin/bips/blob/master/bip-0078.mediawiki#user-content-Receivers_well_known_errors , i don't think every variant needs fallback broadcasted

Copy link
Copy Markdown
Collaborator

@arminsabouri arminsabouri Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think every variant needs fallback broadcasted

Why not? You could argue unavailable doesn't require a fallback to be broadcasted.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still think we should this @zealsham . Lets double check this against the spec

Comment thread payjoin-cli/src/app/v2/mod.rs Outdated
Err(e) => {
// Check if it's a PersistedError with an API error
if let Some(persisted_error) = e.downcast_ref::<PersistedError<
EncapsulationError,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why downcast specifically to a EncapsulationError? Are there not other Fatal error types?
Regardless, if we want to go this route, we should explore the changes neccecary in the API to make this more ergonomic. and idiomatic
e.is_fatal() or e.should_fallback().

Alternatively we could explore integrating this change into the typestate it self. I.e the error statemachine scrutinizes the error type and decides what to do. Leaving the developer entirely out of it.

If we want to developer to decide how to react then we should explore including Outcome in close() #1181

I would like to talk through the options a bit more before commiting to this route. Let me know your thoughts @zealsham

cc @spacebear21

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think the developer should ultimately decide what to do if a fatal error occurs. from the BIP

At any point, either party may choose to broadcast the fallback transaction described by the Original PSBT instead of proceeding

I believe some implementers might want the sender doing that or the receiver doing that, exposing outcome in close() handles that properly, automatically broadcasting fallback TX doesn't seem like what most devs will want.

Also i'll have to go through #1181 again , as i've lost most of the context.

Comment thread payjoin-cli/src/app/v2/mod.rs Outdated
self.process_pj_response(proposal)?;
return Ok(());
Ok(())
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just have a another match here for a failure case? Instead of scrutinizing the error types above

@arminsabouri
Copy link
Copy Markdown
Collaborator

@zealsham take a look at lightningdevkit/ldk-node#746.
They are doing something similar

@zealsham zealsham force-pushed the handle-fallback-tx branch 2 times, most recently from 48b8064 to 154a49b Compare March 27, 2026 10:43
@zealsham zealsham requested a review from arminsabouri March 30, 2026 11:16
@DanGould DanGould removed their request for review March 31, 2026 13:33
Comment thread payjoin-cli/src/app/v2/mod.rs Outdated
}
_ => return Err(anyhow!("Unexpected sender state")),
SendSession::Closed(outcome) => match outcome {
SenderSessionOutcome::Failure => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could Failure and Cancel be collapsed with different log messages?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this .
"Payjoin session was canceled, broadcasting fallback" sounds appropirate for the cancel outcome ?

Copy link
Copy Markdown
Collaborator

@arminsabouri arminsabouri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK. This is moving in the right direction! We should think through how we respond the unknown response errors from the reciever (arguably that could be its own PR if you want).

self.process_pj_response(psbt)?;
Ok(())
}
Err(e) => {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still think we should this @zealsham . Lets double check this against the spec

let receiver_pubkey = pj_param.receiver_pubkey();
let sender_state =
self.db.get_send_session_ids()?.into_iter().find_map(|session_id| {
let (sender_state, persister, fallback_tx) =
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have to pass fallback_tx around the app. IIRC with the persister + session id you can always replay the session and obtain the fallback via the session history.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still think we should this @zealsham . Lets double check this against the spec

Yes, This handles V1 sender flow, How much of a reward it is depends on how much V1 payjoin transaction are occuring. Since payjoin-cli is a reference example, we should demonstrate all cases even if a bulk of payjoin-TX would be v2

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you answered a question from a different thread. Would still appreaciate an answer on this one

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ohh, my mistake @arminsabouri . The fallback is gotten from replay_event just once. i could avoid that in the None branch but that would require me running the replay_session_event again. so as a trade off i kept the fallback in hand to avoid another redundant replay_event_ call.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaying is not a computationally heavy task (there are bounded number of events emitted from the API) and if it reduces code complexity I think this is worth it. To some extent we dont even need a persister object either -- thats a seperate problem.

Something like #1197 would also help in this situtation. i.e the fallback is always available on the top most sender struct. Sender.fallback_tx -> bitcoin::Transaction . For now I think its fine to just replay the events when we need to fallback.

@zealsham zealsham force-pushed the handle-fallback-tx branch from 154a49b to a011f7a Compare April 5, 2026 20:18
let receiver_pubkey = pj_param.receiver_pubkey();
let sender_state =
self.db.get_send_session_ids()?.into_iter().find_map(|session_id| {
let (sender_state, persister, fallback_tx) =
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you answered a question from a different thread. Would still appreaciate an answer on this one

Comment thread payjoin-cli/src/app/v2/mod.rs Outdated
}

#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn process_sender_session_broadcasts_fallback_on_failure() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
async fn process_sender_session_broadcasts_fallback_on_failure() {
async fn sender_processes_fallback_on_failure() {

"closed without success"
} else {
"canceled"
};
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This label seems unnecessary. Is this log read anywhere?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The label isn't logged , just printed. it should be logged, the label basically tells why the fallback tx was broadcasted, it exist to seperate cancel scenario from failure scenario

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok in that case I would just remove "success" from the closed message.Also as a developer I wouldn't really know the difference between closed / cancled from this message. A better word may be "failed". Or event better mention why it failed. e.g display the error message.

Comment thread payjoin-cli/tests/e2e.rs
Comment on lines +296 to +309
tokio::spawn(async move {
let mut stdout = tokio::io::stdout();
while let Some(line) =
lines.next_line().await.expect("Failed to read line from stdout")
{
stdout
.write_all(format!("{line}\n").as_bytes())
.await
.expect("Failed to write to stdout");
if line.contains("Fallback transaction broadcasted") {
let _ = tx.send(true).await;
break;
}
}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Noticing some duplicate code. The test above has pretty much the same structure. Can we have a common method? Marked as nit. We can postpone for a follow up PR.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay, i'll do the DRY in followup PR

Comment thread payjoin-cli/tests/e2e.rs Outdated
.arg("--port")
.arg("0")
.arg("--pj-endpoint")
.arg("https://dfdkfkdj.codm")
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The other tests set this to let pj_endpoint = "https://localhost";

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used a random domain there, i'll set to localhost as anyone can just register that domain in theory

Comment thread payjoin-cli/tests/e2e.rs Outdated

#[cfg(feature = "v1")]
#[tokio::test(flavor = "multi_thread", worker_threads = 4)]
async fn send_receive_payjoin_v1_fallback_transaction_handling() -> Result<(), BoxError> {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
async fn send_receive_payjoin_v1_fallback_transaction_handling() -> Result<(), BoxError> {
async fn payjoin_v1_fallback_transaction_handling() -> Result<(), BoxError> {

Comment thread payjoin-cli/tests/e2e.rs
})
.await?;

assert!(payjoin_sent, "Payjoin send was not detected");
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does the sender decide to actually fallback ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so two error type triggers fallback for v1, the reciever pj-ednpoint being an unreachable host triggers the fallback on the sender part

If the http call succeeds but the receiver returns an error response or an invalid psbt ctx.process_response returns Err, and the same fallback broadcast happens.

@DanGould DanGould mentioned this pull request Apr 14, 2026
2 tasks
@zealsham zealsham force-pushed the handle-fallback-tx branch from a011f7a to e15bbe5 Compare April 15, 2026 22:20
As a refrence implementation , payjoin-cli needs to demonstrate
proper fallback transaction handling in the case of payjon failure.
From what i see , both sender and receiver can broadcast the fall
back transaction , This pr starts by handling the sender side
of things and would eventually progress to the receiver .

this addresses payjoin#1156

This is mostly sender fallback handling.

Broadcast on Sessionoutcome
@zealsham zealsham force-pushed the handle-fallback-tx branch from e15bbe5 to 745a1f4 Compare April 15, 2026 22:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants