Non-blocking Interface for Payjoin State Machine #1446
Non-blocking Interface for Payjoin State Machine #1446xstoicunicornx wants to merge 6 commits intopayjoin:masterfrom
Conversation
spacebear21
left a comment
There was a problem hiding this comment.
concept ACK. I mostly reviewed the v1 state machine before realizing those same changes are applied in v2 as well, so all my comments apply to the v2 changes too. I'm actually not sure we need to support an async-compatible interface for v1 at all? It's going to result in a lot of duplication and afaik v1 is a) synchronous protocol and b) getting deprecated.
payjoin/src/core/receive/v1/mod.rs
Outdated
| /// Use this for asynchronously checking whether inputs are owned | ||
| pub fn process_inputs_owned_result( | ||
| self, | ||
| is_owned: Vec<(ScriptBuf, bool)>, |
There was a problem hiding this comment.
I wonder if it would be simpler and more explicit to pass owned_inputs: Vec<ScriptBuf>?
There was a problem hiding this comment.
I actually had initially implemented it that way but to me it seemed more explicit to force returning a result for every script. This way if somehow there was a bug where a script was erroneously omitted it wouldn't erroneously transition to the next state, it would throw an implementation error.
There was a problem hiding this comment.
The state transition logic is proceed if none of the inputs are owned by the receiver, as this is the fallback transaction.
I think the simplest API is therefore one method for the success path, asserting that none are owned, and another that fails but requires the list inputs, to construct an Err result. This can also be expressed as s single Option argument to a single method.
As for how to specify the inputs which are owned, doing so by a list of owned scripts which might not be in 1:1 correspondence seems to make things a bit more complicated, requiring allocations etc.
The most misuse resistant way I can think of encoding this (which kind of obviates my previous suggestion) is to return some struct which in addition to containing the script also has a private field for the input index. This struct could then be converted into a judgement by a method which consumes it, soemthing like pub is_owned(self, bool) -> TaggedInputReference where struct TaggedInputReference { is_owned: bool, index: usize }. The index field can therefore be trusted, it'd be guaranteed unique (enforced by linear types, and no Clone impl for this handle) so it'd just need to check that you got as many as required and that all of them were explicitly marked not known.
There was a problem hiding this comment.
I just pushed a temporary commit with my attempted implementation of this suggestion. It has only been implemented for MaybeInputsOwned but should be good enough to spot anything where my understanding is off.
The constraints under which I operated for designing this were:
- Don't change anything about existing interface
- Don't have any public facing interface use
Vec<_>
See 837ff11
The Validator itself does employ Vec in some of it's interface, however none of those pieces of the interface are public facing so should be able to remove them in the future without affecting the public interface. The reason I resorted to using a Validator to store validation information with Vec was because I was trying to not keep any additional state in MaybeInputsOwned. Right now Validator is generic but let me know if I should implement it to be specific to each type state (and if you have any tips on how to do this leveraging the generic Validator).
validate_inputs_not_owned (in payjoin/src/core/receive/mod.rs) walks through usage of a Validator.
Would love feedback on how to improve this design or whether any of the constraints I'm operating under are incorrect or missing.
There was a problem hiding this comment.
i think this can be done more simply, but not as simply as i initially hoped, and i'm doubting whether my suggestion was workable at all.
this example/mockup code (shifts the responsibility for collect() to the caller, so the caller can collect without heap allocations (e.g. into a fixed size array of Option<TaggedInputReference<IsOwned>>).
i think it should be possible to remove the requirement to do this collection, i.e. delete the line let tagged_inputs: Vec<_> = tagged_inputs.collect(), but i am struggling to think of how to do this without unsafe and some kind of fugly custom iterator that allows the Receiver to not be borrowed by the iterator.
unfortunately the argument for safety in this case is incomplete, because although process_inputs_owned takes ownership of Receiver then iterating over its contents in the body is possible, it's not possible to guarantee that Receiver isn't dropped before the iterator, and then the iterator gets enumerated.
perhaps a way to do that safely is to have a custom Iterator like abstraction (wrapping an iterator) that which can only be consumed inside process_inputs_owned (i.e. there's no way to call .iter() or .into_iter() because it's wrapped in a newtype, the wrapper only allows calling .map() on it, and i guess only map). while this is safe, it's just a roundabout way of reintroducing the same callback focused API, because it no longer allows the caller to collect at their discretion, and this would mean that we'd need to have both a blocking and an async variant.
if the function is blocking, then it could be given (indirectly, via the map iterator) so that eventually process_inputs_owned actually invokes it when consuming the iterator, with no temporary allocations but effectively just calling the function on one input at a time and erroring as appropriate.
but to avoid allocation a async version of process_inputs_owned could just get an iterator of futures and await each one in turn, avoiding any heap allocations like the blocking variant.
so i guess the question is can the valid ownership be expressed so that an iterator is can still be consumed by the caller (so they can collect or not as they choose), without duplicating a lot of boilerplate for blocking and async callback variants and while still allowing processing without collection.
// mock/dummy types:
#[derive(Clone)]
struct ScriptBuf(usize); // just something comparable
struct Input { script: ScriptBuf } // represents input with known prevout script, bypassing outpoint etc
struct PsbtContext {
inputs: Vec<Input>,
}
struct InputReferenceWithScript {
index: usize,
script: ScriptBuf
}
// maybe generic makes sense to abstract over Owned/Seen in a typesafe way
struct TaggedInputReference<T> {
index: usize,
tag: T,
}
struct IsOwned {
is_owned: bool
}
impl InputReferenceWithScript {
fn mark_owned(self, is_owned: bool) -> TaggedInputReference<IsOwned> {
TaggedInputReference {
index: self.index,
tag: IsOwned { is_owned },
}
}
}
struct Receiver {
psbt: PsbtContext,
}
impl Receiver {
fn inputs_with_scripts(&self) -> impl Iterator<Item = InputReferenceWithScript> {
// analogous to input_pairs()
self.psbt.inputs.iter().enumerate().map(|(index, txin)| {
let script = txin.script.clone();
InputReferenceWithScript { index, script }
})
}
fn process_inputs_owned(self, tagged_inputs: impl IntoIterator<Item = TaggedInputReference<IsOwned>>) {
let _: Vec<_> = tagged_inputs.into_iter().collect();
todo!()
}
}
fn main() {
let receiver = Receiver { psbt: PsbtContext { inputs: vec![Input{ script: ScriptBuf(1) }, Input { script: ScriptBuf(2) } ] } };
let pairs = receiver.inputs_with_scripts();
let tagged_inputs = pairs.map(|txin| txin.mark_owned(true));
// FIXME: can this be removed?
let tagged_inputs: Vec<_> = tagged_inputs.collect();
receiver.process_inputs_owned(tagged_inputs);
}There was a problem hiding this comment.
Many thanks for the thoughtful feedback, and the mockup code was extremely helpful.
Just pushed another temporary commit with rough implementation that follows this pattern incase you want to validate I am on the right track. Few things not implemented:
- Type specific tag
- async
apply v1andv2interfaces
Few things I did want to get general feedback on though:
- Conceptual question: why use
Iteratorwhen providing theValidatorandIntoIteratorwhen returning theValidator? Is it just to avoid having to take it as mutable when returned? - I removed the
indexfield in theValidatorReferencebecause it was flagged as dead code, is there something we should be using this for? - I added a lot of
#[allow(private_interfaces)]and#[allow(private_bounds)], are these appropriate? Seemed like we want to restrict what is made public.
There was a problem hiding this comment.
Many thanks for the thoughtful feedback, and the mockup code was extremely helpful.
Just pushed another temporary commit with rough implementation that follows this pattern incase you want to validate I am on the right track. Few things not implemented:
- Type specific tag
- async
applyv1andv2interfacesFew things I did want to get general feedback on though:
- Conceptual question: why use
Iteratorwhen providing theValidatorandIntoIteratorwhen returning theValidator? Is it just to avoid having to take it as mutable when returned?
tl;dr - making function type signatures that are "wider" types in the arguments and "narrower" in the return types helps keep things composable.
TMI explanation follows...
Taking that as an argument allows an iterator (every iterator implements into iterator) or any iterable collection to be used in the argument position. in this case it's IntoIterator<T> as opposed to IntoIterator<&T> so the collection would be consumed, but the goal here is to consume the individual references not so much the collection itself (e.g. a reference to a collection over T is generally IntoIterator<&T>).
Returning Iterator on the other hand is a narrower type than it could be, i.e. IntoIterator would allow returning any kind of enumerable collection internally and changing that without changing the function type, but this affords no additional freedoms to the caller, and just imposes a requirement to call into_iter().
impl in argument position is basically a short hand for making the function generic over an anonymous type parameter with a trait bound, and in return position it allows this contravariant type in lieu of the concrete type.
the rationale is related to but not exactly the same as the adage of "be co-variant in your inputs/arguments, contravariant in your output/return value" or however that goes, which if memory serves me right is basically a generalization of the Liskov substitution principle to also take into account return value polymorphism. in rust it matters less because there's no inheritance to introduce the problems that this habit is good at solving, but the logic still applies to super/sub traits, and in this particular example IntoIter is a supertrait of Iterator.
- I removed the
indexfield in theValidatorReferencebecause it was flagged as dead code, is there something we should be using this for?
yeah that's the "sealed" index into the inputs list that can just be read off the individual values, it can be used to index into the internal storage in the psbt, and it's guaranteed to be unique because it's generated from an enumeration and the caller can't modify it, so if you receive as many tagged references as there are elements then you know everything is fully defined as you originally argued was safer, but checking for which particular input was not owned is just a matter of scanning over these one by one, there's no need to use a hashmap (which can get confusing if two inputs have the same script, for example)
edit: note that i'm ignoring the possibility of taking references from one receiver state and giving them to a different receiver state here, though that can also be enforced by converting &self to a raw pointer that is saved in another field of the input reference that is copied into the tagged input reference, and then comparing that when processing the result, the conversion from ref to raw pointer can be done without unsafe but to be guaranteed correct i believe that requires pinning. i also just asked one of the stochastic parrots, and it seems to correctly suggest the "brand pattern" which iiuc achieves the same effect at compile time and without the need for pinning: https://internals.rust-lang.org/t/design-safe-collection-api-with-compile-time-reference-stability-in-rust/20285
- I added a lot of
#[allow(private_interfaces)]and#[allow(private_bounds)], are these appropriate? Seemed like we want to restrict what is made public.
let me get back to you on that one (i'll read the code and make sure i understand the implications of this lint tomorrow), but my gut instinct based on a superficial understanding is that there's probably a way to refactor so that that isn't needed
payjoin/src/core/receive/v1/mod.rs
Outdated
| /// Use this for asynchronously checking whether inputs have been seen | ||
| pub fn process_inputs_seen_result( | ||
| self, | ||
| is_known: Vec<(&OutPoint, bool)>, |
There was a problem hiding this comment.
Similar point here, known_inputs: Vec<&OutPoint>
| /// function sets that parameter to None so that it is ignored in subsequent steps of the | ||
| /// receiver flow. This protects the receiver from accidentally subtracting fees from their own | ||
| /// outputs. | ||
| #[cfg_attr(not(feature = "v1"), allow(dead_code))] |
There was a problem hiding this comment.
Unrelated to this PR but this seems odd.. Why is this "not v1" feature gate here in the v1 module??
There was a problem hiding this comment.
I thought it was saying that that if v1 is not being used allow this as dead code, which seemed logical to me. But I am not really familiar with rust conventions on how to handle features that aren't used.
payjoin/src/core/receive/v1/mod.rs
Outdated
| /// is different from the entity that has access to the private keys, | ||
| /// so the PSBT to sign must be accessible to such implementers. | ||
| pub fn psbt_to_sign(&self) -> Psbt { self.psbt_context.payjoin_psbt.clone() } | ||
| /// Use this for asynchronously finalizing psbt |
There was a problem hiding this comment.
I think an appropriate docstring would encapsulate any valid use case (different entity having access to the key as is the case with Liana, async compatibility, etc.). The callback method might not work for a variety of reasons.
There was a problem hiding this comment.
Thats true. Does this comment apply only to psbt_to_sign or to all the methods I have added? I kind of think its the latter.
Do you have any suggestions on a simple-ish but clear term to encapsulate all the valid use cases? Or is repetitive verbosity ok?
There was a problem hiding this comment.
Wouldn't a different entity having access to the key imply some sort of asynchronous flow though?
There was a problem hiding this comment.
Sorry for the late reply! You're right that it implies some kind of asynchronous flow but that doesn't necessarily mean "async" in the parallelism/concurrency sense. For example in the Liana case there are no instances of await/async when finalizing the proposal, but (iiuc) the issue there is that the Liana signer requires a mutable version of the Psbt to manually create and add the witness to it.
I think the phrasing you're using now ("non-blocking") is good, though in each typestate I think it would help to be more explicit about the relationship between the non-blocking method and the data required to compute the result. For example:
/// The Payjoin proposal PSBT that the receiver needs to sign.
/// Once it is signed, use [`finalize_signed_proposal`] to validate and proceed to the next state.
pub fn psbt_to_sign(&self) -> Psbt {
self.psbt_context.payjoin_psbt_without_sender_signatures()
}
/// Process receiver signed payjoin proposal
/// Use this for non-blocking finalization of psbt. First extract [`psbt_to_sign`] and sign it, then
/// pass the signed PSBT to this method to validate it and move onto the next state.
pub fn finalize_signed_proposal(self, signed_psbt: &Psbt) -> Result<PayjoinProposal, Error> {
let finalized_psbt = self.psbt_context.finalize_proposal(signed_psbt)?;
Ok(PayjoinProposal { payjoin_psbt: finalized_psbt })
}There was a problem hiding this comment.
Agree, will flush out these comments a bit better.
b8a35e8 to
b0d967f
Compare
Roger that, just removed |
|
@spacebear21 if our current targets will use this interface and they support v1, we need to support v1. We're have a clean cut point for v1 support but we haven't hit it yet and forcing it with a regression (which I think not supporting v1 on this new api + migrating e.g. bbmobile to this new api would cause) is gonna bring us bigger problems than some duplicate code will. |
|
Concept ACK. The new I would prefer if the existing callback based fns were refactored to use the newly introduced ones, since the latter are are lower level. That should reduce complexity and improve readability. |
|
Unit tests are definitely coming!
I generally agree, however the existing callback based fns are more compatible with the |
Yes. The docs have always made a point of emphasizing this library does no IO, and while technically true, it was arguably in only weaker sense than it should be, as normally it would kind of imply the library is completely agnostic to all IO considerations, but the callback API does make it harder if the callbacks need to do async IO. IMO that gap was/is a bug. We reasoned we could close this gap without technically breaking semver compatibility as it strictly adds API surface. Since there was no concrete need at the time, that meant it didn't need to be in the 1.0 milestone. I think there is some benefit in frontloading this refactor (i.e. not putting off to a followup PR), which is that dogfooding this the new non-blocking API will help it be narrower (by finding good answers things like the As an example of something I'm hoping such dogfooding of the API can catch, the allocation of the temporary hashmap in the non blocking ownership check may preclude its use on constrained hardware devices where allocation needs to be avoided. Finding a good type signature which is intuitive and easy to use, and which allows avoiding heap allocations would allow avoid a situation in the future where we might need two versions of this functionality, one which avoids allocations and one which is required for backwards compatibility. |
b0d967f to
7a0c45f
Compare
|
PR has been updated to reflect a lot (but not all) of the feedback. Still very preliminary but just wanted to get this out there to verify I'm on the right track. Things updated:
Things not updated:
Please let me know if there is anything that is glaringly wrong. I will continue to refactor and work on the items that have not yet been updated in the next few days. |
| fn from(e: ProtocolError) -> Self { Error::Protocol(e) } | ||
| } | ||
|
|
||
| impl From<ImplementationError> for Error { |
There was a problem hiding this comment.
this extends the public API, and since ImplementationError has some From impls (which arguably also shouldn't be there) effectively allows any &str or Box<dyn Error + Send + sync>) to be converted to an Error
@chavic IIRC you have been auditing and removing some of these to remove unintended pub functionality, thoughts?
the only thing that stands out at the moment is that there is a lot of new |
Pull Request Test Coverage Report for Build 23870476979Details
💛 - Coveralls |
d6102ed to
837ff11
Compare
0xZaddyy
left a comment
There was a problem hiding this comment.
Hi @xstoicunicornx solid work on the implementations, just little observations and i have suggested some changes. also i noticed some new validation methods doesn't have some error handling like finalize_signed_proposal
837ff11 to
39561c8
Compare
|
Thanks for taking a look @0xZaddyy ! All these errors will be going away with the updated implementation however I definitely agree with your suggestions and will keep that error format in mind moving forward. |
| .finalize_proposal(wallet_process_psbt) | ||
| .map_err(|e| Error::Implementation(ImplementationError::new(e)))?; | ||
| Ok(PayjoinProposal { payjoin_psbt: finalized_psbt }) | ||
| let psbt = self.psbt_context.payjoin_psbt_without_sender_signatures(); |
There was a problem hiding this comment.
nit: we could reuse the exposed functions here to make the relationship and usage even clearer:
| let psbt = self.psbt_context.payjoin_psbt_without_sender_signatures(); | |
| let psbt = self.psbt_to_sign(); |
There was a problem hiding this comment.
Agree, good catch. Will fix.
payjoin/src/core/receive/v1/mod.rs
Outdated
| /// is different from the entity that has access to the private keys, | ||
| /// so the PSBT to sign must be accessible to such implementers. | ||
| pub fn psbt_to_sign(&self) -> Psbt { self.psbt_context.payjoin_psbt.clone() } | ||
| /// Use this for asynchronously finalizing psbt |
There was a problem hiding this comment.
Sorry for the late reply! You're right that it implies some kind of asynchronous flow but that doesn't necessarily mean "async" in the parallelism/concurrency sense. For example in the Liana case there are no instances of await/async when finalizing the proposal, but (iiuc) the issue there is that the Liana signer requires a mutable version of the Psbt to manually create and add the witness to it.
I think the phrasing you're using now ("non-blocking") is good, though in each typestate I think it would help to be more explicit about the relationship between the non-blocking method and the data required to compute the result. For example:
/// The Payjoin proposal PSBT that the receiver needs to sign.
/// Once it is signed, use [`finalize_signed_proposal`] to validate and proceed to the next state.
pub fn psbt_to_sign(&self) -> Psbt {
self.psbt_context.payjoin_psbt_without_sender_signatures()
}
/// Process receiver signed payjoin proposal
/// Use this for non-blocking finalization of psbt. First extract [`psbt_to_sign`] and sign it, then
/// pass the signed PSBT to this method to validate it and move onto the next state.
pub fn finalize_signed_proposal(self, signed_psbt: &Psbt) -> Result<PayjoinProposal, Error> {
let finalized_psbt = self.psbt_context.finalize_proposal(signed_psbt)?;
Ok(PayjoinProposal { payjoin_psbt: finalized_psbt })
}
Summary
Currently some states (UncheckedOriginalPayload, MaybeInputsOwned, MaybeInputsSeen, OutputsUnknown, ProvisionalProposal) require the usage of synchronous callbacks to transition to the next state in the payjoin state machine. This is inconvenient for languages which required the use of asynchronous calls for validation, for example when calling the bitcoin rpc, which use the payjoin FFI language bindings. While there are some workarounds that can be used to adequately validate the state machine transitions (except for UncheckedOriginalPayload) when relying on asynchronous validation, these are a bit unwieldily to use. Instead, this PR implements an asynchronous compatible interface that makes it more straightforward to validated the state machine transitions with asynchronous calls.
Background
I was attempting to prove out the Javascript language bindings by implementing a Node version of the payjoin-cli and opened issue #1389 after encountering some of the challenges with using asynchronous validation callbacks. It was revealed to me that this was a known pain point and there was a solution proposed by @arminsabouri of creating new state transition methods that accept the result of a validation call. This is my attempt at implementing this solution.
Any and all feedback is welcome! This is currently in draft status as I am still working on testing and the language bindings.
Pull Request Checklist
Please confirm the following before requesting review:
AI
in the body of this PR.