-
Notifications
You must be signed in to change notification settings - Fork 61
Remove DataStream, support binary sync lines over HTTP
#821
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
🦋 Changeset detectedLatest commit: 2b9eca2 The changes in this PR will be included in the next version bump. This PR includes changesets to release 8 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
51bdf95 to
a41e424
Compare
rkistner
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the change to remove DataStream - I agree plain AsyncIterators are better here.
I do think we need to test this properly to make sure there are no regressions. I also wonder whether we should stick to JSON for HTTP streams to start with, and make it either opt-in, or part of a major release to switch to BSON? Apart from potential bugs, I've seen cases where BSON is actually larger than the equivalent JSON for the same amount of data.
What is the default for HTTP in other SDKs currently?
a41e424 to
2b9eca2
Compare
Dart, Kotlin and native all unconditionally prefer BSON when the Rust client is used, they'd only use JSON on older service versions.
To be fair, I would expect the (currently JSON strings) oplog data lines to make up the bulk of traffic for PowerSync. So whether the container around that is BSON or JSON is probably insignificant.
I'm not a fan of making this an option. If using BSON has benefits, we should aim to use it by default. If it's worse, we shouldn't have it in our SDKs at all. So at this point, I'm actually not sure if adding BSON support over HTTP is something worth doing. It being cheaper to decode in the core extension (even with oplog data being JSON, we don't have to scan for escape characters in those large buffers) is the only current benefit I see. I can remove BSON support from this PR specifically to get async iterators done earlier, but we should come up with a plan for this. We should have a consistent approach to this in our SDKs, so if we don't want BSON we should remove it from the sync service HTTP endpoint and our other SDKs as well. |
|
My biggest concern is that we shouldn't switch from json to bson by default in a patch release. It can make sense to do the switch for the future plan, but then we need to make sure to test the performance properly, and make it a new minor version at minimum. |
This refactors
AbstractRemoteandAbstractStreamingSyncImplementationto have clearer control flow and to support streaming BSON documents over HTTP.In our SDK, we're using the
DataStreamutility as a polyfill replacing parts of theReadableStreamsweb API and async iterables in JavaScript (since RN supports neither). However, that abstraction turned out to hurt us quite a bit:Clearly, we're having a hard time using
DataStreamcorrectly. By design,DataStreamis the equivalent of an asynchronous channel supporting multiple producers and multiple consumers. This makes implementing it quite complicated. However, we don't need most of that in practice!DataStreamto iterate over sync lines in an asyncwhileloop. There is never any concurrency on the consuming side, and there is never more than one consumer.This PR replaces
DataStreamwith a variant ofAsyncIteratorthat only has anextmethod. This encodes the expectation that we'll only ever have one listener owning the stream. I have also deliberately hidden thereturnmethod. Instead, the only way to cancel a stream is to pass anAbortSignalto the place creating it and marking that as aborted. This simplifies handling streams in the remote implementation, and makes data flow a bit easier to understand:fetch, no translation is necessary because the default response reader already implements the format we need.event-iteratorpackage we already depend on and bundle. We just need to rewriteSymbol.asyncIteratorto a polyfill for RN.AbstractRemote, making control flow easier to reason about.This also allows us to reuse the networking logic between the legacy and the Rust sync client. Finally, I've added support for receiving binary sync lines over HTTP, a feature we support in our Kotlin and Dart SDKs.