
Conversation

@seigert (Contributor) commented Sep 15, 2025

Hello there!

While benchmarking our internal fs2-grpc-based gRPC server against both ghz and our own fs2-grpc-based client, I've found that the throughput of the fs2-grpc client is 5-10 times lower than that of ghz for streaming calls.

After some research I've found that

  1. fs2.grpc.client.StreamIngest requests messages from the channel one by one, even when prefetchN > 1 (see the sketch after this list);
  2. it also emits received messages one by one, without any chunking, even if multiple messages are available in the internal queue at the time of the pull;
  3. for client -> server streaming calls, prefetchN always equals 1 and it is not possible to set a different value via server options.
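
For illustration, a minimal sketch of that one-by-one pattern (simplified names, not the actual fs2-grpc internals; request stands in for ClientCall#request(n)):

```scala
import cats.effect.IO
import cats.effect.std.Queue
import fs2.Stream

// A simplified ingest where every pulled message triggers exactly one new
// request, so the channel never has more than one message of outstanding
// demand, regardless of the configured prefetchN.
final class OneByOneIngest[T](
    queue: Queue[IO, T],
    request: Int => IO[Unit] // stands in for ClientCall#request(n)
) {
  // Called by the grpc-java listener for each received message.
  def onMessage(msg: T): IO[Unit] = queue.offer(msg)

  // Each pulled element re-requests exactly one more message.
  val messages: Stream[IO, T] =
    Stream.repeatEval(queue.take.flatTap(_ => request(1)))
}
```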

Most likely all of this is related to #386 and #503, but all activity there ended 2-3 years ago. :(

So I've decided to reimplement a bit of the internal logic of StreamIngest; it now:

  1. requests messages from the channel in bulk, max(0, limit - (queued + already_requested)) at a time, whenever the internal message queue is either empty or blocked (see the sketch after this list);
  2. emits received messages in chunks, trying to completely drain the internal queue on each pull.
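
A rough sketch of that approach (hypothetical names, the bookkeeping of the queued/in-flight counters is elided, not the exact merged code):

```scala
import cats.effect.IO
import cats.effect.std.Queue
import cats.syntax.all._
import fs2.{Chunk, Stream}

final class BulkIngest[T](
    limit: Int,              // corresponds to prefetchN
    queue: Queue[IO, T],
    request: Int => IO[Unit] // stands in for ClientCall#request(n)
) {
  // Top up demand: request enough messages to bring queued + in-flight
  // back up to `limit`, i.e. max(0, limit - (queued + requested)).
  def topUp(queued: Int, inFlight: Int): IO[Unit] = {
    val n = math.max(0, limit - (queued + inFlight))
    request(n).whenA(n > 0)
  }

  // Drain everything currently available as one chunk (waiting only for the
  // first element) instead of emitting messages one by one.
  val messages: Stream[IO, T] =
    Stream
      .repeatEval(queue.take.flatMap { head =>
        queue.tryTakeN(None).map(rest => Chunk.seq(head :: rest))
      })
      .unchunks
}
```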

According to my benchmarks, this improves both client- and server-side streaming throughput by 2-3 times, in both individual messages per second and megabytes per second.

This implementation is still about 3 times slower than ghz for requests with the same message payload size, but I think that to improve it further, most of the work would have to happen in the grpc-java internals, around backpressure and message decoding.

@seigert force-pushed the improve_stream_ingest_thoughtput branch from c6e6640 to ebe8644 on September 15, 2025 at 08:57.
@ahjohannessen merged commit e86e67d into typelevel:main on Sep 17, 2025 (5 checks passed).
@ahjohannessen (Collaborator)

@seigert I have released v2.9.0 with your changes.

@seigert (Contributor, Author) commented Sep 17, 2025

@ahjohannessen Thanks!

@ahjohannessen (Collaborator)

@seigert Seems like something broke with streaming. In our Play Framework app, which streams gRPC over the wire from a doobie-backed service app, e.g.:

rpc getSomething    (GetSomething)    returns (stream Something);
...

now hangs and times out with DEADLINE_EXCEEDED. I suspect it is related to your removal of val acquire = start(create, md) <* request(prefetchN), which somehow kick-started the streaming of results.
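
To make the suspicion concrete, a sketch of the difference (hypothetical stand-ins, not the actual fs2-grpc code):

```scala
import cats.effect.IO
import cats.syntax.all._

object AcquireSketch {
  // Stand-ins for the real call operations, for illustration only.
  final case class CallOps(
      start: IO[Unit],         // starting the underlying ClientCall
      request: Int => IO[Unit] // ClientCall#request(n)
  )

  // Old behaviour: start the call and immediately request prefetchN messages,
  // giving the server initial demand so it starts streaming results.
  def acquireOld(ops: CallOps, prefetchN: Int): IO[Unit] =
    ops.start <* ops.request(prefetchN)

  // Suspected new behaviour: only start the call; if nothing requests messages
  // until the first pull, the server sees zero demand and the response stream
  // may hang until DEADLINE_EXCEEDED.
  def acquireNew(ops: CallOps): IO[Unit] =
    ops.start
}
```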

@seigert (Contributor, Author) commented Sep 25, 2025

@ahjohannessen

hangs and times out with DEADLINE_EXCEEDED.

That's very strange, because most of my tests used exactly the same interop, a unary client request with a streaming server reply, and not once did I get a DEADLINE_EXCEEDED error.

Could you maybe patch your local version of fs2-grpc with request(prefetchN) in acquire and check whether the timeout error disappears?

I'll try to create a new MR with the old acquire logic; I just need to decide on the best way to pass the 'already requested' number of messages to the StreamIngest constructor.
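
For illustration, a hypothetical sketch of one way to do that (not the actual fs2-grpc API):

```scala
import cats.effect.{IO, Ref}
import cats.effect.std.Queue
import cats.syntax.all._

// Hypothetical: seed the ingest with the number of messages that `acquire`
// has already requested, so the bulk top-up does not over-request.
final class StreamIngestSketch[T] private (
    limit: Int,
    inFlight: Ref[IO, Int],  // starts at the amount requested during acquire
    queue: Queue[IO, T],
    request: Int => IO[Unit]
) {
  // Same top-up rule as before, now aware of the initial request count.
  val topUp: IO[Unit] =
    (queue.size, inFlight.get).flatMapN { (queued, r) =>
      val n = math.max(0, limit - (queued + r))
      (inFlight.update(_ + n) *> request(n)).whenA(n > 0)
    }
}

object StreamIngestSketch {
  def apply[T](
      limit: Int,
      alreadyRequested: Int, // e.g. the prefetchN requested in acquire
      request: Int => IO[Unit]
  ): IO[StreamIngestSketch[T]] =
    (Ref.of[IO, Int](alreadyRequested), Queue.unbounded[IO, T])
      .mapN((inFlight, queue) => new StreamIngestSketch(limit, inFlight, queue, request))
}
```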

@ahjohannessen (Collaborator) commented Sep 25, 2025

@seigert

Can you maybe patch your local version of fs2-grpc with request(prefetchN) in acquire and check that timeout error disappears?

Yes, I was thinking of doing that. The thing I noticed was unary_to_streaming methods hanging, all of them using doobie streaming. Something odd I experienced: if I tried to re-request by invoking different endpoints, it sometimes went through and a streaming response was returned, but that only worked about 1 in 20 times. So perhaps some edge condition.

request(prefetchN)

Perhaps just request(1) to get things "moving" is all that is necessary.

@ahjohannessen (Collaborator)

@seigert Please take a look at #800
