perf: allow multiple DATA frames per write #903
Open
NedAnd1 wants to merge 2 commits into hyperium:master
Summary
This PR addresses the sender-side perf issue described in #902
by allowing multiple DATA frames to be coalesced into a single write_vectored() syscall,
rather than the current behavior of flushing each DATA frame individually.
Problem
The existing encoder holds at most one pending DATA frame at a time and blocks accepting new frames until it is fully flushed. With TCP_NODELAY enabled, each frame triggers its own write() syscall and TCP segment — inflating the syscall-to-payload ratio and cutting throughput roughly in half compared to raw TCP.
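To make the syscall-count difference concrete, here is a minimal std-only sketch. It is not the h2 encoder: `CountingSink`, `data_frame_header`, `write_one_by_one`, and `write_coalesced` are hypothetical names invented for illustration, and the sink simply counts write calls as a stand-in for syscalls on a socket. The 9-byte header layout follows the HTTP/2 frame format (24-bit length, type, flags, 31-bit stream ID).

```rust
use std::io::{self, IoSlice, Write};

/// Hypothetical stand-in for a socket: counts write calls and records bytes.
#[derive(Default)]
struct CountingSink {
    calls: usize,
    bytes: Vec<u8>,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.calls += 1;
        self.bytes.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn write_vectored(&mut self, bufs: &[IoSlice<'_>]) -> io::Result<usize> {
        // One call accepts every slice; a real socket may accept fewer bytes.
        self.calls += 1;
        let mut written = 0;
        for b in bufs {
            self.bytes.extend_from_slice(b);
            written += b.len();
        }
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Builds the 9-byte HTTP/2 frame header for a DATA frame with no flags set.
fn data_frame_header(len: usize, stream_id: u32) -> [u8; 9] {
    let len = len as u32;
    let id = stream_id & 0x7fff_ffff; // top bit is reserved
    [
        (len >> 16) as u8,
        (len >> 8) as u8,
        len as u8,
        0x0, // frame type: DATA
        0x0, // flags: none
        (id >> 24) as u8,
        (id >> 16) as u8,
        (id >> 8) as u8,
        id as u8,
    ]
}

/// One write per frame, roughly the behavior the Problem section describes.
fn write_one_by_one<W: Write>(w: &mut W, stream_id: u32, payloads: &[&[u8]]) -> io::Result<()> {
    for p in payloads {
        let mut frame = data_frame_header(p.len(), stream_id).to_vec();
        frame.extend_from_slice(p);
        w.write_all(&frame)?;
    }
    Ok(())
}

/// All headers and payloads handed to the writer in a single vectored call.
/// Returns the byte count accepted; this sketch does not retry partial writes.
fn write_coalesced<W: Write>(w: &mut W, stream_id: u32, payloads: &[&[u8]]) -> io::Result<usize> {
    let headers: Vec<[u8; 9]> = payloads
        .iter()
        .map(|p| data_frame_header(p.len(), stream_id))
        .collect();
    let mut slices = Vec::with_capacity(payloads.len() * 2);
    for (h, &p) in headers.iter().zip(payloads) {
        slices.push(IoSlice::new(h));
        slices.push(IoSlice::new(p));
    }
    w.write_vectored(&slices)
}
```

With three payloads, `write_one_by_one` issues three write calls while `write_coalesced` issues one, and both produce identical bytes on the wire. A production encoder additionally has to handle short writes, which this sketch deliberately leaves out.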
Solution
- Replace the single Option<Next<B>> slot with a VecDeque<BufElement<B>> that can queue multiple DATA frames (up to 512 on vectored-IO transports).
- Implement Buf directly on the Encoder, providing chunks_vectored() so that all queued frame headers + payloads are written in one poll_write_vectored() call.
- Replace the in_flight_data_frame field with a per-stream in_flight_partial_send field of type Option<ControlFlow<()>>, allowing each stream to send up to one partial data frame at a time without blocking other streams.
- Add a take_used_data_frames() iterator to reclaim all fully-written frames in a single pass, replacing the old take_last_data_frame(), which could only return one.
- Add a poll_write_buf that returns ControlFlow to cleanly distinguish "nothing left to write" from "wrote some bytes, keep going".
Validation
Perf results for a branch including the commit in this PR:
Receiver-side PR: #904