
Image upload sometimes stalls with HTTP/2 #3559

@luqmana


This is the bug @david-crespo was running into, where image upload fails on Firefox/macOS.

I've looked into it some more and, from what I can tell, the browser stalls at some point while uploading chunks.
On the console side we split the file into 384 KiB chunks and upload them 6 at a time (to stay under browser concurrency limits). It doesn't happen every time, but every so often there's a chunk or two where the browser seems to have made the request but never gets a response (at least according to the Network tab in the browser's Web Developer Tools).
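
To make the upload pattern concrete, here is a minimal Python sketch of the chunking and the 6-at-a-time limit described above. It is not the console's actual code (which runs in the browser); `chunk_ranges`, `upload_all`, and `fake_send` are hypothetical names, and the real HTTP request is replaced by a stub that just records what it was asked to send.

```python
import asyncio

CHUNK_SIZE = 384 * 1024  # 384 KiB, as described above
MAX_IN_FLIGHT = 6        # concurrent uploads, as described above


def chunk_ranges(total_size, chunk_size=CHUNK_SIZE):
    """Return (offset, length) pairs that cover total_size bytes."""
    return [
        (offset, min(chunk_size, total_size - offset))
        for offset in range(0, total_size, chunk_size)
    ]


async def upload_all(total_size, send_chunk, max_in_flight=MAX_IN_FLIGHT):
    """Upload every chunk with at most max_in_flight requests outstanding."""
    gate = asyncio.Semaphore(max_in_flight)

    async def upload_one(offset, length):
        async with gate:
            await send_chunk(offset, length)

    await asyncio.gather(
        *(upload_one(offset, length) for offset, length in chunk_ranges(total_size))
    )


async def demo():
    in_flight = 0
    peak = 0
    sent = []

    # Hypothetical stand-in for the real chunk upload request.
    async def fake_send(offset, length):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0)  # yield so other uploads get a chance to start
        sent.append((offset, length))
        in_flight -= 1

    await upload_all(10 * CHUNK_SIZE + 100, fake_send)
    return peak, sent


peak, sent = asyncio.run(demo())
print(f"{len(sent)} chunks uploaded, at most {peak} in flight")
```

With HTTP/2, all of these concurrent requests would typically share one connection as separate streams, which is why a stall in one place can look like a handful of unrelated chunk requests hanging.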

I changed the console side to add a query parameter to each individual chunk upload, and for the stalled chunks I saw no mention of the request in the Nexus logs. Next I tried a packet capture with Wireshark and, after setting up SSLKEYLOGFILE (because this only repros over https; we'll get back to that), I did see the browser making those requests. In fact, I could see it sending data until at some point it just stops, still with no response.
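
The tagging trick is simple enough to sketch. The following Python is only an illustration of the idea, not what I actually ran: the `chunk` parameter name, the `nexus.example` URL, and the log line format are all made up for the example.

```python
import re
from urllib.parse import urlencode

# Matches the hypothetical "chunk" query parameter wherever a URL appears in a log line.
CHUNK_PARAM = re.compile(r"[?&]chunk=(\d+)")


def tagged_url(base_url, chunk_index):
    """Append a per-chunk query parameter so each request is identifiable in logs."""
    return f"{base_url}?{urlencode({'chunk': chunk_index})}"


def chunks_missing_from_log(sent_indices, log_lines):
    """Return the chunk indices that were sent but never show up in the server log."""
    seen = {int(match) for line in log_lines for match in CHUNK_PARAM.findall(line)}
    return sorted(set(sent_indices) - seen)


# Example with an invented log format: chunks 1 and 3 never reached the server.
example_log = [
    "INFO request received uri=/upload?chunk=0",
    "INFO request received uri=/upload?chunk=2",
]
print(tagged_url("https://nexus.example/upload", 3))
print(chunks_missing_from_log(range(4), example_log))
```

Any chunk that the browser reports as pending but that is missing from the server log is a candidate for the stall, which is what pointed me at the wire rather than at Nexus request handling.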

After not being able to repro without TLS and then seeing the packet capture, I realized we're using HTTP/2 for compatible clients. The browser maintains a single connection and uses multiple HTTP/2 streams to make the different requests. OK, so is something else blocking our image upload somehow? Cue some more reading about HTTP/2, and it definitely has a concept of flow control that each peer maintains separately.

Basically, for each side, there's a connection-level flow control window as well as a per-stream window. Every byte of body data sent (DATA frame payload) decrements the available bytes in both windows. If the sender has exhausted a window, it must not send any more data until it receives a WINDOW_UPDATE from the peer telling it there's more space.
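
Here is a toy Python model of the sender's side of that bookkeeping, to show how a sender ends up stuck. It is a sketch under simplifying assumptions, not how the browser or hyper implements it: `SendWindows` is a made-up class, both windows start at the spec's default of 65,535 bytes (real peers usually negotiate larger ones), and framing is ignored.

```python
DEFAULT_WINDOW = 65_535  # default initial window size in the HTTP/2 spec


class SendWindows:
    """Sender-side view of HTTP/2 flow control for a single connection."""

    def __init__(self, initial=DEFAULT_WINDOW):
        self.initial = initial
        self.connection = initial  # shared by every stream on the connection
        self.streams = {}          # stream id -> remaining per-stream window

    def open_stream(self, stream_id):
        self.streams[stream_id] = self.initial

    def sendable(self, stream_id):
        # A stream may only send what BOTH windows allow.
        return max(0, min(self.connection, self.streams[stream_id]))

    def send_data(self, stream_id, wanted):
        """Send up to `wanted` bytes; return how many bytes were actually allowed."""
        allowed = min(wanted, self.sendable(stream_id))
        self.connection -= allowed
        self.streams[stream_id] -= allowed
        return allowed

    def window_update(self, stream_id, increment):
        """Apply a WINDOW_UPDATE; stream id 0 refers to the connection window."""
        if stream_id == 0:
            self.connection += increment
        else:
            self.streams[stream_id] += increment


windows = SendWindows()
windows.open_stream(1)
windows.open_stream(3)

chunk = 384 * 1024
first = windows.send_data(1, chunk)    # limited by the window, not by the chunk size
blocked = windows.send_data(3, chunk)  # stream 3 has window left, the connection does not
print(first, blocked)
```

In this model stream 3 cannot send a single byte even though its own window is untouched, because stream 1 used up the connection window. Until the peer sends a WINDOW_UPDATE for stream 0, every request on the connection waits, which would look a lot like what the packet capture shows.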

I need to look into it some more, but it seems like the browser might think it's exhausted the window while no window update arrives from the Nexus side. I hacked in a tracing subscriber to see if hyper logs anything useful, and I do see mentions of the stalled streams, but I'm not familiar enough with hyper to decode them yet. (hyperium/hyper#2899 seems relevant, maybe?)


Labels: known issue (To include in customer documentation and training)
