fix(http): shared aiohttp session, headers mutation, store.delete params#82
fix(http): shared aiohttp session, headers mutation, store.delete params#82SiddarthAA wants to merge 1 commit intoJigsawStack:mainfrom
Conversation
Three correctness and performance bugs fixed across the HTTP layer: 1. perf(async_request): replace per-request aiohttp.ClientSession with a module-level shared session (_get_shared_session). Previously every AsyncRequest method created a brand-new aiohttp.ClientSession per call, bypassing connection pooling entirely and incurring TCP handshake overhead on every API request. The new helper lazily creates one module-level session and reuses it for the lifetime of the process, matching the pattern explicitly recommended by aiohttp: https://docs.aiohttp.org/en/stable/client_quickstart.html\#make-a-request 2. fix(request, async_request): prevent __get_headers from mutating the caller's headers dict. Both Request and AsyncRequest stored a direct reference to the headers dict from the config TypedDict and called .pop('Content-Type') on it during multipart file-upload requests. This permanently removed the key from the shared config dict, so any subsequent request using the same client config would silently lose its Content-Type header. The fix: - __init__ now copies the incoming headers into self.headers so the original config is never touched. - __get_headers builds its own fresh dict and strips Content-Type from a local copy of self.headers, not from self.headers itself. 3. fix(store): Store.delete and AsyncStore.delete passed params=key instead of params={}. The key was already encoded in the URL path (/store/file/read/{key}). Passing the raw key string as the params argument could corrupt the DELETE request. Both sync and async delete methods now correctly use params={}.
Please clean up the PR description to write updates by hand if possible or optimized by ai. Easier to work on this than an much longer AI-generated doc :) thanks. |
|
Thank for the detailed review @Khurdhula-Harshavardhan :D Agreed on splitting this up, I'll move the confirmed bug (claim 2) and the header/delete changes into a separate, minimal PR (with tests)! I’ll drop the session lifecycle changes from this PR and revisit them deeply since they need a more careful design (especially around event loop safety and reuse, sounds painful :p) Also fixing the DELETE wording and cleaning up the PR description to reflect only what’s actually in the diff. Will update this ASAP! thankyou! |
Thanks again for the detailed review @Khurdhula-Harshavardhan — made me dig deeper into the codebase :p I’ve split out the confirmed fixes into smaller, focused PRs to make review easier. I’m closing this PR to keep things clean and easier to reason about. I’ll follow up with a separate PR for the session lifecycle changes after revisiting the design in abit more detailed manner! Appreciate the guidance — would love to make alot more contributions! |
Overview
Three fixes to the HTTP layer — two correctness bugs and one performance issue. All changes are backwards-compatible with no API surface changes.
Changes
1.
perf(async_request)— Replace per-requestClientSessionwith a shared module-level sessionFile:
async_request.pyEvery async request was spinning up a brand-new
aiohttp.ClientSession(), which aiohttp explicitly documents as an antipattern. Each session tears down its TCP connection on completion, meaning every API call was paying a full handshake cost with zero connection pooling.Introduced a module-level
_shared_sessionwith a lazy_get_shared_session()initializer. The session is created once and reused for the process lifetime; if it's been externally closed, the helper recreates it automatically. Allperform_*methods now go through this instead of managing their own sessions.2.
fix(request, async_request)—__get_headerswas permanently mutating the caller's config dictFiles:
request.py,async_request.pyBoth
RequestandAsyncRequestheld a direct reference to theheadersdict from the caller's config — no copy, justself.headers = config.get("headers"). During multipart file-upload requests,__get_headers()calledself.headers.pop("Content-Type")to letrequests/aiohttpset the multipart boundary themselves.The problem: that
popwas operating on the original config dict. Every subsequent request reusing the same config — which is exactly what all service classes do — would silently go out missingContent-Type, causing failures that are genuinely hard to trace back to the source.Fix applied at two levels:
__init__now doesdict(raw_headers)soself.headersis always an owned copy__get_headers()makes a second local copy before calling.pop(), so neitherself.headersnor the original config are ever touched3.
fix(store)—Store.deletewas passing the key string asparamsFile:
store.pyBoth sync and async
delete(key)were calling the underlying request withparams=key, wherekeyis a plain string. The key is already encoded in the URL path (/store/file/read/{key}) — it has no business being inparams, and passing a bare string instead of a dict is undefined behaviour depending on how the HTTP client decides to interpret it.Changed
params=key→params={}in bothStore.deleteandAsyncStore.delete.Tests
8 unit tests added, all runnable locally without an API key:
headersdict is not mutated by a multipart callContent-Type(lets the client set the boundary)_get_shared_session()returns the sameClientSessioninstance on repeated callsStore.deletesendsparams={}AsyncStore.deletesendsparams={}Content-Type: application/jsoncorrectlyContact
Thanks for reviewing — happy to adjust anything or split into separate PRs if preferred :D
siddartha_ay@protonmail.com | (everything about me at siddartha-ay.xyz )