Add the new `readahead_v2` cache #750
Conversation
There is clearly some thought and effort going into this implementation. Might I suggest that we try to keep to … This is connected to the effort in Jamie-Chang/aiointerpreters#8 to bring true parallelism to fsspec: memoryview objects are zero-cost to pass between interpreters, the only Python object this applies to.
@martindurant, thanks for taking an early look and leaving your view. I actually did consider a …

This particular solution avoids creating copies and blocking the CPU. It stores those 5MB …

As I understand it, to use …
/gcbrun
Yeah, I did some reading around and indeed I see no way to ingest memoryviews directly out of aiohttp (or any other network client). This seems like a mistake to me! Maybe someone should implement the buffers in Rust to give true zero-copy behaviour...
Yeah, that's correct, but that's only half of the problem. Even if we hypothetically achieved zero-copy ingestion using …

By contrast, the …

To completely solve the problem for any request size, we would need two things: …

Until then, the proposed solution optimizes performance specifically for cases where …

Let me know if you have any questions, or if you have a better approach to this I would love to hear it!
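The copy-vs-reference distinction discussed in this thread can be illustrated in plain Python. This is a sketch of the general mechanism, not the gcsfs code itself:

```python
# Why slicing a large bytes object copies, while returning a stored
# chunk by reference (or wrapping it in a memoryview) does not.
block = bytes(5 * 2**20)          # a 5 MB readahead block

# Slicing creates a brand-new bytes object -> a full memory copy.
sliced = block[0 : 2**20]
assert sliced is not block

# A memoryview slice wraps the same underlying buffer -> no copy.
view = memoryview(block)[0 : 2**20]
assert view.obj is block          # still backed by the original buffer

# Returning the stored object itself (a request that exactly matches
# the block) is just a reference -> zero-copy.
whole = block
assert whole is block
```

The catch, as noted above, is that the final consumer must either accept the whole stored object or accept a memoryview; any `bytes`-typed slice of the buffer reintroduces the copy.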
Not particularly useful for you right now, but it might interest you that in Python 3.15 there will be a zero-copy way to get bytes out of a bytearray: https://docs.python.org/3.15/library/stdtypes.html#bytearray.take_bytes
googlyrahman
left a comment
Addressed comments.
Added the before vs. after comparison in the description for both regional and zonal. Please take a look once you have time :)
Had to rebase and force-push, as it had merge conflicts with …
Adds the new `readahead_v2` cache specifically for GCSFS.

The Problem: The existing readahead cache minimizes network connections by over-fetching (e.g., fetching 10MB when 5MB is requested). However, it stores this data as a single, contiguous bytes object. Serving a read request requires slicing this object, which in Python triggers a memory copy. For large block sizes (e.g., 100MB or 500MB), this copying becomes CPU-intensive and blocks the asyncio event loop, degrading performance.

The Solution: The new `readahead_v2` cache stores fetched chunks as separate bytes objects rather than concatenating them. When a read request matches a stored chunk, the cache returns a direct reference to that object. This "zero-copy" approach eliminates the overhead of slicing and prevents CPU/event-loop blocking, ensuring efficient memory usage when the request size aligns with the block size.

The cache is currently integrated only when the `EXPERIMENTAL` environment variable is set, making it an opt-in feature. We plan to make it the default in a subsequent release.

NOTE: Although we could optimize memory usage further, we chose to maintain the exact semantics of the original readahead cache. This ensures stability for unlikely but valid read patterns.
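The chunked-storage idea described above can be sketched in a few lines. This is a hypothetical illustration of the technique, not the actual `readahead_v2` implementation; the class and `fetcher` callable are made up for the example:

```python
class ChunkedReadaheadSketch:
    """Stores each fetched block as its own bytes object."""

    def __init__(self, fetcher, block_size):
        self.fetcher = fetcher      # fetcher(start, end) -> bytes (assumed)
        self.block_size = block_size
        self.blocks = {}            # block index -> bytes

    def _block(self, index):
        # Fetch and store the block on first access.
        if index not in self.blocks:
            start = index * self.block_size
            self.blocks[index] = self.fetcher(start, start + self.block_size)
        return self.blocks[index]

    def read(self, start, length):
        # Fast path: the request exactly covers one block -> return the
        # stored object by reference. No slice, therefore no copy.
        if start % self.block_size == 0 and length == self.block_size:
            return self._block(start // self.block_size)
        # Slow path: assemble from blocks, which joins and slices (copies).
        first = start // self.block_size
        last = (start + length - 1) // self.block_size
        joined = b"".join(self._block(i) for i in range(first, last + 1))
        offset = start - first * self.block_size
        return joined[offset : offset + length]


# Usage against an in-memory "remote" object:
backing = bytes(range(256)) * 1024
cache = ChunkedReadaheadSketch(lambda s, e: backing[s:e], block_size=4096)

aligned = cache.read(4096, 4096)
assert aligned is cache.blocks[1]           # same object: zero-copy
assert cache.read(10, 20) == backing[10:30]  # unaligned: falls back to copy
```

The fast path only pays off when readers request block-aligned ranges, which matches the note above that the optimization targets cases where the request size aligns with the block size.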