Add the new readahead_v2 cache#750

Merged
ankitaluthra1 merged 6 commits into fsspec:main from ankitaluthra1:zonalcache
Jan 29, 2026

Conversation

@googlyrahman
Collaborator

@googlyrahman googlyrahman commented Jan 19, 2026

Adds the new readahead_v2 cache specifically for GCSFS.

The Problem: The existing readahead cache minimizes network connections by over-fetching (e.g., fetching 10MB when 5MB is requested). However, it stores this data as a single, contiguous bytes object. Serving a read request requires slicing this object, which in Python triggers a memory copy. For large block sizes (e.g., 100MB or 500MB), this copying operation becomes CPU-intensive and blocks the asyncio event loop, degrading performance.

The Solution: The new readahead_v2 cache implementation stores fetched chunks as separate bytes objects rather than concatenating them. When a read request matches a stored chunk, the cache returns a direct reference to the object. This "zero-copy" approach eliminates the overhead of slicing and prevents CPU/event loop blocking, ensuring efficient memory usage when the request size aligns with the block size.

The cache is currently integrated into the system only when the EXPERIMENTAL environment variable is set, making it an opt-in feature. We plan to make it the default in a subsequent release.

NOTE: Although we could optimize memory usage further, we chose to maintain the exact semantics of the original readahead cache. This ensures stability for unlikely but valid read patterns.
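A minimal, hypothetical sketch of the difference between the two strategies (illustrative names only, not the actual gcsfs implementation):

```python
# Hypothetical sketch: slicing a contiguous buffer (v1 style) copies,
# while returning a stored chunk directly (v2 style) does not.

MB = 2**20
size = 1 * MB

# v1 style: one contiguous over-fetched buffer; serving a read slices it.
block = b"x" * (10 * MB)
served_v1 = block[0:size]
assert served_v1 is not block        # a fresh object: memory was copied

# v2 style: each fetched chunk is kept as its own bytes object,
# keyed here by its start offset (a simplified stand-in for the cache).
chunks = {0: b"a" * size, size: b"b" * size}

def read(offset, length):
    """Serve a read; return the stored chunk itself when it aligns."""
    chunk = chunks.get(offset)
    if chunk is not None and len(chunk) == length:
        return chunk                 # zero-copy: same object, no slicing
    raise NotImplementedError("unaligned reads omitted from this sketch")

assert read(0, size) is chunks[0]    # identity, not just equality
```

The identity check (`is`) is the whole point: when the request size aligns with the block size, no new object is created on a hit.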

@martindurant
Member

There is clearly some thought and effort going into this implementation. Might I suggest that we try to keep to memoryview objects instead of bytes? They certainly allow zero-copy referencing and slicing. Unfortunately, the fsspec APIs generally return bytes in all cases, so I wonder whether we can make a memoryview-backed bytes-like object.

This is connected to the effort in Jamie-Chang/aiointerpreters#8 to bring true parallelism to fsspec: memoryview objects are zero-cost passing between interpreters, the only python object this applies to.

@googlyrahman
Collaborator Author

@martindurant, thanks for taking an early look and sharing your view.

I actually did consider a memoryview solution before arriving at this one, but my understanding is that it doesn't improve performance in this context. The main reason is exactly what you noted: "Unfortunately, the fsspec APIs generally return bytes in all cases." That is the core issue. We receive data as bytes from aiohttp (it allocates bytes internally even if we pass a buffer). We could wrap that in a memoryview without touching the CPU, but the moment we serve read calls to customers, we can't return the memoryview; we have to return bytes to preserve backward compatibility and the API contract. Doing so (memoryview.tobytes()) creates a fresh copy, which hits the CPU and hurts performance.

This particular solution avoids creating copies and blocking the CPU. It stores those 5MB aiohttp responses as-is and serves the user a reference rather than creating a new copy. This works because Python bytes are immutable, so even if we pass the reference, the user cannot change it.

As I understand it, using memoryview effectively would require changing the fsspec API contract, which likely isn't feasible at the moment, or would at best be a long-term effort in which users migrate their workloads from bytes to memoryview. This solution maintains backward compatibility and improves performance. With that said, please let me know if I missed anything or if there is a different way to approach this.
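For illustration, a small sketch of the trade-off above (plain Python, no gcsfs specifics): handing out a cached bytes object is a reference pass, while honouring a bytes-returning API from a memoryview forces a copy on every call.

```python
stored = b"payload" * 1024           # what the cache would hold after download

# Serving the stored bytes directly: same object, no CPU work, and safe
# to hand out because bytes are immutable.
served = stored
assert served is stored

# A memoryview over the buffer is itself zero-copy...
view = memoryview(stored)
assert view.obj is stored            # the view wraps the same buffer

# ...but satisfying the bytes contract requires tobytes(), a fresh copy.
as_bytes = view.tobytes()
assert as_bytes == stored
assert as_bytes is not stored        # new allocation on every conversion
```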

@googlyrahman
Collaborator Author

/gcbrun

@googlyrahman
Collaborator Author

/gcbrun

@googlyrahman googlyrahman marked this pull request as ready for review January 20, 2026 05:33
@googlyrahman googlyrahman force-pushed the zonalcache branch 5 times, most recently from c82bb37 to 57538b5 Compare January 20, 2026 09:18
@googlyrahman
Collaborator Author

/gcbrun

1 similar comment
@ankitaluthra1
Collaborator

/gcbrun

@martindurant
Member

Yeah, I did some reading around and indeed I see no way to ingest memoryviews directly out of aiohttp (or any other network client). This seems like a mistake to me! Maybe someone should implement the buffers in Rust to give true zero-copy behaviour...

@googlyrahman
Collaborator Author

Yeah, that's correct, but that's only half of the problem. Even if we hypothetically achieved zero-copy ingestion using memoryview internally, my understanding is that it would still be less efficient for a cache under the current API constraints. Since fsspec mandates returning bytes, a memoryview approach would force us to run .tobytes() on every cache hit to satisfy the contract. This means we would pay the CPU copy 'tax' repeatedly, every time the user reads the same chunk, i.e., on every cache hit.

By contrast, the readahead_v2 approach leverages the immutability of bytes. We pay the allocation tax exactly once (during the initial download). All subsequent reads, whether immediate or from cache, are just reference passing, which is truly zero-copy and zero-CPU. This also avoids a memory spike: with memoryview, serving a read would temporarily require 2x RAM (the internal buffer plus the new bytes copy). With the current approach, the cache and the user share the exact same memory object.
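The per-hit cost can be made concrete with a small illustrative sketch (not the actual cache code):

```python
data = b"z" * 2**20                  # pretend: one downloaded 1 MB block

view_cache = memoryview(data)        # hypothetical memoryview-backed cache
bytes_cache = data                   # bytes-backed cache (readahead_v2 style)

# Serve the same chunk three times under the bytes-returning contract.
mv_hits = [view_cache.tobytes() for _ in range(3)]     # one copy per hit
by_hits = [bytes_cache for _ in range(3)]              # one reference per hit

# memoryview path: three distinct allocations while all hits are alive.
assert len({id(b) for b in mv_hits}) == 3
# bytes path: every hit is the very same object; no extra memory at all.
assert len({id(b) for b in by_hits}) == 1
```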

To completely solve the problem for any request size, we would need two things:

  • Support in aiohttp (or the network client) for reading directly into a memoryview with zero copy (this does not currently exist).
  • An update to the fsspec contract to return memoryview (a breaking, backward-incompatible change).

Until then, the proposed solution optimizes performance specifically for cases where request_size == block_size.

Let me know if you have any questions, or if you have a better approach to this I would love to hear it!

Comment thread gcsfs/core.py Outdated
@Jamie-Chang

Not particularly useful for you right now, but it might interest you that in Python 3.15 there will be a zero-copy way to get bytes out of a bytearray: https://docs.python.org/3.15/library/stdtypes.html#bytearray.take_bytes

Comment thread gcsfs/core.py Outdated
Comment thread gcsfs/extended_gcsfs.py
Comment thread gcsfs/zonal_file.py
Collaborator Author

@googlyrahman googlyrahman left a comment


Addressed comments.

@googlyrahman
Collaborator Author

Added the before vs. after comparison to the description as well, for both regional and zonal. Please take a look when you have time :)

@ankitaluthra1
Collaborator

/gcbrun

Comment thread gcsfs/core.py Outdated
@ankitaluthra1
Collaborator

/gcbrun

2 similar comments
@ankitaluthra1
Collaborator

/gcbrun

@ankitaluthra1
Collaborator

/gcbrun

@googlyrahman
Collaborator Author

Had to rebase and force-push, since the branch had merge conflicts with fsspec/main.

@ankitaluthra1
Collaborator

/gcbrun

@ankitaluthra1 ankitaluthra1 merged commit fcefacb into fsspec:main Jan 29, 2026
7 checks passed