Add mrd pool cache by Yonghui-Lee · Pull Request #846 · fsspec/gcsfs

Yonghui-Lee · 2026-05-14T02:06:05Z

Summary

This pull request introduces a filesystem-level cache of MRD pools, MRDPoolCache.

Previously, opening a ZonalFile or invoking range operations (like _cat_ranges or read_into_memory_chunked) initialized a new MRDPool (and its constituent MRDs) from scratch. This added significant overhead due to connection setups, handshakes, and stream initialization.

With this change, MRDPoolCache manages the MRDPool lifecycle at the filesystem level. It keeps idle pools in an LRU cache and uses reference counting to protect active pools from eviction. Idle pools hold their underlying gRPC downloaders and streams in a shared queue, so reads of the same object generation can reuse existing gRPC connections without a new connection setup or handshake.

Key Changes

1. Core Cache Logic

MRDPoolCache Class:
- Caches MRD pools keyed by (bucket, object, generation).
- Uses reference counting (_incref, _decref) to track active pools in use.
- Leverages an OrderedDict for LRU eviction of idle pools once the max_idle_pools limit is exceeded.
- Safely handles pool initialization failures by cleaning up partially initialized downloaders and purging the cache key unconditionally.
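
The combination of reference counting and LRU eviction described above can be sketched roughly as follows. This is a minimal, synchronous illustration and not the PR's implementation: the class name RefCountedLRUCache, its methods acquire and release, and the closed list are all hypothetical, and the real MRDPoolCache is asynchronous and manages gRPC streams.

```python
from collections import OrderedDict


class RefCountedLRUCache:
    """Toy cache: entries in use are pinned, idle entries are evicted LRU-first."""

    def __init__(self, max_idle):
        self.max_idle = max_idle
        self._active = {}           # key -> [value, refcount]
        self._idle = OrderedDict()  # key -> value, least recently used first
        self.closed = []            # evicted values, recorded for illustration

    def acquire(self, key, factory):
        # An entry that is already in use just gains another reference.
        if key in self._active:
            entry = self._active[key]
            entry[1] += 1
            return entry[0]
        # Otherwise reuse an idle entry if one exists, else build a new one.
        if key in self._idle:
            value = self._idle.pop(key)
        else:
            value = factory()
        self._active[key] = [value, 1]
        return value

    def release(self, key):
        entry = self._active[key]
        entry[1] -= 1
        if entry[1] > 0:
            return  # still referenced elsewhere, so it stays pinned
        del self._active[key]
        self._idle[key] = entry[0]  # most recently used goes last
        # Only idle entries are eligible for eviction.
        while len(self._idle) > self.max_idle:
            _, evicted = self._idle.popitem(last=False)
            self.closed.append(evicted)
```

Keys here would be (bucket, object, generation) tuples, as in the list above. Keeping active entries out of the OrderedDict is one simple way to guarantee that eviction never touches a pool that is still in use.
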
MRDPool Enhancements:
- Now accepts an optional cache reference.
- Introduces _get_or_create_mrd() to query the cache for an idle MRD before falling back to spawning a new one on-demand.
- On close(), instead of forcefully closing all MRDs, it releases them back to the cache queue (cache.release()) if a cache is attached.
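
As a rough illustration of the get-or-create and release-on-close behaviour listed above, the sketch below uses hypothetical stand-ins (FakeDownloader, IdleStore, SketchPool) rather than the PR's classes, and is synchronous for brevity where the real code is asynchronous.

```python
from collections import deque


class FakeDownloader:
    """Hypothetical stand-in for an MRD; it only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class IdleStore:
    """Hypothetical stand-in for the cache's shared queue of idle downloaders."""

    def __init__(self):
        self._idle = deque()

    def take(self):
        # Return an idle downloader if one is queued, else None.
        return self._idle.popleft() if self._idle else None

    def release(self, downloaders):
        self._idle.extend(downloaders)


class SketchPool:
    def __init__(self, cache=None):
        self._cache = cache
        self._owned = []

    def get_or_create(self):
        # Ask the cache first; fall back to creating a downloader on demand.
        downloader = self._cache.take() if self._cache is not None else None
        if downloader is None:
            downloader = FakeDownloader()
        self._owned.append(downloader)
        return downloader

    def close(self):
        owned, self._owned = self._owned, []
        if self._cache is not None:
            # With a cache attached, hand downloaders back instead of closing.
            self._cache.release(owned)
        else:
            for downloader in owned:
                downloader.close()
```

In this sketch a pool without a cache behaves as before and closes what it owns, which mirrors the "optional cache reference" wording in the description.
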

2. Filesystem-level Management

Lifecycle & Finalization:
- ExtendedGcsFileSystem instantiates a single _mrd_pool_cache on construction with customizable size via mrd_pool_cache_size.
- Registers _finalize_mrd_pool_cache weakref finalizers to gracefully close all cached streams on GC, handling various event loop scenarios safely.
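
The finalizer registration can be pictured with the standard library's weakref.finalize. The sketch below is an assumption-laden simplification: StreamCache, Owner, and _finalize_cache are hypothetical names, cleanup is synchronous, and none of the event-loop handling mentioned above is modelled.

```python
import weakref


class StreamCache:
    """Hypothetical stand-in for the pool cache; records that it was closed."""

    def __init__(self):
        self.closed = False

    def close_all(self):
        self.closed = True


def _finalize_cache(cache):
    # The callback receives the cache only. It must not hold a reference to
    # the owner, otherwise the owner could never be garbage collected.
    cache.close_all()


class Owner:
    """Hypothetical stand-in for the filesystem object that owns the cache."""

    def __init__(self):
        self._cache = StreamCache()
        # Runs _finalize_cache(self._cache) when this Owner is collected,
        # or at interpreter exit if it is still alive then.
        self._finalizer = weakref.finalize(self, _finalize_cache, self._cache)
```

Passing the cache as an argument, rather than a bound method of the owner, is what keeps the finalizer from keeping the owner alive.
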
Resource Sharing in Readers:
- Updated _cat_ranges, read_into_memory_chunked, and _get_file to fetch MRDPool instances via self._mrd_pool_cache.get(...) rather than constructing raw MRDPool instances.

3. Zonal File Integration

Integrates ZonalFile with the new cache, retrieving the pool synchronously from the cache on initialization.
Removes the explicit mrd_pool.initialize call during init, since fetching from the cache handles initialization transparently.

4. Tests

test_zb_hns_utils.py:
- Comprehensive unit tests verifying MRDPoolCache behaviors.
conftest.py:
- Introduced _close_gcs() / _close_gcs_async() to explicitly release gRPC and cache resources in sync/async fixtures, preventing cross-test resource leakage.

codecov · 2026-05-14T02:12:50Z

Codecov Report

❌ Patch coverage is 99.43182% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 88.83%. Comparing base (991faba) to head (f02ea16).
⚠️ Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
gcsfs/extended_gcsfs.py	98.41%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #846      +/-   ##
==========================================
+ Coverage   88.52%   88.83%   +0.31%     
==========================================
  Files          15       15              
  Lines        2989     3126     +137     
==========================================
+ Hits         2646     2777     +131     
- Misses        343      349       +6

zhixiangli · 2026-05-18T10:40:25Z

        callback = callback or NoOpCallback()

-        mrd = None
+        mrd_pool = await self._mrd_pool_cache.get(bucket, key, generation, pool_size=1)


With a pool size of 1, the pool cache initializes a new pool every time, resulting in no performance improvements and added overhead.

Do you have any measurements showing the improvements made?

Reply from ankitaluthra1 (May 18, 2026):

+1 to this.
@googlyrahman @Yonghui-Lee do we have any macro benchmark results confirming that MRD pooling overall, not just this change, significantly improves performance? It would be good to add them to the description.

Reply from Yonghui-Lee (May 20, 2026):

I added a micro benchmark to show the improvement in read throughput. For the current tessellation macro benchmark, I think the pool cache may not improve it, because we don't open the same file multiple times in the same process.

ankitaluthra1 · 2026-05-18T17:11:56Z

 from enum import Enum
 from glob import has_magic

+import fsspec


Can we make this a targeted import?

Reply from Yonghui-Lee (May 19, 2026):

It's for fsspec.FSTimeoutError. We use the same import in core.py.

zhixiangli · 2026-05-20T07:50:26Z

+            raise RuntimeError("ExtendedGcsFileSystem has been garbage collected.")
+
+        if generation is None:
+            info = await fs._info(f"{bucket_name}/{object_name}")


Will this extra round-trip negatively impact performance?

zhixiangli · 2026-05-20T07:52:18Z

+
+        if generation is None:
+            info = await fs._info(f"{bucket_name}/{object_name}")
+            generation = info.get("generation")


Could calling fs._info() on an unfinalized object return a previously finalized generation (causing the read stream to connect to stale data), or fail entirely if no finalized version exists?

zhixiangli · 2026-05-20T07:59:49Z

    threads: [32]
+
+  - name: "read_repeatedly_open_same_file_fixed_duration"
+    pattern: "repeatedly_open_same_file"


Would reopen be cleaner?

Add mrd pool cache

cb7c907

zhixiangli requested changes May 14, 2026

Comment thread gcsfs/zb_hns_utils.py Outdated

Comment thread gcsfs/extended_gcsfs.py

Yonghui-Lee added 4 commits May 15, 2026 00:50

create mrd on initialize

04b6d1e

remove close check

766be8c

fix test

5bd9ec7

add tests

43d8fe4

zhixiangli requested changes May 18, 2026

ankitaluthra1 reviewed May 18, 2026

Yonghui-Lee added 2 commits May 19, 2026 03:55

add timeout

3e5c6ea

add limit to mrd pool cache queue size

43840a1

Yonghui-Lee force-pushed the shared-mrd-design branch from 06a2250 to 43840a1 Compare May 19, 2026 04:34

add microbenmark

f02ea16

zhixiangli requested changes May 20, 2026

Yonghui-Lee wants to merge 8 commits into fsspec:main from Yonghui-Lee:shared-mrd-design

3 participants
