Skip to content

Better vcztools concurrency#9

Merged
jeromekelleher merged 6 commits intosgkit-dev:mainfrom
jeromekelleher:better-vcztools-concurrency
May 6, 2026
Merged

Better vcztools concurrency#9
jeromekelleher merged 6 commits intosgkit-dev:mainfrom
jeromekelleher:better-vcztools-concurrency

Conversation

@jeromekelleher
Copy link
Copy Markdown
Member

No description provided.

@jeromekelleher jeromekelleher force-pushed the better-vcztools-concurrency branch from 5cd7cdb to 73258ce Compare May 5, 2026 14:03
The FUSE handler used to await indefinitely on any operation against
the plink-server subprocess: a slow or wedged worker pinned the
consumer's syscall in uninterruptible D-state, fusermount3 -u
couldn't unmount, and the mount was unrecoverable without SIGKILL.

Three concentric timeouts close that gap:

- BedConnection.read wraps the lock-and-I/O block in
  ``trio.move_on_after(_REQUEST_TIMEOUT_S)``. On expiry the
  connection is marked closed (so queued readers short-circuit
  immediately), the stream is best-effort closed inside a shielded
  scope, and ``OSError(EIO)`` is raised — already mapped to
  ``FUSEError(EIO)`` by PlinkOps.read.

- BedConnection.aclose runs its send_eof + aclose under shielded
  ``trio.move_on_after(_ACLOSE_TIMEOUT_S)``, so release can never
  outlive its deadline. PlinkOps.release already swallows aclose
  exceptions; with the deadline in place it can no longer hang.

- PlinkClient._connect_stream wraps each ``connect()`` with
  ``trio.fail_after(_OPEN_TIMEOUT_S)`` so a stuck accept queue can't
  lock the parent.

- PlinkOps.open wraps the limiter acquire with
  ``trio.move_on_after(_LIMITER_TIMEOUT_S)`` and surfaces
  ``FUSEError(EAGAIN)`` on expiry, guarding against a leaked slot
  permanently wedging FUSE_OPEN.

Verified by killing the plink-server mid-mount: dd surfaces ECONNREFUSED
in 12ms and fusermount3 -u completes in 10ms. fs_tests fio runner
goes from a 1314s hang-with-SIGKILL to a 161s clean fail-fast where
every job either succeeds or returns an errno within its configured
runtime.
The robustness layer made the mount durable, but EAGAIN was still
firing under repeated fio churn and we couldn't tell whether the
slow path was parent-side limiter contention, parent-side aclose,
or server-side encoder shutdown.

Extend the access log with a ``kind`` field (default ``read``) and
emit lifecycle events alongside reads:

- ``open``      — PlinkOps.open entry → return.
- ``limiter_wait`` — only for .bed; the time spent on the
  CapacityLimiter acquire.
- ``release``   — PlinkOps.release entry → return.
- ``aclose``    — BedConnection.aclose entry → return, via a new
  ``on_aclose`` callback hook the FUSE layer wires in at open time.

Server-side: add DEBUG-level logs around encoder construction,
each encoder.read, and encoder.close so a -vv run records the
per-connection timing on the worker.

Add ``fs_tests/harness/trace_summary.py`` to print per-kind
duration distributions and the slowest events from an access.jsonl;
running it on a real fio trace pinpointed the bottleneck as read
latency itself (max ~1 s) rather than anything FUSE-side
(every lifecycle op is sub-5 ms).

Two existing tests adapted: ``test_records_per_read`` filters to
``kind="read"`` now that lifecycle events share the log; the
in-test fakes accept ``on_aclose`` and propagate it.
The plink-server's ``logger.debug``/``info`` messages were going
nowhere: the subprocess never called ``logging.basicConfig``, so the
default ``WARNING`` level dropped them. Pass the parent's effective
level into ``_server_main`` and configure logging at entry, so a
``-vv`` parent run now also shows the per-connection encoder
lifecycle traces (``encoder created in N.NNNs``,
``encoder.read off=O size=S in N.NNNs``, ``encoder.close done in
N.NNNs``).

Default of ``WARNING`` keeps subprocess logs quiet under normal
operation.
- BedConnection.read: replace ``assert cs.cancelled_caught`` with an
  explicit ``if not ... raise RuntimeError`` so the timeout EIO path
  stays correct under ``python -O``. The branch is unreachable in
  normal operation; tagged with a coverage pragma.
- _server_main: pass ``force=True`` to ``logging.basicConfig`` so the
  explicitly-forwarded log level wins even if an upstream import (or
  the parent's logging state replayed via ``spawn``) has already
  configured the root logger.
@jeromekelleher jeromekelleher merged commit 9702ff5 into sgkit-dev:main May 6, 2026
5 checks passed
@jeromekelleher jeromekelleher deleted the better-vcztools-concurrency branch May 6, 2026 10:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant