Conversation

@masaori335
Contributor

I found that if OpenDir has its own reader-writer lock, it doesn't need the StripeSM mutex. This means we can avoid the StripeSM mutex lock contention issue for reader-while-writer cases.

Benchmarking RWW is a bit tricky, but one benchmark shows a 9.9% improvement in max rps under the conditions below.

Conditions:

  • 10 urls
  • plaintext http
  • response body size: 256 bytes
  • origin returns Cache-Control: public, max-age=0 ///< some requests go down the RWW path
  • 63 exec_thread
  • 40 stripes (8 disks x 5 volumes)

Result:

  • vanilla: 58,220.9 req/s
  • patch: 63,999.7 req/s

part of #12788

Contributor

Copilot AI left a comment


Pull request overview

This PR optimizes cache read performance by introducing a dedicated reader-writer lock (ts::bravo::shared_mutex) in the OpenDir class, reducing mutex contention for reader-while-writer (RWW) scenarios. The change allows concurrent read operations on OpenDir without requiring the StripeSM mutex, resulting in a reported 9.9% improvement in maximum requests per second under specific test conditions.

Changes:

  • Added ts::bravo::shared_mutex to OpenDir for fine-grained locking instead of relying on the StripeSM mutex (see the sketch after this list)
  • Refactored open_read to work without holding the stripe mutex, enabling a lock-free RWW path
  • Introduced new_CacheVC_for_read helper function to reduce code duplication in Cache.cc
  • Added CACHE_EVENT_OPEN_DIR_RETRY event type for handling retries in the new locking scheme
  • Made OpenDir members private and reorganized API documentation in StripeSM.h
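
As a rough illustration of the locking split described above (a minimal, self-contained sketch using std::shared_mutex in place of ts::bravo::shared_mutex; OpenDirSketch and its members are illustrative names, not the PR's code):

#include <cstdint>
#include <shared_mutex>
#include <unordered_map>

struct Entry {
  // open-directory entry state elided
};

class OpenDirSketch
{
public:
  Entry *
  open_read(uint64_t key) const
  {
    std::shared_lock lock(_shared_mutex); // readers run concurrently
    auto it = _entries.find(key);
    return it != _entries.end() ? it->second : nullptr;
  }

  bool
  close_write(uint64_t key)
  {
    std::unique_lock lock(_shared_mutex); // writers are exclusive
    return _entries.erase(key) > 0;
  }

private:
  mutable std::shared_mutex _shared_mutex;
  std::unordered_map<uint64_t, Entry *> _entries;
};

Readers contend only with writers on this one lock, not with every other operation serialized behind the stripe's ProxyMutex.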

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 6 comments.

Show a summary per file
  • src/iocore/cache/StripeSM.h: Added API grouping comments to clarify OpenDir and PreservationTable methods
  • src/iocore/cache/P_CacheDir.h: Changed OpenDir to a class with private members; added _shared_mutex for concurrency control
  • src/iocore/cache/CacheDir.cc: Implemented shared/exclusive locking in open_write, close_write, and open_read; updated signal_readers for the new locking model
  • src/iocore/cache/Cache.cc: Refactored open_read to check OpenDir without the stripe mutex first; added a helper function for CacheVC creation
  • include/iocore/cache/CacheDefs.h: Added the CACHE_EVENT_OPEN_DIR_RETRY event type for retry handling

Comments suppressed due to low confidence (1)

src/iocore/cache/CacheDir.cc:193

  • Potential use-after-free race condition: The open_read() method returns a raw pointer to an OpenDirEntry while holding a shared lock, but the lock is released when the function returns (line 183 creates a scoped lock). The caller then uses this pointer without any lock protection. Meanwhile, close_write() can acquire the writer lock and free the OpenDirEntry (line 174: THREAD_FREE). This creates a window where the returned OpenDirEntry pointer could be freed while the caller is still using it, leading to a use-after-free. The old code avoided this because both operations held the stripe mutex. Consider using reference counting for OpenDirEntry or ensuring the caller holds some protection until done with the entry.
OpenDirEntry *
OpenDir::open_read(const CryptoHash *key) const
{
  ts::bravo::shared_lock lock(_shared_mutex);

  unsigned int h = key->slice32(0);
  int          b = h % OPEN_DIR_BUCKETS;
  for (OpenDirEntry *d = _bucket[b].head; d; d = d->link.next) {
    if (d->writers.head->first_key == *key) {
      return d;
    }
  }
  return nullptr;
}
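
One way to close that window, per the suggestion above, is to hand the shared lock to the caller along with the entry. A hedged sketch (the open_read_locked name and the use of std::shared_lock/std::shared_mutex in place of the ts::bravo types are assumptions, not the PR's code):

// Hypothetical alternative: the returned guard keeps the caller inside the
// shared section, so close_write() cannot free the entry underneath it.
struct LookupResult {
  std::shared_lock<std::shared_mutex> guard;
  OpenDirEntry                       *entry = nullptr;
};

LookupResult
OpenDir::open_read_locked(const CryptoHash *key) const
{
  LookupResult result{std::shared_lock(_shared_mutex), nullptr};

  unsigned int h = key->slice32(0);
  int          b = h % OPEN_DIR_BUCKETS;
  for (OpenDirEntry *d = _bucket[b].head; d; d = d->link.next) {
    if (d->writers.head->first_key == *key) {
      result.entry = d;
      break;
    }
  }
  return result; // guard moves out with the result
}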


Contributor Author


I found that this callback to the CacheVC can deadlock when it tries to acquire the lock recursively.

Contributor Author


I introduced ts::bravo::recursive_shared_mutex in a new commit.
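
The pattern being made safe looks roughly like this (a hedged sketch with illustrative function names; the lock-API usage mirrors the std::lock_guard and ts::bravo::shared_lock calls quoted elsewhere in this PR):

ts::bravo::recursive_shared_mutex od_mutex; // replaces the plain shared mutex

void
close_write_path() // illustrative name, not ATS code
{
  std::lock_guard lock(od_mutex); // exclusive lock held...
  signal_reader();                // ...while calling back into the CacheVC
}

void
signal_reader() // illustrative name, not ATS code
{
  // Recursive acquisition on the same thread: a plain shared mutex would
  // self-deadlock here; the recursive variant recognizes that the caller
  // is the exclusive owner and lets the nested shared lock proceed.
  ts::bravo::shared_lock lock(od_mutex);
}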

@masaori335
Contributor Author

A benchmark of the new commit shows roughly the same improvement: 58,904 rps vs. 65,339 rps.

@bryancall bryancall requested a review from Copilot February 9, 2026 21:36
@bryancall
Contributor

@masaori335 Is there a guarantee that close_write() can't run between open_read() returning and the caller setting c->od? If the caller and writer are on different EThreads (which they can be for the same stripe), this seems like a valid race. Could open_read() return the entry while still holding the shared lock (e.g., via a lock guard the caller holds)?

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 7 comments.



Comment on lines +447 to +454
void
unlock()
{
  if (--_exclusive_count == 0) {
    _exclusive_owner.store(NO_OWNER, std::memory_order_relaxed);
    _mutex.unlock();
  }
}

Copilot AI Feb 9, 2026


recursive_shared_mutex_impl's shared-under-exclusive path does not acquire an underlying shared lock (it just sets token = 0). If a caller unlocks the exclusive lock while ThreadState::shared_count is still > 0, the thread will no longer hold any lock, and a later unlock_shared may attempt to unlock an underlying shared lock that was never acquired. Add a guard (e.g., debug_assert shared_count==0 before releasing the underlying exclusive lock) or implement a true downgrade mechanism.

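A minimal sketch of the suggested guard, assuming the members shown in the snippet above (debug_assert as named in the comment):

void
unlock()
{
  if (--_exclusive_count == 0) {
    // Guard against releasing the exclusive lock while this thread still
    // holds shared locks taken via the shared-under-exclusive (token == 0)
    // path; a later unlock_shared would otherwise release an underlying
    // shared lock that was never acquired.
    debug_assert(_thread_states[DenseThreadId::self()].shared_count == 0);
    _exclusive_owner.store(NO_OWNER, std::memory_order_relaxed);
    _mutex.unlock();
  }
}
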
Comment on lines +517 to +526
unlock_shared(const Token /* token */)
{
  size_t       tid   = DenseThreadId::self();
  ThreadState &state = _thread_states[tid];
  if (--state.shared_count == 0) {
    // Only unlock underlying mutex if we're not holding exclusive lock
    if (_exclusive_owner.load(std::memory_order_relaxed) != tid) {
      _mutex.unlock_shared(state.cached_token);
    }
    state.cached_token = 0;

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recursive_shared_mutex_impl::unlock_shared ignores the token parameter and decides whether to unlock the underlying mutex based on _exclusive_owner. This makes the correctness of unlock_shared depend on exclusive lock state rather than on how the shared lock was acquired (e.g., the special token=0 path). Consider using the token value (or an explicit per-thread flag) to distinguish the downgrade path reliably and add assertions for misuse.

Suggested change

- unlock_shared(const Token /* token */)
- {
-   size_t       tid   = DenseThreadId::self();
-   ThreadState &state = _thread_states[tid];
-   if (--state.shared_count == 0) {
-     // Only unlock underlying mutex if we're not holding exclusive lock
-     if (_exclusive_owner.load(std::memory_order_relaxed) != tid) {
-       _mutex.unlock_shared(state.cached_token);
-     }
-     state.cached_token = 0;
+ unlock_shared(const Token token)
+ {
+   size_t       tid   = DenseThreadId::self();
+   ThreadState &state = _thread_states[tid];
+   TS_ASSERT(state.shared_count > 0);
+   if (--state.shared_count == 0) {
+     if (token == 0) {
+       // Downgrade path: shared lock acquired while holding exclusive lock;
+       // no underlying shared lock to release.
+       TS_ASSERT(_exclusive_owner.load(std::memory_order_relaxed) == tid);
+     } else {
+       // Regular shared lock: must release underlying shared lock.
+       TS_ASSERT(_exclusive_owner.load(std::memory_order_relaxed) != tid);
+       _mutex.unlock_shared(token);
+       state.cached_token = 0;
+     }

Comment on lines +637 to +638
std::vector<std::thread> threads;
for (int i = 0; i < NUM_THREADS; ++i) {

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file now uses std::vector for thread storage but does not include <vector>. Please include <vector> explicitly to avoid build breaks from missing transitive includes.

Comment on lines +548 to +550
ts::bravo::recursive_shared_mutex mutex;
std::atomic<int> readers_done{0};


Copilot AI Feb 9, 2026


This file now uses std::atomic (e.g., readers_done) but does not include <atomic>. Please add the missing header explicitly rather than relying on transitive includes.

EThread *t1 = newly_delayed_readers.head->mutex->thread_holding;
if (!t1) {
  t1 = mutex->thread_holding;
}

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When re-scheduling retries, t1 can still be null (e.g., if the head VC's mutex isn't currently held and OpenDir's mutex isn't held in this call path). t1->schedule_in(...) will then dereference null. Ensure t1 is always set to a valid thread (e.g., fall back to this_ethread()) before calling schedule_in.

Suggested change

  }
}
if (!t1) {
  t1 = this_ethread();
}

Comment on lines +282 to +286
{
  std::lock_guard lock(_shared_mutex);

  return write_op();
}

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

StripeSM::write_op (and similarly read_op) unconditionally does return functor();. This breaks when the helper is instantiated for void (e.g., CacheDir.cc calls _stripe->write_op<void>(...)) because return expr; is ill-formed in a void-returning function. Consider switching to decltype(auto) and handling void via if constexpr, or add dedicated void overloads.

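A minimal sketch of the decltype(auto) + if constexpr approach (standalone illustration; _shared_mutex and the functor parameter are assumed from the snippet, not the actual ATS code):

#include <mutex>
#include <type_traits>
#include <utility>

template <typename F>
decltype(auto)
write_op(F &&functor)
{
  std::lock_guard lock(_shared_mutex);
  if constexpr (std::is_void_v<std::invoke_result_t<F>>) {
    std::forward<F>(functor)(); // void functor: nothing to return
  } else {
    return std::forward<F>(functor)();
  }
}

Here the return type is deduced per instantiation, so the void case never contains an ill-formed return statement.
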
Comment on lines +130 to +134
EThread *t = mutex->thread_holding;
CacheVC *c = nullptr;
while ((c = _delayed_readers.dequeue())) {
  CACHE_TRY_LOCK(lock, c->mutex, t);
  if (lock.is_locked()) {

Copilot AI Feb 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EThread *t = mutex->thread_holding; can be null when signal_readers is invoked directly from close_write (OpenDir's ProxyMutex isn't held in that path). CACHE_TRY_LOCK will dereference t (and the lock-failure injection code uses t->generator), so this can crash. Use this_ethread() (or another guaranteed non-null thread) as a fallback.

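A minimal sketch of that fallback, using the locals from the snippet above:

EThread *t = mutex->thread_holding;
if (t == nullptr) {
  // signal_readers may be invoked directly from close_write without
  // OpenDir's ProxyMutex held, so thread_holding can be null; fall back
  // to the current event thread.
  t = this_ethread();
}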