Short answer: V3 heavily changes this, hopefully to your benefit! The internal thread-pool API completely goes away (or at least: it is marked obsolete and is not used - I don't like binary breaks). However, a key part of this is still outstanding in V3 and will be resolved the next time I get a chance to work on it: specifically, the synchronous dedicated reader scenario. So: I wouldn't get excited trying to change things today, and please don't try V3 for this right now, as I've had to default the ideal mode to "off" for now, pending a fix.
-
Hi, we're high-throughput users of SRE (especially lots of very small reads, with a 1 s timeout in the layer above the reads) and we're not sure what the recommendations are with respect to the `SocketManager` setting.

A few years ago (up until 2020) we used to run with `SocketManager.Shared`. Unfortunately we had semi-frequent lock-contention issues that (heavily) suggested `DedicatedThreadPoolPipeScheduler` was the culprit. This happened mainly on smaller nodes (we used to run 2-core VMs).

Since then we switched to `SocketManager.ThreadPool` and generally stopped having problems. In recent months, however, we noticed that when pods are cold (just started), the first batch of requests can overwhelm the shared thread pool, causing timeouts at the Redis layer ("dead socket detected because no read", ...), which makes the issue worse. The solution, we thought, would be a dedicated thread pool, so that the tasks for SRE (read, deserialize, ...) don't compete with our runaway thread pool (if it's full of tasks, they don't end up at the back of the queue) and, assuming the CPU scheduler is fair and the number of threads isn't extremely high, Redis makes progress and doesn't time out when it doesn't need to. But then we remembered the issue from 2020, and now we aren't sure...

I've also checked, and it seems `SocketManager.Shared` still uses the above-mentioned scheduler; that scheduler hasn't been changed in 7 years (so even in 2020 we were likely running the same code), and even now it's decently lock-y. Can you provide any guidance?
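For concreteness, the switch in question is this single setting (a minimal sketch against SRE 2.x; the endpoint and variable names are illustrative):

```csharp
using StackExchange.Redis;

var options = ConfigurationOptions.Parse("localhost:6379");

// SocketManager.Shared: the library-owned pool backed by
// DedicatedThreadPoolPipeScheduler.
// SocketManager.ThreadPool: schedule socket reads on the regular
// .NET thread pool instead.
options.SocketManager = SocketManager.ThreadPool;

var muxer = await ConnectionMultiplexer.ConnectAsync(options);
```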
Also, what about the option of a dedicated SRE thread pool built on .NET thread-pool tech? You should be able to create a dedicated, separate thread pool with the same lock-less semantics, per-thread queues, etc. Wouldn't that work better?
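In the meantime, what we're considering is a dedicated `SocketManager` instance per multiplexer (a hedged sketch; the name-only constructor is from SRE 2.x, so please verify against the version you run):

```csharp
using StackExchange.Redis;

// A dedicated SocketManager gets its own worker pool, so Redis reads don't
// queue behind our application's thread-pool backlog.
var dedicatedManager = new SocketManager("redis-io");

var options = ConfigurationOptions.Parse("localhost:6379");
options.SocketManager = dedicatedManager;

var muxer = await ConnectionMultiplexer.ConnectAsync(options);

// Caveat: as far as we can tell this still runs on the library's
// DedicatedThreadPoolPipeScheduler, i.e. the lock-y scheduler mentioned
// above - it isolates the work, but doesn't give us the .NET thread pool's
// lock-free, per-thread-queue semantics.
```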
Any opinion / suggestion heavily appreciated :)