Add Dictionary and HashSet Remove/Contains benchmarks #5178
danmoseley wants to merge 4 commits into dotnet:main
Conversation
Add focused benchmarks for Dictionary and HashSet operations at sizes 512 and 8192 (past L1/L2 cache) with int, string, and Guid type args:
- Remove/RemoveTrue.cs: steady-state remove hit (remove + re-add)
- Remove/RemoveFalse.cs: remove miss (absent keys)
- Dictionary/DictionaryTryRemove.cs: Remove(key, out value) overload
- Contains/HashSetContains.cs: HashSet Contains hit and miss
- Contains/DictionaryContainsKey.cs: Dictionary ContainsKey hit and miss

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Ideally the existing scenarios would test 8192 too, but that would add expense, as alluded to above; and the code path for Add is largely shared with Contains/Remove, so it will be partly protected by the new benchmarks added here.
Note: This comment was generated with AI (Copilot CLI) assistance.

Local A/B benchmark results

Ran these new benchmarks locally comparing a baseline build (parent of dotnet/runtime#125884) against the PR build; both were full builds.

Remove hit (the target scenario for the PR)

*Assumed noise: PR #125884 only optimizes value-type key paths; string keys should be unaffected.

Remove miss

Dictionary.Remove(key, out value) overload

Contains (control group; the PR doesn't change lookup paths)

Summary
It's a bit of a shame that we don't test collections above a few thousand entries. We surely have users that depend on good performance at much higher counts (lookup at least). Something to think about separately from this PR, which follows existing patterns more or less and covers what we need for the product PRs above.
Probes 512 keys into a 1M-entry dictionary per invocation to measure lookup behavior when the hash table far exceeds CPU cache, while keeping per-call time low enough for stable BDN statistics (~0.5% noise). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds targeted microbenchmarks for Dictionary<TKey, TValue> and HashSet<T> Remove/Contains scenarios (including Guid keys and larger collection sizes) to better detect performance changes/regressions in these hot paths.
Changes:
- Add steady-state Remove hit/miss benchmarks for HashSet<T> and Dictionary<TKey, TValue> at sizes 512 and 8192.
- Add a Dictionary<TKey, TValue>.Remove(key, out value) (TryRemove-style) benchmark.
- Add focused Contains/ContainsKey benchmarks (including a large 1M-entry dictionary case).
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| src/benchmarks/micro/libraries/System.Collections/Remove/RemoveTrue.cs | New steady-state “remove hit” benchmarks for HashSet<T> / Dictionary<TKey,TValue>. |
| src/benchmarks/micro/libraries/System.Collections/Remove/RemoveFalse.cs | New “remove miss” benchmarks for HashSet<T> / Dictionary<TKey,TValue>. |
| src/benchmarks/micro/libraries/System.Collections/Dictionary/DictionaryTryRemove.cs | New benchmark for Dictionary.Remove(key, out value) (hit path currently). |
| src/benchmarks/micro/libraries/System.Collections/Contains/HashSetContains.cs | New focused HashSet<T>.Contains hit/miss benchmarks (includes Guid). |
| src/benchmarks/micro/libraries/System.Collections/Contains/DictionaryContainsKey.cs | New focused Dictionary.ContainsKey hit/miss benchmarks + 1M-entry large dictionary variant. |
src/benchmarks/micro/libraries/System.Collections/Dictionary/DictionaryTryRemove.cs
src/benchmarks/micro/libraries/System.Collections/Remove/RemoveTrue.cs
512 probes fit in L1 cache, masking the large-table effect. 8192 probes (~512 KB working set) push into L2/L3, giving realistic cache-miss behavior with ~1-2% noise. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After a successful Remove, use Add() instead of the indexer to re-add the entry. Add is semantically clearer (key is known absent), fails loudly if the remove didn't work, and avoids the indexer's extra exists-check overhead that can dilute the Remove signal. Matches the HashSet path which already uses Add. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
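The remove + re-add pattern this commit describes can be sketched roughly as follows. This is a minimal illustration of the steady-state "remove hit" idea, not the PR's actual source; the class, field, and method names are assumptions:

```csharp
using System;
using System.Collections.Generic;

public class RemoveTrueSketch
{
    private const int Size = 512;            // smallest size used by the new benchmarks
    private Dictionary<int, int> _dict;
    private int[] _keys;

    // Corresponds to a BenchmarkDotNet [GlobalSetup]: pre-populate once.
    public void Setup()
    {
        _keys = new int[Size];
        _dict = new Dictionary<int, int>(Size);
        for (int i = 0; i < Size; i++)
        {
            _keys[i] = i;
            _dict.Add(i, i);
        }
    }

    // Each invocation removes every key (a guaranteed hit) and re-adds it,
    // so the dictionary returns to the same state for the next iteration.
    // Add, rather than the indexer, throws if the remove somehow failed and
    // skips the indexer's extra exists-check, keeping the signal on Remove.
    public int RemoveHitSteadyState()
    {
        var dict = _dict;
        var keys = _keys;
        for (int i = 0; i < keys.Length; i++)
        {
            dict.Remove(keys[i]);
            dict.Add(keys[i], keys[i]);
        }
        return dict.Count; // returned so the loop can't be optimized away
    }
}
```

In the real benchmarks the same shape would be repeated per key type (int, string, Guid) via BenchmarkDotNet generic type arguments.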
@DrewScoggins CI broken?
Note
This PR description was AI/Copilot-generated.
Summary
Adds microbenchmark coverage for Dictionary and HashSet Remove and Contains operations, filling gaps identified while reviewing dotnet/runtime#125884 (Dictionary.Remove value-type optimization) and dotnet/runtime#125893 (HashSet bounds check elimination).

Existing coverage gaps

The existing cross-collection benchmarks (ContainsTrue, ContainsFalse, ContainsKeyTrue, ContainsKeyFalse) cover hit/miss Contains for int and string at a single size (512). There are no existing Remove benchmarks for Dictionary or HashSet upstream. The AddRemoveSteadyState benchmark measures combined add+remove throughput but doesn't isolate Remove hit/miss paths.

This means there was no way to measure the impact of the runtime PRs above on Remove code paths, and limited ability to detect regressions in Contains for larger-than-cache collections or Guid keys (which exercise different hash/equality paths).
Rather than adding sizes/types to the existing cross-collection benchmarks (which cover many collection types and would multiply scenario count significantly), the new Contains benchmarks are focused on just Dictionary and HashSet with the additional sizes and Guid key type.
New benchmarks
Remove (cross-collection pattern matching existing Add/Contains structure):
RemoveTrueΓÇö steady-state remove hit (remove + re-add) for HashSet and DictionaryRemoveFalseΓÇö remove miss (absent keys) for HashSet and DictionaryDictionary-specific:
DictionaryTryRemoveΓÇöRemove(key, out value)overload, hit and miss pathsContains (supplements existing cross-collection Contains benchmarks):
HashSetContainsΓÇö Contains hit and miss with Guid keysDictionaryContainsKeyΓÇö ContainsKey hit and miss with Guid keysCoverage
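The Remove(key, out value) overload measured by the DictionaryTryRemove benchmarks can be exercised like this. A hedged sketch of the hit and miss paths only, with illustrative names, not the benchmark source:

```csharp
using System;
using System.Collections.Generic;

public static class TryRemoveSketch
{
    // Hit path: the key is present, so Remove(key, out value) returns true,
    // hands back the removed value, and the entry is re-added to keep the
    // dictionary in a steady state across invocations.
    public static int RemoveHit(Dictionary<int, int> dict, int key)
    {
        dict.Remove(key, out int value);
        dict.Add(key, value);
        return value;
    }

    // Miss path: the key is absent, so the overload returns false and the
    // out parameter is left at its default; no re-add is needed.
    public static bool RemoveMiss(Dictionary<int, int> dict, int absentKey)
    {
        return dict.Remove(absentKey, out _);
    }
}
```

Both paths do one hash lookup; the hit path additionally pays for unlinking the entry and the compensating Add.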
Large dictionary benchmark (DictionaryContainsKeyLarge)

Also adds a DictionaryContainsKeyLarge benchmark that measures ContainsKey on a 1M-entry dictionary (~20 MB, far exceeding L1/L2 cache). It probes 8192 keys per invocation into the 1M-entry table for realistic cache-miss pressure, while keeping per-call time low enough for stable BDN statistics (~1-2% noise). Naively looping all 1M keys per call yielded ~3.4% StdDev due to accumulated DRAM latency variance. Int keys only, since the goal is to isolate cache-miss behavior rather than hash function cost.