Skip to content

Fix VS allocation issue: NuGet restore content-file glob matching#7405

Open
nareshjo wants to merge 4 commits into
NuGet:devfrom
nareshjo:nareshjo-batchContentFileGlobMatching
Open

Fix VS allocation issue: NuGet restore content-file glob matching#7405
nareshjo wants to merge 4 commits into
NuGet:devfrom
nareshjo:nareshjo-batchContentFileGlobMatching

Conversation

@nareshjo
Copy link
Copy Markdown
Contributor

🤖 AI-generated PR by VS Perf Rel Agent — self-reviewed for correctness; please review with extra care.

Bug

Fixes: 2576115

Description

This PR addresses several of the top most allocation issues in recent VS releases, with sampled allocation rates for the highest at 546 MB/s (p50) and 1,072 MB/s (p90). All originate from the same code path in NuGet restore content-file matching and are resolved by a single localized fix.

Problem

ContentFileUtils.GetContentFileGroup calls Matcher.Match() N×M times (once per nuspec <contentFiles> pattern × content file), each internally creating a new InMemoryDirectoryInfo and MatcherContext. The MatcherContext allocates lists, arrays, hash sets, and strings for construction and directory traversal — all immediately discarded.

Allocation site tree derived from VS telemetry, filtered to types present in the current codebase (all addressed by this fix):

ContentFileUtils.GetContentFileGroup                  ← fix target
 ├─ [per-call allocations]                             (N×M → M, lifted)
 │   └─ String (MatcherRoot + relativePath)
 │
 └─ Matcher.Match → Matcher.Execute
     ├─ InMemoryDirectoryInfo                           (N×M → 1)
     ├─ MatcherContext..ctor                            (N×M → N)
     │   ├─ HashSet<String>
     │   ├─ IPatternContext[]
     │   ├─ List<FilePatternMatch>
     │   ├─ WhereSelectListIterator<IPattern, IPatternContext>
     │   └─ HashSet<LiteralPathSegment>
     └─ MatcherContext.Match                            (N×M → N)
         ├─ List<FileSystemInfoBase>
         ├─ List<DirectoryInfoBase>
         ├─ FileSystemInfoBase[]
         ├─ DirectoryInfoBase[]
         ├─ String (Concat)
         └─ Action<IPathSegment, Boolean>

Fix

Replace per-file Matcher.Match() with a single shared InMemoryDirectoryInfo containing all M files. Matcher.Execute is called N times (once per nuspec entry) instead of N×M. Matched paths are mapped back via dictionary lookup.

Note: InMemoryDirectoryInfo in FileSystemGlobbing 6.0.0 resolves file paths via Path.GetFullPath against CWD but stores rootDir as-is. A synthetic absolute root ("/_") ensures both sides are absolute. The root value doesn't affect matching results — FilePatternMatch.Path is built from relative path segments, independent of the root.

Impact of fix

Summary: At typical package sizes (N=1, M=50), allocations drop 79% per package with 4.8× fewer Gen0 collections. At upper end (N=5, M=200), 7.2 MB fewer allocations per package — across a solution restore touching 50 such packages, this eliminates ~360 MB of short-lived allocations and reduces Gen0 collections by .

Measured by benchmarking GetContentFileGroup in isolation with synthetic packages containing N <contentFiles> patterns (all **/*) and M content files across 10 subdirectories. BenchmarkDotNet [MemoryDiagnoser], .NET Framework 4.7.2, out-of-process:

N M Allocated Before Allocated After Reduction Gen0 Before Gen0 After Gen0 Reduction
1 10 86 KB 28 KB 67% 13.9 4.6 3.0×
1 50 424 KB 89 KB 79% 68.8 14.4 4.8×
1 200 1,701 KB 329 KB 81% 275.4 52.7 5.2×
2 10 165 KB 47 KB 72% 26.9 7.6 3.5×
2 50 819 KB 131 KB 84% 132.8 21.2 6.3×
2 200 3,283 KB 465 KB 86% 531.3 75.2 7.1×
5 10 405 KB 103 KB 75% 65.9 16.6 4.0×
5 50 2,010 KB 263 KB 87% 324.2 42.5 7.6×
5 200 8,046 KB 891 KB 89% 1,296.9 144.5 9.0×

N = nuspec <contentFiles> patterns, M = content files per package. Gen0 = GC Generation 0 collections per 1000 calls.

Safety

  • Same Matcher API, same OrdinalIgnoreCase semantics, same nuspec entry ordering (bottom-has-priority invariant preserved)
  • InMemoryDirectoryInfo is immutable — MatcherContext owns its own traversal state, so reuse across Execute calls is safe
  • Single internal static method, one production caller (LockFileUtils.AddContentFiles)
  • Tests: 2 new unit tests (path round-trip through InMemoryDirectoryInfo, multi-matcher reuse correctness) + 26 existing integration tests pass (globs, excludes, rule ordering, .pp transforms, _._ empty folders, TxM/language selection)

See related failures in PRISM

PR Checklist

  • Meaningful title, helpful description and a linked NuGet/Home issue
  • Added tests
  • Link to an issue or pull request to update docs if this PR changes settings, environment variables, new feature, etc.

@nareshjo nareshjo requested a review from a team as a code owner May 21, 2026 21:09
@nareshjo nareshjo requested review from jeffkl and martinrrm May 21, 2026 21:09
@dotnet-policy-service dotnet-policy-service Bot added the Community PRs created by someone not in the NuGet team label May 21, 2026
Comment thread src/NuGet.Core/NuGet.Commands/RestoreCommand/ContentFiles/ContentFileUtils.cs Outdated
jeffkl
jeffkl previously approved these changes May 26, 2026
nkolev92
nkolev92 previously approved these changes May 26, 2026
// Matcher.Execute once per nuspec entry instead of once per (entry × file), reducing
// MatcherContext + internal list/array allocations from O(N×M) to O(N).
var relativePathToEntries = new Dictionary<string, List<ContentFilesEntry>>(entryMappings.Count, StringComparer.OrdinalIgnoreCase);
foreach ((var file, var entries) in entryMappings)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

using type specific declarations here imprves redability.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

switched to explicit types for that - the tuple deconstruction seems to be the main case where types weren't obvious from the rest of the line. Kept var for the others since the types are visible from new/method names there, which is consistent with rest of the file and codebase

@jeffkl jeffkl self-assigned this May 26, 2026
@nareshjo nareshjo dismissed stale reviews from nkolev92 and jeffkl via ef5f8bf May 28, 2026 17:50
@jeffkl jeffkl enabled auto-merge (squash) June 1, 2026 21:15
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Community PRs created by someone not in the NuGet team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants