Contributor

Copilot AI commented Jan 9, 2026

Description

File system enumeration was re-checking pattern types on every entry to determine if fast paths apply (*literal → EndsWith). This moves pattern analysis to enumerable creation time and creates specialized delegates upfront.

Changes:

  • Add GetPredicate<T> that analyzes patterns once and returns optimized FileSystemEnumerable<T>.FindPredicate delegates:
    • * → always-true predicate (with IsDirectory/IsFile check)
    • literal → Equals for exact filename matching (e.g., "log.txt" across directory hierarchies)
    • *literal → EndsWith (existing optimization, moved earlier)
    • literal* → StartsWith
    • *literal* → Contains
    • prefix*suffix → StartsWith + EndsWith with minimum length check
  • Complex patterns fall back to full NFA-based matching
  • Extract wildcard constants as SearchValues<char> (s_simpleWildcards, s_extendedWildcards) to internal static readonly fields in FileSystemName for shared use
  • Combine IsDirectory/IsFile check with pattern matcher into a single delegate invocation (avoids double delegate call)
  • Capture only expression in lambdas and compute span slices inline on each invocation to avoid string allocations and minimize capture overhead
  • Switch over (useExtendedWildcards, entryType) tuple to return distinct delegates without capturing unnecessary variables
  • Restructure pattern matching logic to use a switch statement on the (startsWithStar, endsWithStar) tuple for cleaner code organization
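
The list above can be illustrated with a minimal sketch. This is not the code from the PR: the names NameMatcher, PatternPredicates, Create, and IsLiteral are hypothetical, the IsDirectory/IsFile check is left out, and the set of characters treated as non-literal ('*', '?', '\') is an assumption. The only real library call used for the fallback is the public FileSystemName.MatchesSimpleExpression.

```csharp
using System;
using System.IO.Enumeration;

// Hypothetical delegate type for this sketch (the PR uses FileSystemEnumerable<T>.FindPredicate).
public delegate bool NameMatcher(ReadOnlySpan<char> fileName);

public static class PatternPredicates
{
    // Assumption: these three characters are the only ones that make a simple pattern non-literal.
    private static bool IsLiteral(ReadOnlySpan<char> s) => s.IndexOfAny('*', '?', '\\') < 0;

    // Analyze the pattern once, then return a delegate specialized for its shape.
    public static NameMatcher Create(string expression, bool ignoreCase = false)
    {
        StringComparison comparison = ignoreCase ? StringComparison.OrdinalIgnoreCase : StringComparison.Ordinal;

        if (expression == "*")
            return static _ => true; // matches every name

        bool startsWithStar = expression.StartsWith('*');
        bool endsWithStar = expression.EndsWith('*');

        switch (startsWithStar, endsWithStar)
        {
            case (false, false):
                if (IsLiteral(expression))
                    return name => MemoryExtensions.Equals(name, expression, comparison); // literal
                int star = expression.IndexOf('*');
                if (star > 0 && IsLiteral(expression.AsSpan(0, star)) && IsLiteral(expression.AsSpan(star + 1)))
                {
                    // prefix*suffix: the length check stops prefix and suffix from overlapping.
                    int suffixLength = expression.Length - star - 1;
                    return name =>
                        name.Length >= star + suffixLength &&
                        name.StartsWith(expression.AsSpan(0, star), comparison) &&
                        name.EndsWith(expression.AsSpan(star + 1), comparison);
                }
                break;

            case (true, false) when IsLiteral(expression.AsSpan(1)): // *literal
                return name => name.EndsWith(expression.AsSpan(1), comparison);

            case (false, true) when IsLiteral(expression.AsSpan(0, expression.Length - 1)): // literal*
                return name => name.StartsWith(expression.AsSpan(0, expression.Length - 1), comparison);

            case (true, true) when IsLiteral(expression.AsSpan(1, expression.Length - 2)): // *literal*
                return name => name.Contains(expression.AsSpan(1, expression.Length - 2), comparison);
        }

        // Anything else falls back to the general matcher.
        return name => FileSystemName.MatchesSimpleExpression(expression, name, ignoreCase);
    }
}
```

In this sketch, `PatternPredicates.Create("ab*ba")` rejects the name "aba": it starts with "ab" and ends with "ba", but it is shorter than the prefix and suffix combined, which is what the minimum length check guards against. The lambdas capture the pattern string and slice it on each call, so no substrings are allocated.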

Customer Impact

Performance improvement for file enumeration with common glob patterns. Each file entry now calls a simple string operation instead of re-evaluating pattern structure and potentially running the full matching algorithm.

Regression

No. This is a performance optimization.

Testing

  • All existing FileSystem tests pass
  • Added unit test cases in FileSystemNameTests.cs for literal patterns, literal* (StartsWith), *literal* (Contains), and prefix*suffix patterns
  • Added end-to-end integration tests in PatternTransformTests.cs that reference existing SimpleMatchData and Win32MatchData theory data from FileSystemNameTests, creating actual files and validating enumeration results for each pattern type through the full enumeration path including the optimized delegates

Risk

Low. The optimization preserves existing behavior by falling back to full pattern matching for any pattern not matching the fast paths. All existing tests pass unchanged.

Original prompt

When doing file system enumeration, every entry returned by the OS is then checked to see whether it matches the caller-provided pattern. That's handled by a delegate that eventually calls into a MatchPattern routine. That routine has special cases, e.g., if the pattern is *literal it'll just do EndsWith. But that means that on every entry it's rechecking whether any of those fast paths apply. Instead, the entry points that create those delegates could create different delegates for each of the most important special cases, such that each call doesn't need to recheck the condition. It'll also be good to add additional special cases. For example, right now it special-cases *literal to use EndsWith, but not literal* to use StartsWith, and not *literal* to use Contains. Please optimize all of these things.



@dotnet-policy-service
Contributor

Tagging subscribers to this area: @dotnet/area-system-io
See info in area-owners.md if you want to be subscribed.

Copilot AI and others added 2 commits January 9, 2026 20:06
…legates

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Copilot AI changed the title [WIP] Optimize file system enumeration pattern matching Optimize file system enumeration pattern matching with specialized delegates Jan 9, 2026
Copilot AI requested a review from stephentoub January 9, 2026 20:13
…, combine IsDirectory check with matcher

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
…in delegates

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
…ds, entryType) tuple

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
@stephentoub
Member

@EgorBot -amd -intel -arm

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);

[HideColumns("Job", "Error", "StdDev", "Median", "RatioSD")]
[MemoryDiagnoser(false)]
public partial class Tests
{
    private string _directoryPath;

    [GlobalSetup]
    public void Setup()
    {
        _directoryPath = Path.GetTempFileName();
        File.Delete(_directoryPath);

        Directory.CreateDirectory(_directoryPath);
        for (int i = 0; i < 10_000; i++)
        {
            File.Create(Path.Combine(_directoryPath, $"file{i}.txt")).Dispose();
        }
    }

    [Benchmark]
    public int Sum1()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath)) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int Sum2()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "file*")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int Sum3()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "*.txt")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int Sum4()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "*12*")) sum += path.Length;
        return sum;
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        Directory.Delete(_directoryPath, true);
    }
}

@stephentoub stephentoub marked this pull request as ready for review January 11, 2026 22:38
Copilot AI review requested due to automatic review settings January 11, 2026 22:38
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Contributor Author

Copilot AI commented Jan 12, 2026

@copilot, let's add the literal case where there are no wildcards, only literals
That can show in cases where eg a particular file (eg "log.txt") might show up in various levels of a directory hierar...

Added in 539b9a2. Literal patterns (no wildcards, no escape characters) now use a simple Equals call. This is useful for finding specific files like "log.txt" at various levels in a directory hierarchy.
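
As a rough illustration of the scenario described here, the following sketch builds a small temporary directory tree and searches it recursively with a wildcard-free pattern. The class and method names (LiteralPatternDemo, CountLogFiles) and the directory layout are hypothetical; only public System.IO APIs are used, and the sketch shows the caller-side behavior rather than the internal Equals fast path.

```csharp
using System;
using System.IO;
using System.Linq;

public static class LiteralPatternDemo
{
    // Creates a throwaway tree containing "log.txt" at three levels plus two decoys,
    // then counts the entries matched by the literal pattern "log.txt".
    public static int CountLogFiles()
    {
        string root = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName());
        string nested = Path.Combine(root, "a", "b");
        Directory.CreateDirectory(nested); // also creates root and root/a

        try
        {
            File.WriteAllText(Path.Combine(root, "log.txt"), "");
            File.WriteAllText(Path.Combine(root, "a", "log.txt"), "");
            File.WriteAllText(Path.Combine(nested, "log.txt"), "");

            // Decoys that a literal pattern should not match.
            File.WriteAllText(Path.Combine(root, "log.txt.bak"), "");
            File.WriteAllText(Path.Combine(nested, "other.txt"), "");

            return Directory
                .EnumerateFiles(root, "log.txt", SearchOption.AllDirectories)
                .Count();
        }
        finally
        {
            Directory.Delete(root, recursive: true);
        }
    }
}
```

Every entry in the tree is still visited, so the per-entry comparison is the part a literal fast path would make cheaper.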

@stephentoub
Member

@EgorBot -amd -intel -arm

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);

[HideColumns("Job", "Error", "StdDev", "Median", "RatioSD")]
[MemoryDiagnoser(false)]
public partial class Tests
{
    private string _directoryPath;

    [GlobalSetup]
    public void Setup()
    {
        _directoryPath = Path.GetTempFileName();
        File.Delete(_directoryPath);

        Directory.CreateDirectory(_directoryPath);
        for (int i = 0; i < 10_000; i++)
        {
            File.Create(Path.Combine(_directoryPath, $"file{i}.txt")).Dispose();
        }
    }

    [Benchmark]
    public int NoPattern()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath)) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int LiteralStar()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "file*")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int StarLiteral()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "*.txt")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int StarLiteralStar()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "*12*")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int LiteralStarLiteral()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "12*.txt")) sum += path.Length;
        return sum;
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        Directory.Delete(_directoryPath, true);
    }
}

@stephentoub
Member

@copilot

Copilot AI and others added 2 commits January 23, 2026 21:20
… default

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
…elds to s_camelCase

Co-authored-by: stephentoub <2642209+stephentoub@users.noreply.github.com>
@stephentoub
Member

@EgorBot -amd -intel -arm

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkSwitcher.FromAssembly(typeof(Tests).Assembly).Run(args);

[HideColumns("Job", "Error", "StdDev", "Median", "RatioSD")]
[MemoryDiagnoser(false)]
public partial class Tests
{
    private string _directoryPath;

    [GlobalSetup]
    public void Setup()
    {
        _directoryPath = Path.GetTempFileName();
        File.Delete(_directoryPath);

        Directory.CreateDirectory(_directoryPath);
        for (int i = 0; i < 10_000; i++)
        {
            File.Create(Path.Combine(_directoryPath, $"file{i}.txt")).Dispose();
        }
    }

    [Benchmark]
    public int NoPattern()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath)) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int Literal()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "file123.txt")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int LiteralStar()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "file*")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int StarLiteral()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "*.txt")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int StarLiteralStar()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "*12*")) sum += path.Length;
        return sum;
    }

    [Benchmark]
    public int LiteralStarLiteral()
    {
        int sum = 0;
        foreach (string path in Directory.EnumerateFiles(_directoryPath, "12*.txt")) sum += path.Length;
        return sum;
    }

    [GlobalCleanup]
    public void Cleanup()
    {
        Directory.Delete(_directoryPath, true);
    }
}
