Average microbenchmarks results #5229
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR refactors the microbenchmark analysis/presentation pipeline, adds configurable iteration support, and fixes a few correctness issues in suite execution and output formatting.
Changes:
- Rename/standardize “Microbenchmarks” suite key + folder naming and add an iterations configuration surface.
- Replace the removed microbenchmark analyzer/presenter with a new comparison pipeline that loads BDN JSON, (optionally) correlates traces, removes outliers, and emits Markdown/JSON.
- Fix suite base-path validation and improve console output (MarkupLine).
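The "removes outliers" and averaging steps listed above can be illustrated with a minimal sketch. It is not the PR's `Statistics.RemoveOutliers`: the class name, the method signatures, and the outlier rule (Tukey's 1.5 × IQR fences) are all assumptions made for illustration, since the summary does not say which rule the PR uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not the PR's Statistics.RemoveOutliers. The rule used
// here (Tukey's 1.5 * IQR fences) is an assumption for illustration only.
public static class OutlierSketch
{
    // Linearly interpolated percentile over an already sorted list (p in [0, 1]).
    private static double Percentile(IReadOnlyList<double> sorted, double p)
    {
        double rank = p * (sorted.Count - 1);
        int lower = (int)Math.Floor(rank);
        int upper = (int)Math.Ceiling(rank);
        return sorted[lower] + (rank - lower) * (sorted[upper] - sorted[lower]);
    }

    // Drops samples outside [Q1 - 1.5 * IQR, Q3 + 1.5 * IQR].
    public static List<double> RemoveOutliers(IEnumerable<double> samples)
    {
        List<double> sorted = samples.OrderBy(x => x).ToList();
        if (sorted.Count < 4)
        {
            return sorted; // Too few samples to estimate quartiles meaningfully.
        }

        double q1 = Percentile(sorted, 0.25);
        double q3 = Percentile(sorted, 0.75);
        double iqr = q3 - q1;
        double low = q1 - 1.5 * iqr;
        double high = q3 + 1.5 * iqr;
        return sorted.Where(x => x >= low && x <= high).ToList();
    }

    // Mean of the samples that survive outlier removal.
    // Throws InvalidOperationException if the input is empty.
    public static double AverageWithoutOutliers(IEnumerable<double> samples)
        => RemoveOutliers(samples).Average();
}
```

With per-iteration means of 10, 11, 9, 10 and 50, the fences are 8.5 and 12.5, so the 50 sample is dropped and the reported average is 10 rather than 18.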
Reviewed changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 12 comments.
| File | Description |
|---|---|
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/RunSuiteCommand.cs | Fixes suite path validation logic; switches to MarkupLine for errors. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/RunCommand.cs | Updates configuration key for microbenchmarks suite path. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/CreateSuiteCommand.cs | Writes new key/folder name; plumbs iterations into microbenchmark env config. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/BaseSuite/MicrobenchmarksToRun.txt | Updates base suite benchmark list. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/RunCommand/BaseSuite/Microbenchmarks.yaml | Renames iteration to iterations. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/Microbenchmark/MicrobenchmarkCommand.cs | Switches to new JSON model + analysis/presentation path; uses iterations. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure/Commands/Microbenchmark/MicrobenchmarkAnalyzeCommand.cs | Adds reusable ExecuteAnalysis + Present helpers. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Presentation.cs | Removes old presentation entry point. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Markdown.cs | Updates reporting for grouped comparisons and additional metrics; changes table generation. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Json/Json.cs | Moves JSON generator into main presentation namespace + serializes grouped results. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/Microbenchmarks/Json/JsonOutput.cs | Removes unused placeholder type. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Presentation/MarkdownReportBuilder.cs | Deduplicates repro steps by distinct command line. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Configurations/Microbenchmarks.Configuration.cs | Introduces iterations + YAML alias for legacy iteration. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Configurations/InputConfiguration.cs | Adds iterations input dictionary. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResultsAnalyzer.cs | Removes old analysis implementation. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResults.cs | Replaces old BDN JSON model with BdnJsonResult and non-nullable stat fields. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResultComparison.cs | Adds new load/analyze/compare pipeline including JSON↔trace mapping. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkResult.cs | New result type aggregating metrics/statistics (and optional trace metrics). |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkComparisonResults.cs | Adjusts regression bucketing boundaries (>=, <=) and ordering. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkComparisonResult.cs | Major refactor for multi-sample comparisons + outlier removal + other-metric deltas. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetrics.cs | Adds trace-derived metric extraction container. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetricComparisonResult.cs | Adds comparison result for trace metrics with outlier removal. |
| src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/GCTraceMetricComparison.cs | Adds helper for building trace metric comparisons. |
| src/benchmarks/gc/GC.Infrastructure/GC.Analysis.API/Statistics.cs | Adds RemoveOutliers helper. |
| src/benchmarks/gc/GC.Infrastructure/Configurations/Run.yaml | Adds iterations configuration section. |
sw.WriteLine($"| Large Improvements (>20%) | {GoodLinq.Sum(comparisonResults, s => s.LargeImprovements.Count())}|");
sw.WriteLine($"| Total | {comparisonResults.Count} |");
sw.WriteLine($"| ----- | {string.Join("|", Enumerable.Repeat(" ----- ", comparisonResultsCollection.Count))} |");
sw.WriteLine($"| Large Regressions (>20%) | {API.GoodLinq.Sum(comparisonResultsCollection, s => s.LargeRegressions.Count())}|");
The ranges in the column titles don't seem to match the ones in src/benchmarks/gc/GC.Infrastructure/GC.Infrastructure.Core/Analysis/Microbenchmarks/MicrobenchmarkComparisonResults.cs. For example, the boundary there changed from >20% to >=20%, so the column names here should be updated to match.
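One way to keep the labels and the bucketing from drifting apart is to build both from the same constant. The sketch below is only an illustration of that idea: every name in it is hypothetical, and it assumes a positive percentage delta means a regression. Only the 20% boundary and the inclusive `>=` comparison come from the discussion above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical names throughout. Only the 20% boundary and the inclusive
// (>=) comparison are taken from the review discussion.
public static class BucketSketch
{
    public const double LargeThresholdPercent = 20.0;

    // Inclusive boundary. Assumes a positive delta means a regression.
    public static bool IsLargeRegression(double percentDelta)
        => percentDelta >= LargeThresholdPercent;

    // The label is derived from the same constant as the predicate,
    // so changing the threshold updates both.
    public static string LargeRegressionsLabel
        => $"Large Regressions (>={LargeThresholdPercent:0}%)";

    // One Markdown summary row: label plus the number of deltas in the bucket.
    public static string SummaryRow(IEnumerable<double> percentDeltas)
        => $"| {LargeRegressionsLabel} | {percentDeltas.Count(IsLargeRegression)} |";
}
```

With this shape, changing the operator from `>` to `>=` still requires touching the label text, but the numeric boundary can no longer disagree between the analysis and the report.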
DOTNET_GCName: clrgc.dll

iterations:
  gcperfsim: 1
The gcperfsim iteration count doesn't do anything.
This PR aims to calculate the average of multiple microbenchmark results. The work revolves around: