enhance: Performance improvements from analyze deopts #3703
Conversation
🦋 Changeset detected. Latest commit: a1e9f29. The changes in this PR will be included in the next version bump. This PR includes changesets to release 14 packages.
Codecov Report

✅ All modified and coverable lines are covered by tests.

@@ Coverage Diff @@
## master #3703 +/- ##
==========================================
+ Coverage 98.12% 98.13% +0.01%
==========================================
Files 150 150
Lines 2715 2735 +20
Branches 536 537 +1
==========================================
+ Hits 2664 2684 +20
Misses 11 11
Partials 40 40
Benchmark
Details
| Benchmark suite | Current: a1e9f29 | Previous: d3ec122 | Ratio |
|---|---|---|---|
| normalizeLong | 439 ops/sec (±1.20%) | 457 ops/sec (±0.72%) | 1.04 |
| normalizeLong Values | 407 ops/sec (±0.33%) | 29.31 ops/sec (±1.37%) | 0.07 |
| denormalizeLong | 291 ops/sec (±2.68%) | 292 ops/sec (±3.06%) | 1.00 |
| denormalizeLong Values | 267 ops/sec (±1.95%) | 26.95 ops/sec (±1.55%) | 0.10 |
| denormalizeLong donotcache | 1045 ops/sec (±1.40%) | 1029 ops/sec (±0.14%) | 0.98 |
| denormalizeLong Values donotcache | 771 ops/sec (±0.16%) | 31.22 ops/sec (±0.52%) | 0.04 |
| denormalizeShort donotcache 500x | 1582 ops/sec (±0.14%) | 1604 ops/sec (±0.09%) | 1.01 |
| denormalizeShort 500x | 859 ops/sec (±2.06%) | 864 ops/sec (±2.22%) | 1.01 |
| denormalizeShort 500x withCache | 6689 ops/sec (±0.28%) | 6437 ops/sec (±0.17%) | 0.96 |
| queryShort 500x withCache | 2654 ops/sec (±0.12%) | 2765 ops/sec (±0.29%) | 1.04 |
| buildQueryKey All | 54854 ops/sec (±0.59%) | 54735 ops/sec (±0.23%) | 1.00 |
| query All withCache | 6730 ops/sec (±0.87%) | 7115 ops/sec (±0.17%) | 1.06 |
| denormalizeLong with mixin Entity | 284 ops/sec (±2.29%) | 284 ops/sec (±2.33%) | 1.00 |
| denormalizeLong withCache | 8130 ops/sec (±0.25%) | 7763 ops/sec (±0.18%) | 0.95 |
| denormalizeLong Values withCache | 5067 ops/sec (±0.09%) | 5143 ops/sec (±0.15%) | 1.01 |
| denormalizeLong All withCache | 6544 ops/sec (±0.12%) | 7080 ops/sec (±0.12%) | 1.08 |
| denormalizeLong Query-sorted withCache | 6725 ops/sec (±1.11%) | 7074 ops/sec (±0.15%) | 1.05 |
| denormalizeLongAndShort withEntityCacheOnly | 1800 ops/sec (±0.16%) | 1794 ops/sec (±0.66%) | 1.00 |
| getResponse | 4773 ops/sec (±0.71%) | 4595 ops/sec (±0.80%) | 0.96 |
| getResponse (null) | 10370152 ops/sec (±1.22%) | 6436338 ops/sec (±0.69%) | 0.62 |
| getResponse (clear cache) | 269 ops/sec (±2.32%) | 273 ops/sec (±2.06%) | 1.01 |
| getSmallResponse | 3427 ops/sec (±0.08%) | 3183 ops/sec (±0.30%) | 0.93 |
| getSmallInferredResponse | 2473 ops/sec (±1.13%) | 2363 ops/sec (±0.09%) | 0.96 |
| getResponse Collection | 4606 ops/sec (±0.26%) | 4661 ops/sec (±0.51%) | 1.01 |
| get Collection | 4626 ops/sec (±0.16%) | 4596 ops/sec (±0.28%) | 0.99 |
| get Query-sorted | 5308 ops/sec (±0.11%) | 5263 ops/sec (±0.19%) | 0.99 |
| setLong | 447 ops/sec (±0.18%) | 455 ops/sec (±0.18%) | 1.02 |
| setLongWithMerge | 259 ops/sec (±0.23%) | 260 ops/sec (±0.15%) | 1.00 |
| setLongWithSimpleMerge | 275 ops/sec (±0.12%) | 270 ops/sec (±0.51%) | 0.98 |
| setSmallResponse 500x | 945 ops/sec (±0.13%) | 942 ops/sec (±0.09%) | 1.00 |
This comment was automatically generated by a workflow using github-action-benchmark.
Size Change: +88 B (+0.11%). Total Size: 79.9 kB.
Force-pushed from 64bb14b to c582bbb.
Add isolated benchmarks for 6 optimization patterns:

1. forEach vs indexed for loop
2. reduce+spread vs direct mutation
3. array.map vs pre-allocated loop
4. repeated getter vs cached
5. slice+map vs pre-allocated extraction
6. Map double-get vs single-get
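The isolated benchmarks can be sketched with a minimal timing harness. `bench`, the sample workload, and pattern 1's two variants below are illustrative stand-ins, not the repo's actual benchmark suite:

```javascript
// Minimal sketch of an isolated microbenchmark (assumed harness, not the
// repo's suite). Requires Node.js for process.hrtime.bigint().
function bench(name, fn, iterations = 2e4) {
  const start = process.hrtime.bigint();
  for (let i = 0; i < iterations; i++) fn();
  const ns = Number(process.hrtime.bigint() - start);
  return { name, opsPerSec: Math.round(iterations / (ns / 1e9)) };
}

const data = Array.from({ length: 100 }, (_, i) => i);

// Pattern 1a: forEach invokes a callback closure per element
const viaForEach = bench('forEach', () => {
  let sum = 0;
  data.forEach(v => { sum += v; });
  return sum;
});

// Pattern 1b: indexed for loop, no callback invocation or closure
const viaForLoop = bench('for loop', () => {
  let sum = 0;
  for (let i = 0; i < data.length; i++) sum += data[i];
  return sum;
});
```

Comparing `viaForEach.opsPerSec` against `viaForLoop.opsPerSec` in isolation like this is what the six pattern benchmarks do for each candidate optimization.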
Replace Object.keys().forEach() and for...of patterns with indexed for loops in normalize/denormalize functions.

V8 optimization impact:
- Eliminates function call overhead per iteration
- Allows TurboFan to inline the loop body directly
- Avoids closure creation for the callback
- More predictable control flow for branch prediction

Benchmark: ~2% improvement on the forEach pattern in isolation.
Bundlesize: +10-20 bytes (neutral after minification)
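A minimal before/after sketch of this transformation; `processValue` and the input shape are hypothetical stand-ins, not the library's actual normalize internals:

```javascript
// Before: Object.keys().forEach() invokes a closure for every key
function normalizeWithForEach(input, processValue) {
  const out = {};
  Object.keys(input).forEach(key => {
    out[key] = processValue(input[key]);
  });
  return out;
}

// After: indexed for loop over the same keys; no per-key callback call,
// no closure allocation, and a loop body TurboFan can inline
function normalizeWithForLoop(input, processValue) {
  const out = {};
  const keys = Object.keys(input);
  for (let i = 0; i < keys.length; i++) {
    const key = keys[i];
    out[key] = processValue(input[key]);
  }
  return out;
}
```

Both functions produce identical output; only the iteration mechanics change.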
Replace the Object.keys().reduce() pattern, which spreads the accumulator into a new object on each iteration, with direct object mutation using indexed for loops.

V8 optimization impact:
- Spreading creates a new object on every iteration → O(n²) allocations
- Direct mutation is O(n) with no intermediate objects
- Significantly reduces GC pressure
- Avoids megamorphic property access patterns from spread

Benchmark: 8x improvement (912 → 7,468 ops/sec), the highest-impact change.
Bundlesize: -20-40 bytes (no spread operator)
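The O(n²)-vs-O(n) contrast can be sketched as follows (function names are hypothetical):

```javascript
// O(n²): every iteration spreads the whole accumulator into a fresh
// object, so building n keys allocates and copies n intermediate objects
function buildWithReduceSpread(keys, getValue) {
  return keys.reduce((acc, key) => ({ ...acc, [key]: getValue(key) }), {});
}

// O(n): a single object mutated in place; no intermediate allocations,
// so far less garbage for the collector
function buildWithMutation(keys, getValue) {
  const out = {};
  for (let i = 0; i < keys.length; i++) {
    out[keys[i]] = getValue(keys[i]);
  }
  return out;
}
```

The results are identical; the difference is purely in how many short-lived objects each version allocates along the way.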
Cache the this.isSingleSchema getter result in a local variable instead of calling it multiple times within normalizeValue/denormalizeValue.

V8 optimization impact:
- Each getter invocation has function call overhead
- Caching in a local variable allows register allocation
- Eliminates repeated property lookup + getter dispatch
- Particularly impactful when the getter does any computation

Benchmark: 2.7x improvement (1,652,211 → 4,426,994 ops/sec) for repeated getter access patterns.
Bundlesize: +5-15 bytes (const declaration)
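A sketch of the getter-caching pattern; `Schema` and its `getterCalls` instrumentation are illustrative stand-ins for the library's polymorphic schema class, added only to make the dispatch count observable:

```javascript
class Schema {
  constructor(definition) {
    this.schema = definition;
    this.getterCalls = 0; // instrumentation for this example only
  }
  get isSingleSchema() {
    this.getterCalls++; // every read pays getter-dispatch overhead
    return typeof this.schema !== 'object';
  }
  // Before: reads the getter once per value (n dispatches)
  normalizeValuesUncached(values) {
    return values.map(v => (this.isSingleSchema ? v : { v }));
  }
  // After: one read, cached in a local the JIT can keep in a register
  normalizeValuesCached(values) {
    const isSingle = this.isSingleSchema;
    return values.map(v => (isSingle ? v : { v }));
  }
}
```

For an n-element input the cached version performs 1 getter dispatch instead of n, which is where the repeated-access speedup comes from.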
Consolidate the repeated rest.slice().map(ensurePojo) pattern into a single extractStateAndArgs() helper using a pre-allocated indexed loop.

V8 optimization impact:
- slice() creates an intermediate array allocation
- map() creates another array and has callback overhead
- Combined: 2 allocations + n function calls → 1 allocation + inline loop
- Combines the benefits of forEach→forLoop and array pre-allocation

Benchmark: 1.65x improvement (33,221 → 54,701 ops/sec)
Bundlesize: Neutral (code consolidation offsets loop expansion)
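A sketch of the consolidation; `ensurePojo` here is a simplified stand-in for the library's helper, and both extractors skip the first element as the description's slice pattern implies:

```javascript
// Simplified stand-in: wrap primitives, pass objects through
const ensurePojo = x => (x && typeof x === 'object' ? x : { value: x });

// Before: slice() allocates one array, map() allocates another and
// makes n callback calls
function extractArgsOld(rest) {
  return rest.slice(1).map(ensurePojo);
}

// After: one pre-allocated array filled by an inline indexed loop
function extractArgsNew(rest) {
  const args = new Array(rest.length - 1);
  for (let i = 1; i < rest.length; i++) {
    args[i - 1] = ensurePojo(rest[i]);
  }
  return args;
}
```

Pre-allocating with `new Array(n)` and filling by index also lets V8 size the backing store once instead of growing it during iteration.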
Cache the Map.get() result in a local variable instead of calling get() twice (once for the existence check, once for retrieval).

V8 optimization impact:
- Avoids a duplicate hash computation + bucket lookup
- The local variable allows register allocation
- Better branch prediction (single conditional path)
- ~2x fewer Map operations in the cache-miss case

Benchmark: ~1% improvement in isolation (marginal but free).
Bundlesize: -5-10 bytes (fewer get calls)
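A sketch of the single-get pattern (helper names are hypothetical). Note the `undefined` miss sentinel: a cache that legitimately stores `undefined` values would recompute them, so this rewrite assumes the cache never stores `undefined`:

```javascript
// Before: has() and get() each hash the key and probe the bucket
function lookupTwice(cache, key, compute) {
  if (cache.has(key)) return cache.get(key);
  const value = compute(key);
  cache.set(key, value);
  return value;
}

// After: a single get(), with undefined treated as a miss
function lookupOnce(cache, key, compute) {
  let value = cache.get(key);
  if (value === undefined) {
    value = compute(key);
    cache.set(key, value);
  }
  return value;
}
```

On a hit this halves the Map operations (one `get` instead of `has` + `get`); on a miss it trades `has` for `get` at no extra cost.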
Motivation
Faster is better
Solution
Microbenchmark Results (5 Remaining Optimizations)
| # | Pattern | Before (ops/sec) | After (ops/sec) | Change | Status |
|---|---|---|---|---|---|
| 3 | map → prealloc | 1,305,110 | 1,325,298 | +1.5% | REMOVED |

Updated Bundlesize Impact
Improvement from removing prealloc: Saved 34 bytes gzip (was +116, now +82).
Impact Summary by Codepath
Final Optimization Ranking