JIT: Create suspensions/resumptions separately from processing calls in async transformation#125497
JIT: Create suspensions/resumptions separately from processing calls in async transformation#125497jakobbotsch wants to merge 9 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
This PR refactors CoreCLR JIT async transformation so it first collects continuation requirements (live locals, optional fields, return shapes) and then materializes suspension/resumption blocks and continuation layouts afterwards, enabling future sharing/reuse of continuation layouts across multiple suspension points.
Changes:
- Add a
constoverload forjitstd::vector::data()to support const access patterns introduced by the refactor. - Introduce
ContinuationLayoutBuilder,AsyncState, and new return-shape tracking (ReturnTypeInfo/ReturnInfo) to decouple “what’s needed” from “how it’s laid out/encoded”. - Delay creation of suspension/resumption IR until all states are recorded (
CreateResumptionsAndSuspensions), and update layout/materialization logic to use per-state “sub-layout” membership checks.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
src/coreclr/jit/jitstd/vector.h |
Adds const T* data() const needed for new const usage patterns (e.g., binary search over const vectors). |
src/coreclr/jit/async.h |
Introduces builder/state/layout refactor types (ContinuationLayoutBuilder, AsyncState, return info structs) and updates AsyncTransformation plumbing accordingly. |
src/coreclr/jit/async.cpp |
Implements the new builder-based workflow, delayed IR creation, return-slot handling, and layout finalization (including GC bitmap/type allocation). |
src/coreclr/jit/async.cpp
Outdated
| printf("%*s +%03u Keep alive object\n", indent, "", KeepAliveOffset); | ||
| } | ||
|
|
||
| if (ExecutionContextOffset) |
There was a problem hiding this comment.
In ContinuationLayout::Dump the check if (ExecutionContextOffset) is inconsistent with the other offset fields (which use != UINT_MAX). As written, it will not print the ExecutionContext field when the offset is 0, and it will print a bogus offset if the sentinel is UINT_MAX. Consider changing this to an explicit sentinel check (e.g., ExecutionContextOffset != UINT_MAX).
| if (ExecutionContextOffset) | |
| if (ExecutionContextOffset != UINT_MAX) |
src/coreclr/jit/async.cpp
Outdated
| call->gtReturnType == TYP_STRUCT ? layout.ReturnStructLayout->GetClassName() | ||
| : varTypeName(call->gtReturnType), | ||
| layout.ReturnSize, layout.ReturnValOffset); | ||
| ret.Offset = allocLayout(ret.Alignment, ret.Size); |
There was a problem hiding this comment.
ContinuationLayoutBuilder::Create now allocates each return slot using allocLayout(ret.Alignment, ret.Size). Previously the return slot used heap/pointer alignment (min(returnAlignment, TARGET_POINTER_SIZE)), which matches the general continuation-field rule that the leading fields are pointer-sized and avoids assuming > pointer alignment inside an object. If getClassAlignmentRequirement returns 16 (e.g., some SIMD/structs), this will change subsequent offsets and can break any code that computes the result slot assuming pointer alignment. Consider using ret.HeapAlignment() for the allocation alignment (as is done for locals via inf.HeapAlignment()).
| ret.Offset = allocLayout(ret.Alignment, ret.Size); | |
| ret.Offset = allocLayout(ret.HeapAlignment(), ret.Size); |
src/coreclr/jit/async.h
Outdated
| ContinuationLayout LayOutContinuation(const ContinuationLayoutBuilder* layoutBuilder); | ||
|
|
There was a problem hiding this comment.
AsyncTransformation::LayOutContinuation(const ContinuationLayoutBuilder*) is still declared but appears to be unused and no longer has a definition (layout creation moved to ContinuationLayoutBuilder::Create). This stale declaration is confusing and can turn into a link error if referenced later; consider removing it or updating it to the new API shape.
| ContinuationLayout LayOutContinuation(const ContinuationLayoutBuilder* layoutBuilder); |
| bool ContinuationLayoutBuilder::ContainsLocal(unsigned lclNum) const | ||
| { | ||
| return BinarySearch(m_locals.data(), (int)m_locals.size(), lclNum) != nullptr; | ||
| } |
There was a problem hiding this comment.
ContinuationLayoutBuilder::ContainsLocal casts m_locals.size() from size_t to int when calling BinarySearch. If this ever exceeds INT_MAX, the count will truncate and the search becomes incorrect. Consider adding an assert that the size fits in int, or switching to a binary search helper that takes a size_t/unsigned count.
| BasicBlock* AsyncTransformation::CreateSuspension( | ||
| BasicBlock* block, GenTreeCall* call, unsigned stateNum, AsyncLiveness& life, const ContinuationLayout& layout) | ||
| BasicBlock* AsyncTransformation::CreateSuspensionBlock(BasicBlock* block, GenTreeCall* call, unsigned stateNum) | ||
| { |
There was a problem hiding this comment.
CreateSuspensionBlock (and similarly CreateResumptionBlock) takes parameters like call/layoutBuilder that are not used in the implementation. If warnings for unused parameters are enabled in some build configurations, this can produce build breaks; consider removing the unused parameters or explicitly marking them unused.
| { | |
| { | |
| (void)call; |
| GenTreeCall* call, | ||
| unsigned stateNum, | ||
| ContinuationLayoutBuilder* layoutBuilder) | ||
| { |
There was a problem hiding this comment.
CreateResumptionBlock currently doesn't use some of its parameters (e.g., call and layoutBuilder). If unused-parameter warnings are enabled, this can cause build breaks; consider removing these parameters or explicitly marking them unused.
| { | |
| { | |
| (void)call; | |
| (void)layoutBuilder; |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
src/coreclr/jit/async.cpp:1
ContinuationLayoutBuilder::Create()declares a localClassLayout* layout;inside the locals loop, which shadows the outerContinuationLayout* layoutvariable. This is very error-prone and (depending on the exact code as compiled) can lead to passing the wronglayoutpointer type/value intobitmapBuilder.SetType(...), producing an incorrect GC bitmap (or even a compile-time type mismatch if overloads differ). Rename one of these variables (e.g.,ContinuationLayout* contLayoutfor the outer one andClassLayout* structLayoutfor the per-local layout), and ensureSetTypereceives the per-localClassLayout*for structs (andnullptrotherwise).
// Licensed to the .NET Foundation under one or more agreements.
src/coreclr/jit/async.cpp:1
FillInDataOnSuspensiongates OSR IL offset emission on the compiler OSR/patchpoint checks instead of the already-computedsubLayout.NeedsOSRILOffset()(used elsewhere for flags). UsingsubLayout.NeedsOSRILOffset()here would keep the logic consistent with the builder’s decisions and reduce the chance of future divergence as layout-sharing strategies evolve.
// Licensed to the .NET Foundation under one or more agreements.
| struct ReturnInfo | ||
| { | ||
| ReturnTypeInfo Type; | ||
| unsigned Alignment; | ||
| unsigned Offset; | ||
| unsigned Size; | ||
|
|
||
| ReturnInfo(ReturnTypeInfo type) | ||
| : Type(type) | ||
| { | ||
| } | ||
|
|
||
| unsigned HeapAlignment() const | ||
| { | ||
| return std::min(Alignment, (unsigned)TARGET_POINTER_SIZE); | ||
| } | ||
| }; |
There was a problem hiding this comment.
ReturnInfo leaves Alignment, Offset, and Size uninitialized until ContinuationLayoutBuilder::Create() fills them in. This is easy to misuse later (e.g., calling HeapAlignment() before initialization becomes undefined behavior). Consider default-initializing these members (e.g., Alignment = 0, Offset = UINT_MAX, Size = 0) to make the type safer and self-documenting.
| for (const ReturnTypeInfo& ret : m_returns) | ||
| { | ||
| if ((ret.ReturnType == info.ReturnType) && (ret.ReturnLayout == info.ReturnLayout)) | ||
| { | ||
| // This return type is already in the layout, no need to add another slot for it. | ||
| return; | ||
| } | ||
| } | ||
|
|
||
| m_returns.push_back(info); |
There was a problem hiding this comment.
AddReturn does a linear scan for deduplication. If this builder is used to accumulate returns across many suspension points (which the PR description suggests is a goal), this becomes O(n²) over the number of distinct returns. Using a small hash set keyed by (var_types, class handle/layout) (or maintaining m_returns in a canonical sorted order and doing binary search) would keep this scalable while still producing deterministic layouts.
| for (const ReturnTypeInfo& ret : m_returns) | |
| { | |
| if ((ret.ReturnType == info.ReturnType) && (ret.ReturnLayout == info.ReturnLayout)) | |
| { | |
| // This return type is already in the layout, no need to add another slot for it. | |
| return; | |
| } | |
| } | |
| m_returns.push_back(info); | |
| // Maintain m_returns in a canonical sorted order by (ReturnType, ReturnLayout) | |
| // and use binary search to efficiently detect duplicates. | |
| auto compare = [](const ReturnTypeInfo& lhs, const ReturnTypeInfo& rhs) { | |
| if (lhs.ReturnType < rhs.ReturnType) | |
| { | |
| return true; | |
| } | |
| if (lhs.ReturnType > rhs.ReturnType) | |
| { | |
| return false; | |
| } | |
| return lhs.ReturnLayout < rhs.ReturnLayout; | |
| }; | |
| auto it = std::lower_bound(m_returns.begin(), m_returns.end(), info, compare); | |
| if (it != m_returns.end()) | |
| { | |
| // If neither object is less than the other under the comparator, | |
| // they are equivalent with respect to (ReturnType, ReturnLayout). | |
| const ReturnTypeInfo& existing = *it; | |
| if ((existing.ReturnType == info.ReturnType) && (existing.ReturnLayout == info.ReturnLayout)) | |
| { | |
| // This return type is already in the layout, no need to add another slot for it. | |
| return; | |
| } | |
| } | |
| m_returns.insert(it, info); |
|
cc @dotnet/jit-contrib PTAL @AndyAyersMS Minor diffs. Only because some things are happening in different orders and that results in different local allocation order, leading to different stack frame encodings. Otherwise this shouldn't have functional changes. Looks like superpmi-diffs is broken. I can take a look tomorrow. |
This refactors the async transformation to collect all continuation information before generating the suspensions/resumptions themselves. This allows reusing continuation layouts across multiple suspension points and will allow experimenting with strategies where we only allocate one continuation, and then reuse it for all future suspensions.
Some minor diffs expected where locals are allocated in different orders, leading to different stack frame encodings.