Wasm r2r testing #127875
Draft
davidwrighton wants to merge 46 commits into dotnet:main from
…nk mappings
This adds a new ReadyToRun fixup that enables mapping UTF-8 strings to pregenerated code thunks embedded in R2R images. The fixup is placed in the eager imports section and processed at module load time.
Changes across all layers:
Format definition:
- Add READYTORUN_FIXUP_InjectStringThunks = 0x39 to readytorun.h and ReadyToRunConstants.cs
- Bump R2R minor version from 5 to 6 in all three locations
Runtime (CoreCLR VM):
- Refactor StringThunkSHashTraits from wasm/helpers.cpp into shared stringthunkhash.h header, available to all platforms
- Add pregeneratedstringthunks.cpp/.h with global hash table using copy-on-write CAS pattern for lock-free concurrent reads
- InitializePregeneratedStringThunkHash() called at EE startup
- LookupPregeneratedThunkByString() API returns PCODE or NULL
- ProcessInjectStringThunksFixup() handles the fixup in LoadDynamicInfoEntry, merging new entries with existing ones
Crossgen2 compiler:
- Add abstract StringDiscoverableAssemblyStubNode (derives from AssemblyStubNode) with LookupString property; instances register themselves via OnMarked
- Add InjectStringThunksSignature that collects all registered stubs at emission time and encodes them as (UTF8 string, RVA) pairs
- Root the InjectStringThunks import eagerly in NodeFactory
- Sort stubs by LookupString for deterministic compilation
Tooling and documentation:
- Add r2rdump parser case for InjectStringThunks signatures
- Update readytorun-format.md with fixup table entry and format spec
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
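The copy-on-write CAS pattern named in the runtime bullet can be illustrated with a small model. This is a sketch in Python rather than the runtime's C++, and every name in it (`CopyOnWriteTable`, `merge`, `lookup`, the sample keys) is hypothetical. Python has no compare-and-swap primitive, so a tiny lock stands in for the atomic instruction; readers never take it.

```python
import threading


class CopyOnWriteTable:
    """Readers index an immutable snapshot; writers publish a merged copy."""

    def __init__(self):
        self._snapshot = {}                # never mutated once published
        self._cas_lock = threading.Lock()  # assumption: emulates a hardware CAS

    def _compare_and_swap(self, expected, new):
        # Swap the snapshot reference only if nobody replaced it meanwhile.
        with self._cas_lock:
            if self._snapshot is expected:
                self._snapshot = new
                return True
            return False

    def lookup(self, key):
        # Lock-free read: take the current snapshot reference and index it.
        return self._snapshot.get(key)

    def merge(self, entries):
        # Copy, merge, then try to publish; retry if another writer won.
        while True:
            old = self._snapshot
            new = dict(old)
            new.update(entries)
            if self._compare_and_swap(old, new):
                return
```

A lookup that misses returns `None`, mirroring the "returns PCODE or NULL" behaviour described for `LookupPregeneratedThunkByString()`.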
Instead of unconditionally rooting the InjectStringThunks import, store it on the NodeFactory and have each StringDiscoverableAssemblyStubNode declare a dependency on it via ComputeNonRelocationBasedDependencies. The import is only pulled into the graph when at least one stub is marked.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Change GetSignature to return (WasmFuncType, string) where the string
is a compact serialization of the signature:
Return type: 'v' (void), 'i'/'l'/'f'/'d'/'V' (primitives), 'S<N>'
(struct by ref with N bytes).
Hidden params (this, retbuf, generic context, async continuation):
'i' or 'l' based on pointer size.
Explicit params: 'i'/'l'/'f'/'d'/'V' (by value), 'S<N>' (by ref),
'e' (empty struct, not emitted to WasmFuncType).
Suffix 'p' indicates SP and PE params are generated (managed calls).
Add IsEmptyStruct helper (stub returning false) for detecting empty
structs by field count per the BasicCABI spec. Handle empty structs
for both parameters ('e' encoding) and returns (treated as void).
See dotnet#127361.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
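A minimal tokenizer for the signature string described above might look as follows. It is a Python sketch, not the WasmLowering code. It assumes the return token comes first, the parameters follow in order, and `S<N>` is written as `S` followed by a decimal byte count. The function name `parse_signature` is hypothetical.

```python
import re

# One token: 'S' plus a decimal byte size, or a single-letter code.
_TOKEN = re.compile(r"S\d+|[vilfdVe]")


def parse_signature(sig):
    """Split a signature string into (return_token, param_tokens, managed)."""
    managed = sig.endswith("p")  # 'p' suffix: SP and PE params are generated
    body = sig[:-1] if managed else sig
    tokens = []
    pos = 0
    while pos < len(body):
        match = _TOKEN.match(body, pos)
        if match is None:
            raise ValueError(f"unexpected character {body[pos]!r} at {pos}")
        tokens.append(match.group())
        pos = match.end()
    if not tokens:
        raise ValueError("signature has no return type")
    return tokens[0], tokens[1:], managed
```

Under these assumptions, `parse_signature("viS16ep")` yields a void return, the parameters `i`, `S16`, and `e`, and the managed flag set.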
Introduce WasmSignature readonly struct implementing IEquatable and IComparable. Equality and comparison are based on the signature string (with Debug.Assert that FuncType agrees when strings match). This enables sorting and deduplication of signatures by string alone.
Update WasmLowering.GetSignature to return WasmSignature and update callers in WasmObjectWriter and ReadyToRunCodegenNodeFactory.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
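The idea of comparing by string alone, with an assertion that the function type agrees, can be modelled like this. It is a Python sketch of the concept, not the C# struct; the field names and the tuple used for the function type are assumptions.

```python
from functools import total_ordering


@total_ordering
class WasmSignature:
    """Equality, ordering, and hashing all use the signature string only."""

    def __init__(self, func_type, signature_string):
        self.func_type = func_type
        self.signature_string = signature_string

    def __eq__(self, other):
        if not isinstance(other, WasmSignature):
            return NotImplemented
        same = self.signature_string == other.signature_string
        # Mirrors the Debug.Assert: matching strings imply matching types.
        assert not same or self.func_type == other.func_type
        return same

    def __lt__(self, other):
        if not isinstance(other, WasmSignature):
            return NotImplemented
        return self.signature_string < other.signature_string

    def __hash__(self):
        return hash(self.signature_string)
```

With this in place, `sorted(set(signatures))` deduplicates and orders a collection without ever inspecting the function type.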
- WasmImportThunk now takes WasmSignature and uses it for mangled name and comparison operations
- WasmImportThunkPortableEntrypoint uses static WasmSignature values
- RaiseSignature rewritten to parse signature string instead of WasmFuncType
- Added CompilerTypeSystemContext.Wasm.cs with GetValueTupleStructOfSize cache using tree-based ValueTuple construction
- Unmanaged calling convention flag set when 'p' suffix is absent
- Roundtrip assert: raised signature re-lowered must equal original
- Cache first empty struct found during lowering for 'e' roundtrip
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of iterating the wasm-level _typeNode params, iterate the raised MethodSignature. This enables:
- Indirect struct args: zero-fill the transition block slot on store, and pass the original byref local directly on restore
- Empty struct args: skip entirely (no wasm local exists)
- Made WasmLowering.IsEmptyStruct public for cross-file access
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…Site api. Co-authored-by: Copilot <copilot@github.com>
The 'this' parameter is now encoded with a distinct 'T' character instead of 'i'/'l'. On raise, 'T' sets HasThis on the MethodSignature rather than adding an explicit parameter. This enables proper roundtripping and allows ArgIterator to correctly compute offsets (e.g. GetRetBuffArgOffset with hasThis). Also fix build errors in CorInfoImpl.ReadyToRun.cs: qualify LoweringFlags and cast getCallConv() to int. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add WasmR2RToInterpreterThunkNode, a StringDiscoverableAssemblyStubNode that captures arguments into a transition block and dispatches to the interpreter via READYTORUN_HELPER_InitInstClass.
Key details:
- Thunk keyed by WasmSignature, discoverable by 'I'-prefixed signature string
- Arguments area is 16-byte aligned; TransitionBlock is 8-byte aligned
- Indirect struct args copied with memory.copy + memory.fill padding
- Stack pointer global saved/restored around helper call
- V128 return uses 16-byte aligned buffer; others use 8-byte i64 store
Also adds memory.copy, memory.fill, and i64.const WASM instructions, and updates WasmImportThunk to use memory.fill for indirect struct zero-filling.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…all site dependency
- Add WasmInterpreterToR2RThunkNode: a StringDiscoverableAssemblyStubNode that bridges from interpreter calling convention to R2R compiled functions. Uses ArgIterator offsets (minus TransitionBlock size) to locate args in the interpreter buffer, sets up a TERMINATE_R2R_STACK_WALK frame, and dispatches via call_indirect.
- Fix retbuf detection in both WasmR2RToInterpreterThunkNode and WasmInterpreterToR2RThunkNode to check SignatureString[0] == 'S' instead of using ArgIterator.HasRetBuffArg/GetRetBuffArgOffset. The R2R-to-interpreter thunk now passes the retbuf wasm local directly.
- Add factory cache and accessor for WasmInterpreterToR2RThunk on ReadyToRunCodegenNodeFactory.
- Fix recordCallSite TODO: wire up WasmR2RToInterpreterThunk dependency.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add AddAdditionalDependency helper for lazily adding to _additionalDependencies
- Move WasmR2RToInterpreterThunk from AddPrecodeFixup to AddAdditionalDependency in recordCallSite
- Add WasmInterpreterToR2RThunk dependency for every compiled managed non-UnmanagedCallersOnly method on Wasm, using GetSignature(MethodDesc)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <copilot@github.com>
- Replace ValueTuple-based struct size construction with a cache of real struct types encountered during GetSignature. ValueTuples have auto layout which causes padding, making roundtrip size assertions fail. The cache uses a locked Dictionary for thread safety.
- Fix RaiseSignature to skip the hidden retbuf pointer parameter when the return type is a struct (S<N> encoding). Previously it was included in the raised MethodSignature parameters, causing GetSignature to emit a duplicate retbuf pointer on re-encoding.
- Fix WasmImportThunk to handle 'this' pointer correctly: store/restore it separately before the explicit parameter loop, and start wasmLocalIndex past both 'this' and retbuf locals.
- Fix WasmImportThunkPortableEntrypoint to strip IsUnmanagedCallersOnly flag when computing thunk signatures, since thunks always use managed calling convention.
- Fix DelayLoadHelperImport to skip creating WasmImportThunk for GenericLookupSignature on WASM, as these are eager fixups that don't need import thunks.
- Fix WasmR2RToInterpreterThunkNode to skip 'this' and retbuf wasm locals before iterating explicit parameters.
- Skip creating R2R-to-interpreter thunks for unmanaged call sites, as they don't go through interpreter transitions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The ForceSigWalk method had two bugs in its Wasm-specific path for accounting unnamed arguments (this, retbuf, generic context, etc.) when no named arguments are present:
1. The check 'maxOffset == 0' could never be true because maxOffset is initialized to OffsetOfArgs (8 on Wasm32). Changed to compare against OffsetOfArgs.
2. The fallback 'maxOffset = _wasmOfsStack' was incorrect because _wasmOfsStack is relative to OffsetOfArgs, but maxOffset is an absolute offset. Changed to 'OffsetOfArgs + _wasmOfsStack'.
These bugs caused GCRefMapBuilder to allocate a zero-length fake stack for methods with only unnamed arguments (e.g. parameterless instance methods), leading to IndexOutOfRangeException when writing the 'this' pointer GC ref at ThisOffset.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
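The arithmetic behind the two fixes can be checked with a small model. This Python sketch is not the ArgIterator code: the function names and the `named_arg_ends` parameter are hypothetical, and it assumes the stack size is computed as `maxOffset - OffsetOfArgs`. Only the value 8 for OffsetOfArgs on Wasm32 comes from the commit message.

```python
OFFSET_OF_ARGS = 8  # value stated for Wasm32 in the commit message


def arg_stack_size_buggy(named_arg_ends, wasm_ofs_stack):
    max_offset = OFFSET_OF_ARGS
    for end in named_arg_ends:
        max_offset = max(max_offset, end)
    if max_offset == 0:              # bug 1: never true, starts at 8
        max_offset = wasm_ofs_stack  # bug 2: relative value, not absolute
    return max_offset - OFFSET_OF_ARGS  # assumed final computation


def arg_stack_size_fixed(named_arg_ends, wasm_ofs_stack):
    max_offset = OFFSET_OF_ARGS
    for end in named_arg_ends:
        max_offset = max(max_offset, end)
    if max_offset == OFFSET_OF_ARGS:  # no named argument moved the maximum
        max_offset = OFFSET_OF_ARGS + wasm_ofs_stack
    return max_offset - OFFSET_OF_ARGS
```

For a parameterless instance method whose only argument is a 4-byte 'this' pointer (an illustrative size), the buggy version reports a zero-length stack and the fixed version reports 4 bytes.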
- Replace 'n' encoding with 'S' for multi-field structs passed by ref
- Add hardcoded struct sizes for QCallModule (8), QCallAssembly (8), GCHeapHardLimitInfo (64) so signatures produce S<N> format
- Add ParseSignatureTokens tokenizer to handle multi-char S<N> tokens
- Add Token-based API (TokenToNativeType/TokenToNameType/TokenToArgType)
- Update InterpToNativeGenerator to use token-based parsing
- Unknown struct types log a diagnostic at High importance
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- TokenToNameType returns the full S<N> token (e.g. S8, S64) so generated function names encode the struct size
- ArgsWithSlotOffsets computes running slot indices: structs consume max(size/8, 1) slots instead of always 1
- Add TokenToSlotCount helper
- Remove IsBlittable gate from TypeToChar: multi-field structs are always passed by pointer, matching crossgen2 WasmLowering behavior
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
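The running slot computation can be sketched as follows. This is a Python illustration that borrows the helper names from the commit message but is not the generator's code. It assumes `size/8` is truncating integer division and that every non-struct token occupies one slot.

```python
def token_to_slot_count(token):
    """Slots used by one argument token; structs use max(size // 8, 1)."""
    if token.startswith("S"):
        size = int(token[1:])  # byte size encoded after 'S', e.g. S64 -> 64
        return max(size // 8, 1)
    return 1  # assumption: every non-struct token takes a single slot


def args_with_slot_offsets(tokens):
    """Pair each token with the slot index at which it starts."""
    result = []
    slot = 0
    for token in tokens:
        result.append((token, slot))
        slot += token_to_slot_count(token)
    return result
```

For the tokens `i`, `S64`, `S4`, `d`, the 64-byte struct takes eight slots and the 4-byte struct still takes one, so the starting slots are 0, 1, 9, and 10.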
- helpers.cpp: Refactor GetSignatureKey to support S<N> struct tokens,
LowerTypeHandle for recursive single-field unwrapping, caller prefix
parameter ('M' for calli, 'I' for PE-to-interpreter)
- helpers.cpp: Use 'T' for this pointer encoding (was 'i')
- WasmLowering.cs: Remove redundant hidden retbuf pointer from signature
string (implied by S<N> return type)
- RaiseSignature: Remove hasReturnBuffer skip logic (no longer in string)
- SignatureMapper.cs: Use 'T' for this pointer, add T to token maps
- InterpToNativeGenerator.cs: Add 'M' prefix to g_wasmThunks entries
- clr-abi.md: Document Type Lowering and Signature String Encoding spec
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…tion Replace dynamic alloca-based initial buffer sizing with a fixed 64-byte stack buffer. Fall back to alloca only when S<N> tokens make the key exceed the initial buffer. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…EntryPointThunk Both functions now check the process-startup thunk cache first, then fall back to LookupPregeneratedThunkByString for thunks injected via READYTORUN_FIXUP_InjectStringThunks. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a MethodDesc's PortableEntryPoint is initialized before the R2R module containing its thunk is loaded, the method is tracked on a per-LoaderAllocator SArray and resolved later when new R2R thunks are injected.
- Add TrySetInterpreterThunk CAS-based thunk installation on PortableEntryPoint
- Track pending methods per-LoaderAllocator using SArray<MethodDesc*> with NULL-compaction on resolve
- Single global lock (s_pendingThunkResolutionLock) protects both the LA registry and per-LA pending arrays, keeping LAs alive during scans
- Registration flag on LoaderAllocator avoids duplicate list scans
- Unregistration in LoaderAllocator::Destroy for correct collectible cleanup
- LookupThunk/LookupPortableEntryPointThunk now also check R2R thunk hash
- Remove stale WASM-TODO comments
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
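The deferred resolution flow above can be modelled in a few lines. This Python sketch is only a conceptual model of the scheme: the class, its methods, and the string keys are hypothetical, a plain lock plays the role of s_pendingThunkResolutionLock, and `setdefault` stands in for the CAS-based install (the first thunk written wins).

```python
import threading


class PendingThunkResolver:
    def __init__(self):
        self._lock = threading.Lock()  # single lock for registry and lists
        self._pending = {}             # allocator -> list of (method, key)
        self._thunks = {}              # signature key -> thunk address
        self.installed = {}            # method -> thunk address

    def track(self, allocator, method, key):
        # Called when a method needs a thunk that is not available yet.
        with self._lock:
            self._pending.setdefault(allocator, []).append((method, key))

    def pending_count(self, allocator):
        with self._lock:
            return len(self._pending.get(allocator, []))

    def inject(self, new_thunks):
        # Called when a newly loaded image contributes thunks.
        with self._lock:
            self._thunks.update(new_thunks)
            for entries in self._pending.values():
                for i, (method, key) in enumerate(entries):
                    thunk = self._thunks.get(key)
                    if thunk is not None:
                        self.installed.setdefault(method, thunk)
                        entries[i] = None  # mark resolved
                # Compact: drop the resolved (None) entries in place.
                entries[:] = [e for e in entries if e is not None]
```

Methods whose key is still missing stay on the pending list and are retried on the next injection.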
…bleBase, not by the image base. Co-authored-by: Copilot <copilot@github.com>
Add FlagPendingThunkResolution on DynamicMethodDesc to track whether the method is already in the pending thunk resolution list. The flag is set/cleared using interlocked operations under s_pendingThunkResolutionLock, preventing unbounded growth from re-used LCG methods. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The pregenerated string thunk hash table, lookup, and pending resolution are only used on WASM. Guard them with TARGET_WASM, providing no-op stubs for InitializePregeneratedStringThunkHash and ProcessInjectStringThunksFixup on other platforms so callers remain unchanged. Also adds FlagPendingThunkResolution on DynamicMethodDesc with interlocked set/clear to prevent duplicate pending entries from reused LCG methods. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Instead of logging a message and producing an invalid signature, emit WASM0067 error and return null so the build fails with a clear diagnostic pointing at the missing entry in s_knownStructSizes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Adjust when the interpreter thunks are attached. Previously they triggered unsafe recursion in the type loader; now they are attached in GetMultiCallableAddr, and when ExternalMethodFixupWorker finishes.
- Adjust lock to respect the new type load dependency of the signature walk
- This should cover the existing R2R usage, as R2R code does not directly dispatch on virtual functions
- It will also cause more resolution of the interpreter thunks than necessary, as the interpreter codepath calls GetMultiCallableAddr often, but that could possibly be tweaked to go down a special path for scenarios where the acquired pointer is directly used to dispatch to more interpreted code.
- Fix dependency generation for InterpreterToR2R thunks
  - Both for the WasmTypeNode of the thunk
  - And for referencing the WasmInterpreterToR2R thunks
- Fix InjectStringThunksSignature to use a table index relative to the tableBase. Add a new reloc to make that possible
Co-authored-by: Copilot <copilot@github.com>
…that it doesn't trigger on unmanaged entrypoints
- Adjust the Pending portable entrypoint thunk logic to be a MethodDesc property not a DynamicMethodDesc property
- Handle TypedByReference in LowerTypeHandle
Co-authored-by: Copilot <copilot@github.com>
R2RDump previously could not read Webcil files (the format used for managed assemblies in WebAssembly environments). This adds a WebcilImageReader that implements IBinaryImageReader for the Webcil format, enabling R2RDump to dump headers, methods, and section contents from Webcil-format R2R images.
Changes:
- New WebcilImageReader.cs implementing IBinaryImageReader
- ReadyToRunReader detects Webcil format (after MachO, before PE)
- DumpModel handles Webcil in reference assembly loading
- Program.cs maps OperatingSystem.Unknown to TargetOS.Linux for Webcil
- ReadyToRunMethod gracefully handles null PEReader (Webcil has no PE)
- ILCompiler.Reflection.ReadyToRun.csproj includes shared Webcil.cs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the PEReader ImageReader property with a GetSectionData(int rva) method that returns a BlobReader. This decouples the interface from PEReader, enabling non-PE formats (Webcil) to provide section data.
Implementations:
- StandaloneAssemblyMetadata: delegates to PEReader.GetSectionData
- ManifestAssemblyMetadata: same with null-guard
- WebcilAssemblyMetadata: resolves RVA via WebcilImageReader sections
- SimpleAssemblyMetadata (tests): delegates to PEReader.GetSectionData
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement a full WASM instruction disassembler that decodes WebAssembly binary format into WAT-style text output. This enables the --disasm flag in R2RDump to work with Webcil/WASM R2R images.
- Add WasmDisassembler.cs with complete opcode tables for all standard WASM instructions (control, parametric, variable, table, memory, numeric, conversion, sign-extension, reference types) plus 0xFC (bulk memory/saturating truncation), 0xFB (GC), and 0xFD (SIMD) prefixed opcodes
- Add WebcilImageReader.GetWasmFunctionBody() to parse the WASM module's type, function, and code sections to extract function info including type signature and local declarations
- Integrate into TextDumper.DumpWasmDisasm() to print parameters and locals with their local indices, result types, and disassembled instructions
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
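To give a feel for what decoding the binary format into WAT-style text involves, here is a deliberately tiny Python sketch. It is not WasmDisassembler.cs: it covers only five opcodes, and the table layout and function names are assumptions. The opcode values and the LEB128 immediates follow the WebAssembly binary format.

```python
def read_uleb128(data, pos):
    """Decode an unsigned LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            return result, pos


def read_sleb128(data, pos):
    """Decode a signed LEB128 integer; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if byte & 0x80 == 0:
            if byte & 0x40:  # sign bit of the final group
                result -= 1 << shift
            return result, pos


# opcode -> (mnemonic, immediate reader or None); a small illustrative subset
OPCODES = {
    0x0B: ("end", None),
    0x20: ("local.get", read_uleb128),
    0x21: ("local.set", read_uleb128),
    0x41: ("i32.const", read_sleb128),
    0x6A: ("i32.add", None),
}


def disassemble(code):
    """Turn a byte sequence of instructions into WAT-style text lines."""
    lines = []
    pos = 0
    while pos < len(code):
        opcode = code[pos]
        pos += 1
        if opcode not in OPCODES:
            raise ValueError(f"unsupported opcode 0x{opcode:02X}")
        name, reader = OPCODES[opcode]
        if reader is None:
            lines.append(name)
        else:
            value, pos = reader(code, pos)
            lines.append(f"{name} {value}")
    return lines
```

A full disassembler additionally has to handle the prefixed opcode spaces and structured control flow, which this sketch leaves out.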
WebcilAssemblyMetadata was not retaining a reference to the pinned metadata byte array passed to its constructor. After GetStandaloneAssemblyMetadata returned, the array could be collected by the GC despite being allocated on the Pinned Object Heap, since no live reference existed. This caused an AccessViolationException when MetadataReader accessed the freed memory on larger files like system.private.corelib.wasm.
Fix: store the metadata byte array in a field to keep it rooted for the lifetime of the MetadataReader.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
WASM R2R images should use .wasm extension instead of .dll. Update CLRTest.CrossGen.targets to:
- Set output extension to .wasm for both composite and non-composite modes in bash and batch scripts when CrossGen2OutputFormat is 'wasm'
- Pass -f flag to crossgen2 in batch scripts (matching bash behavior)
Also set CrossGen2OutputFormat=wasm in Directory.Build.props for browser target OS so all tests targeting wasm use the correct format.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
This was referenced May 6, 2026
Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara |
No description provided.