-
Notifications
You must be signed in to change notification settings - Fork 5.1k
[release/9.0-staging] Fix SysV first/second return register GC info mismatch #116206
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[release/9.0-staging] Fix SysV first/second return register GC info mismatch #116206
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
Backport fix for SysV first/second return register GC info mismatch on linux-x64 and osx-x64.
- Added a new regression test to reproduce and guard against the GC info misreport.
- Updated
genCallInstruction
to swap return-size metadata when the second SysV return register is an integer, ensuring the GC map reflects the true register holding the object reference.
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
File | Description |
---|---|
src/tests/JIT/Regression/JitBlue/Runtime_115815/Runtime_115815.csproj | Added project file to include the new regression test in the test suite. |
src/tests/JIT/Regression/JitBlue/Runtime_115815/Runtime_115815.cs | Introduced a stress test that repeatedly calls the problematic pattern to catch misreported GC info. |
src/coreclr/jit/codegenxarch.cpp | Modified genCallInstruction to detect when the second return ABI register is REG_INTRET and swap retSize /secondRetSize accordingly. |
Comments suppressed due to low confidence (1)
src/tests/JIT/Regression/JitBlue/Runtime_115815/Runtime_115815.cs:12
- The test currently only ensures no crash occurs. Consider adding assertions inside the loops to verify that
GetValue(j)
returns the expectedContainer.Name
anddouble
value, which would catch logical regressions beyond GC misreporting.
public static void TestEntryPoint()
// The emitter has hardcoded belief that retSize corresponds to | ||
// REG_INTRET and secondRetSize to REG_INTRET_1, so fix up the | ||
// situation here. | ||
assert(!EA_IS_GCREF_OR_BYREF(retSize)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] Consider adding a descriptive message to this assert
(e.g., && "Expected non-GC return size when swapping for INTRET ABI"
) to aid debugging if it ever fails.
assert(!EA_IS_GCREF_OR_BYREF(retSize)); | |
assert(!EA_IS_GCREF_OR_BYREF(retSize) && "Expected non-GC return size when swapping for INTRET ABI"); |
Copilot uses AI. Check for mistakes.
On SysV AMD64, structs returned in a float and int reg pair were being classified as RT_Scalar_XX. This causes downstream consumers (e.g., HijackFrame::GcScanRoots) to look for obj/byref's in the second int reg. Per the ABI, however, the first float is passed through a float reg and the obj/byref is passed through the _first_ int reg. We now detect and fix this case by skipping the first float type in the ReturnKind encoding and moving the second type into the first. Fix dotnet#115815
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. please get a code review. we will take for consideration in 9.0.x
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
74a39ca
into
dotnet:release/9.0-staging
Backport of #115827 to release/9.0-staging. Includes @zengandrew's change in #116194.
Customer Impact
The JIT may generate incorrect GC information on linux-x64 and osx-x64 for methods that return in two registers if the first returned register is a SIMD register and the second returned register is an integer register containing an object reference.
In that case the generated GC information incorrectly reports an object reference in the
rdx
register that is actually present in therax
register.This requires a struct that is laid out by the VM as a float followed by an object reference. This is an unusual layout as the VM tries to reorder struct layouts to keep object references at the beginning. Yet the customer reported issue #115815 shows a case where it happens. A reduced test case looks something like:
The GC info mismatch leads to unpredictable outcomes (segfaults, heap corruption, etc.).
The VM has a similar problem when it wants to suspend all running threads if it needs to suspend in a method like
GetValue
above. In those cases the VM may not properly preserve the GC reference returned, and may mistakenly treat thedouble
as a GC reference instead.The PR includes fixes for both the JIT and VM issue.
Regression
As far as I can tell this bug is present since the original introduction of SysV support.
Testing
Regression test introduced.
Risk
Low.