Skip to content

[release/9.0-staging] Fix SysV first/second return register GC info mismatch #116206

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

jakobbotsch
Copy link
Member

@jakobbotsch jakobbotsch commented Jun 2, 2025

Backport of #115827 to release/9.0-staging. Includes @zengandrew's change in #116194.

Customer Impact

  • Customer reported
  • Found internally

The JIT may generate incorrect GC information on linux-x64 and osx-x64 for methods that return in two registers if the first returned register is a SIMD register and the second returned register is an integer register containing an object reference.
In that case the generated GC information incorrectly reports an object reference in the rdx register that is actually present in the rax register.

This requires a struct that is laid out by the VM as a float followed by an object reference. This is an unusual layout as the VM tries to reorder struct layouts to keep object references at the beginning. Yet the customer reported issue #115815 shows a case where it happens. A reduced test case looks something like:

private struct Container
{
    public string Name;
}

[MethodImpl(MethodImplOptions.NoInlining)]
// KeyValuePair<Container, double> is laid out as (double, string) by VM
private static KeyValuePair<Container, double> GetValue(int i)
    => KeyValuePair.Create(new Container(i.ToString()), (double)i); 

... GetValue(i) ... // JIT reports a string in rdx after this call that is actually in rax

The GC info mismatch leads to unpredictable outcomes (segfaults, heap corruption, etc.).

The VM has a similar problem when it wants to suspend all running threads if it needs to suspend in a method like GetValue above. In those cases the VM may not properly preserve the GC reference returned, and may mistakenly treat the double as a GC reference instead.

The PR includes fixes for both the JIT and VM issue.

Regression

  • Yes
  • No

As far as I can tell this bug is present since the original introduction of SysV support.

Testing

Regression test introduced.

Risk

Low.

@Copilot Copilot AI review requested due to automatic review settings June 2, 2025 09:02
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

Backport fix for SysV first/second return register GC info mismatch on linux-x64 and osx-x64.

  • Added a new regression test to reproduce and guard against the GC info misreport.
  • Updated genCallInstruction to swap return-size metadata when the second SysV return register is an integer, ensuring the GC map reflects the true register holding the object reference.

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
src/tests/JIT/Regression/JitBlue/Runtime_115815/Runtime_115815.csproj Added project file to include the new regression test in the test suite.
src/tests/JIT/Regression/JitBlue/Runtime_115815/Runtime_115815.cs Introduced a stress test that repeatedly calls the problematic pattern to catch misreported GC info.
src/coreclr/jit/codegenxarch.cpp Modified genCallInstruction to detect when the second return ABI register is REG_INTRET and swap retSize/secondRetSize accordingly.
Comments suppressed due to low confidence (1)

src/tests/JIT/Regression/JitBlue/Runtime_115815/Runtime_115815.cs:12

  • The test currently only ensures no crash occurs. Consider adding assertions inside the loops to verify that GetValue(j) returns the expected Container.Name and double value, which would catch logical regressions beyond GC misreporting.
public static void TestEntryPoint()

// The emitter has hardcoded belief that retSize corresponds to
// REG_INTRET and secondRetSize to REG_INTRET_1, so fix up the
// situation here.
assert(!EA_IS_GCREF_OR_BYREF(retSize));
Copy link
Preview

Copilot AI Jun 2, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] Consider adding a descriptive message to this assert (e.g., && "Expected non-GC return size when swapping for INTRET ABI") to aid debugging if it ever fails.

Suggested change
assert(!EA_IS_GCREF_OR_BYREF(retSize));
assert(!EA_IS_GCREF_OR_BYREF(retSize) && "Expected non-GC return size when swapping for INTRET ABI");

Copilot uses AI. Check for mistakes.

On SysV AMD64, structs returned in a float and int reg pair were being
classified as RT_Scalar_XX. This causes downstream consumers
(e.g., HijackFrame::GcScanRoots) to look for obj/byref's in
the second int reg. Per the ABI, however, the first float is passed
through a float reg and the obj/byref is passed through the _first_
int reg. We now detect and fix this case by skipping the first float
type in the ReturnKind encoding and moving the second type into the
first.

Fix dotnet#115815
Copy link
Member

@jeffschwMSFT jeffschwMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. please get a code review. we will take for consideration in 9.0.x

@jeffschwMSFT jeffschwMSFT added Servicing-consider Issue for next servicing release review area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI labels Jun 2, 2025
@jeffschwMSFT jeffschwMSFT added this to the 9.0.x milestone Jun 2, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch jakobbotsch requested review from jkotas, VSadov and a team June 3, 2025 09:48
@rbhanda rbhanda modified the milestones: 9.0.x, 9.0.7 Jun 3, 2025
@rbhanda rbhanda added Servicing-approved Approved for servicing release and removed Servicing-consider Issue for next servicing release review labels Jun 3, 2025
@jakobbotsch jakobbotsch changed the title [release/9.0-staging] JIT: Fix SysV first/second return register GC info mismatch when generating calls [release/9.0-staging] Fix SysV first/second return register GC info mismatch Jun 3, 2025
@jakobbotsch jakobbotsch merged commit 74a39ca into dotnet:release/9.0-staging Jun 4, 2025
106 of 110 checks passed
@jakobbotsch jakobbotsch deleted the backport-115827-9.0 branch June 4, 2025 09:00
@github-actions github-actions bot locked and limited conversation to collaborators Jul 5, 2025
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI Servicing-approved Approved for servicing release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants