-
Notifications
You must be signed in to change notification settings - Fork 14.6k
[DebugInfo][DWARF] Add heapallocsite information #132073
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-debuginfo @llvm/pr-subscribers-llvm-binary-utilities Author: Jann (thejh) ChangesLLVM currently stores heapallocsite information in CodeView debuginfo, but not in DWARF debuginfo. Plumb it into DWARF as an LLVM-specific extension. heapallocsite debug information is useful when it is combined with allocator instrumentation that stores caller addresses; I've used a previous version of this patch for:
Other possible uses might be:
Full diff: https://github.com/llvm/llvm-project/pull/132073.diff 6 Files Affected:
diff --git a/llvm/include/llvm/BinaryFormat/Dwarf.def b/llvm/include/llvm/BinaryFormat/Dwarf.def
index e52324a8ebc12..51660b30568a5 100644
--- a/llvm/include/llvm/BinaryFormat/Dwarf.def
+++ b/llvm/include/llvm/BinaryFormat/Dwarf.def
@@ -624,6 +624,7 @@ HANDLE_DW_AT(0x3e09, LLVM_ptrauth_authenticates_null_values, 0, LLVM)
HANDLE_DW_AT(0x3e0a, LLVM_ptrauth_authentication_mode, 0, LLVM)
HANDLE_DW_AT(0x3e0b, LLVM_num_extra_inhabitants, 0, LLVM)
HANDLE_DW_AT(0x3e0c, LLVM_stmt_sequence, 0, LLVM)
+HANDLE_DW_AT(0x3e0d, LLVM_heapallocsite, 0, LLVM)
// Apple extensions.
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
index dad14b98d2c3d..f14742d048c39 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
@@ -1289,12 +1289,10 @@ DwarfCompileUnit::getDwarf5OrGNULocationAtom(dwarf::LocationAtom Loc) const {
}
}
-DIE &DwarfCompileUnit::constructCallSiteEntryDIE(DIE &ScopeDIE,
- const DISubprogram *CalleeSP,
- bool IsTail,
- const MCSymbol *PCAddr,
- const MCSymbol *CallAddr,
- unsigned CallReg) {
+DIE &DwarfCompileUnit::constructCallSiteEntryDIE(
+ DIE &ScopeDIE, const DISubprogram *CalleeSP, bool IsTail,
+ const MCSymbol *PCAddr, const MCSymbol *CallAddr, unsigned CallReg,
+ DIType *AllocSiteTy) {
// Insert a call site entry DIE within ScopeDIE.
DIE &CallSiteDIE = createAndAddDIE(getDwarf5OrGNUTag(dwarf::DW_TAG_call_site),
ScopeDIE, nullptr);
@@ -1303,7 +1301,7 @@ DIE &DwarfCompileUnit::constructCallSiteEntryDIE(DIE &ScopeDIE,
// Indirect call.
addAddress(CallSiteDIE, getDwarf5OrGNUAttr(dwarf::DW_AT_call_target),
MachineLocation(CallReg));
- } else {
+ } else if (CalleeSP) {
DIE *CalleeDIE = getOrCreateSubprogramDIE(CalleeSP);
assert(CalleeDIE && "Could not create DIE for call site entry origin");
if (AddLinkageNamesToDeclCallOriginsForTuning(DD) &&
@@ -1348,6 +1346,10 @@ DIE &DwarfCompileUnit::constructCallSiteEntryDIE(DIE &ScopeDIE,
getDwarf5OrGNUAttr(dwarf::DW_AT_call_return_pc), PCAddr);
}
+ if (AllocSiteTy) {
+ addType(CallSiteDIE, AllocSiteTy, dwarf::DW_AT_LLVM_heapallocsite);
+ }
+
return CallSiteDIE;
}
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
index 104039db03c7c..52186d29c8272 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfCompileUnit.h
@@ -285,7 +285,8 @@ class DwarfCompileUnit final : public DwarfUnit {
/// the \p CallReg is set to 0.
DIE &constructCallSiteEntryDIE(DIE &ScopeDIE, const DISubprogram *CalleeSP,
bool IsTail, const MCSymbol *PCAddr,
- const MCSymbol *CallAddr, unsigned CallReg);
+ const MCSymbol *CallAddr, unsigned CallReg,
+ DIType *AllocSiteTy);
/// Construct call site parameter DIEs for the \p CallSiteDIE. The \p Params
/// were collected by the \ref collectCallSiteParameters.
/// Note: The order of parameters does not matter, since debuggers recognize
diff --git a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
index b7b083028ef65..62ffe6e6e6c38 100644
--- a/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp
@@ -930,28 +930,29 @@ void DwarfDebug::constructCallSiteEntryDIEs(const DISubprogram &SP,
if (MI.hasDelaySlot() && !delaySlotSupported(*&MI))
return;
+ DIType *AllocSiteTy = cast_or_null<DIType>(MI.getHeapAllocMarker());
+
// If this is a direct call, find the callee's subprogram.
// In the case of an indirect call find the register that holds
// the callee.
const MachineOperand &CalleeOp = TII->getCalleeOperand(MI);
- if (!CalleeOp.isGlobal() &&
- (!CalleeOp.isReg() || !CalleeOp.getReg().isPhysical()))
- continue;
unsigned CallReg = 0;
const DISubprogram *CalleeSP = nullptr;
const Function *CalleeDecl = nullptr;
- if (CalleeOp.isReg()) {
- CallReg = CalleeOp.getReg();
- if (!CallReg)
- continue;
- } else {
+ if (CalleeOp.isReg() && CalleeOp.getReg().isPhysical()) {
+ CallReg = CalleeOp.getReg(); // might be zero
+ } else if (CalleeOp.isGlobal()) {
CalleeDecl = dyn_cast<Function>(CalleeOp.getGlobal());
- if (!CalleeDecl || !CalleeDecl->getSubprogram())
- continue;
- CalleeSP = CalleeDecl->getSubprogram();
+ if (CalleeDecl)
+ CalleeSP = CalleeDecl->getSubprogram(); // might be nullptr
}
+ // Omit DIE if we can't tell where the call goes *and* we don't want to
+ // add metadata to it.
+ if (CalleeSP == nullptr && CallReg == 0 && AllocSiteTy == nullptr)
+ continue;
+
// TODO: Omit call site entries for runtime calls (objc_msgSend, etc).
bool IsTail = TII->isTailCall(MI);
@@ -986,7 +987,7 @@ void DwarfDebug::constructCallSiteEntryDIEs(const DISubprogram &SP,
<< (IsTail ? " [IsTail]" : "") << "\n");
DIE &CallSiteDIE = CU.constructCallSiteEntryDIE(
- ScopeDIE, CalleeSP, IsTail, PCAddr, CallAddr, CallReg);
+ ScopeDIE, CalleeSP, IsTail, PCAddr, CallAddr, CallReg, AllocSiteTy);
// Optionally emit call-site-param debug info.
if (emitDebugEntryValues()) {
diff --git a/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp b/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
index f66773ad2e694..586d1c2abd5e0 100644
--- a/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
+++ b/llvm/lib/DWARFLinker/Classic/DWARFLinker.cpp
@@ -109,6 +109,7 @@ static bool isODRAttribute(uint16_t Attr) {
case dwarf::DW_AT_specification:
case dwarf::DW_AT_abstract_origin:
case dwarf::DW_AT_import:
+ case dwarf::DW_AT_LLVM_heapallocsite:
return true;
}
llvm_unreachable("Improper attribute.");
diff --git a/llvm/test/DebugInfo/X86/DW_AT_LLVM_heapallocsite.ll b/llvm/test/DebugInfo/X86/DW_AT_LLVM_heapallocsite.ll
new file mode 100644
index 0000000000000..53975e02acbc3
--- /dev/null
+++ b/llvm/test/DebugInfo/X86/DW_AT_LLVM_heapallocsite.ll
@@ -0,0 +1,33 @@
+; RUN: llc -O3 -o %t -filetype=obj %s
+; RUN: llvm-dwarfdump %t | FileCheck %s
+
+; based on clang++ output for `int *alloc_int() { return new int; }`
+
+
+target triple = "x86_64-unknown-linux-gnu"
+
+define dso_local ptr @alloc_int() !dbg !3 {
+; CHECK: DW_TAG_subprogram
+entry:
+ %call = call ptr @alloc(i64 noundef 4), !heapallocsite !7
+; CHECK: DW_TAG_GNU_call_site
+; CHECK: DW_AT_LLVM_heapallocsite ([[ALLOCSITE:.*]])
+ ret ptr %call
+}
+
+; CHECK: {{.*}}[[ALLOCSITE]]: DW_TAG_base_type
+; CHECK: DW_AT_name ("int")
+
+declare dso_local ptr @alloc(i64 noundef)
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, emissionKind: FullDebug)
+!1 = !DIFile(filename: "a.cpp", directory: "/")
+!2 = !{i32 2, !"Debug Info Version", i32 3}
+!3 = distinct !DISubprogram(name: "alloc_int", scope: !1, file: !1, line: 1, type: !4, scopeLine: 1, flags: DIFlagPrototyped | DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition, unit: !0)
+!4 = !DISubroutineType(types: !5)
+!5 = !{!6}
+!6 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !7, size: 64)
+!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
|
ping - did I create this pull request correctly, or should I be CCing someone other than @JDevlieghere ? |
I believe people are just generally overstretched in this area... I've added a few others who may have opinions here as well. |
Got any size data (like a bloaty before/after comparison of the clang binary with/without this patch?)? |
775c75a
to
2674248
Compare
@dwblaikie You mean a size comparison of clang built with debuginfo at a specific revision, built once using unmodified clang and once using clang with this patch applied? Here you go - I built clang twice at git commit
Grepping the output of running If you think that's too much to add in by default when there is currently no tooling that can make use of this, I'd be happy to gate this on some new default-disabled flag. Yes, I have a planned usecase - I would like to eventually use this feature for augmenting Linux kernel memory access traces, and maybe also ASAN crash reports (which already report allocator call site addresses), with type information. However, that usecase requires more LLVM/clang patches on top of this, since Linux kernel code uses something like Another usecase I had in the past, for which I originally wrote this patch, was to analyze heap memory usage of a C++ codebase (like seeing which classes use how much memory, or seeing statistics for what the most common values of some class member are) - for that usecase, this patch alone mostly did the job (combined with allocator changes to record callsites). Since this patch was already independently useful for that kind of usecase, I figured I might as well land this first. |
For that old usecase, the tooling I built on top of DWARF heapallocsite debuginfo was able to print heap analysis like this: https://gist.github.com/thejh/92befd69a727b02aa4262075c554a872 That shows heap allocations grouped by allocation type, showing the three most common values of each member along with how often that value occurs. |
Does it make sense to propose the extension first to dwarf standard (https://dwarfstd.org/) before adding it to LLVM? |
@pinskia I'm not familiar with how DWARF standard development works in practice - do you mean just propose it for the DWARF standard, then add it to LLVM as an LLVM extension in parallel while the DWARF committee figures out whether they want to standardize it? Or actually land it in the (working draft? official?) DWARF standard before adding the feature to LLVM at all? |
First start with https://dwarfstd.org/procedures.html . |
The new attribute should be documented. It doesn't have to be proposed to the DWARF committee, but that would be a good idea if anyone wants to implement the same feature in another compiler. I take it there's no gcc equivalent? |
@pinskia Sounds reasonable, I'll try to write and submit a proposal. |
I have now submitted a DWARF enhancement proposal to dwarf-discuss (but an issue number has not yet been assigned): https://lists.dwarfstd.org/pipermail/dwarf-discuss/2025-April/002656.html
I am not aware of one. |
The proposal is now listed at https://dwarfstd.org/issues/250407.1.html . |
The DWARF v6 working draft now contains this feature as Which makes me wonder - would you prefer to directly use the ID from the draft standard in LLVM, or would you prefer to use an ID in LLVM-private ID space until the standard is actually finalized? |
As unfortunate as it is, I'd prefer to use an LLVM-private ID space until the standard numbering is in a published standard - don't want to risk the complications that would come up if the number was reassigned before the standard ships. |
2674248
to
651cdff
Compare
I just pushed a new version which uses the name |
// Omit DIE if we can't tell where the call goes *and* we don't want to | ||
// add metadata to it. | ||
if (CalleeSP == nullptr && CallReg == 0 && AllocSiteTy == nullptr) | ||
continue; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these conditions tested?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The four cases that could be tested here are basically:
- a call to a known target subprogram has a
DW_TAG_call_site
- a call through a register has a
DW_TAG_call_site
- a call with heapallocsite metadata has a
DW_TAG_call_site
- an indirect call with complex addressing (such as a
call [rax]
on X86 in Intel syntax) has noDW_TAG_call_site
Case 1 is covered by tests like ./llvm/test/DebugInfo/X86/dwarf-callsite-related-attrs.ll
, for example in function _Z3foov
.
Case 2 is (bogusly) covered by the same file, function main
(with the indirect_target
call); pull request #151378 adds a proper test.
Case 3 is covered by the test I added.
Case 4 is a size optimization that only happens when LLVM gives up on emitting DWARF for complex calls. It is not currently tested, from what I can tell; pull request #151378 adds a proper test. Most ways that should end up in case 4 have been very broken at least on X86 for years. The main thing I noticed is that when LLVM is generating something like call [rax]
, it emits debuginfo claiming that the target address is stored in RAX because the only difference at the MIR level between call rax
and call [rax]
is which call opcode is used (CALL64r
vs CALL64m
), and the generic MIR code considers both of these to have a kRegister
operand even though the X86 code treats them differently (OperandType::TYPE_R64
vs OperandType::TYPE_M
). On top of that, another bug is that offsets are also ignored (#70949). Due to these bugs, we probably almost never actually hit this continue
- the only way I managed to make that happen on x86 is by compiling this with -m32
, to generate a call with unknown destination that has no registers involved:
static int (*const foo)() = (void*)0x123;
int bar() {
return foo() + 1;
}
I think if we want to have nice tests for cases 2 and 4, it would make sense to fix the handling of call [rax]
first, so that we can for now test that call rax
has callsite info while call [rax]
doesn't (until someone comes along and adds support for that, then I guess we'd have to come up with a new way to test it). (Or I guess alternatively one could make an argument that we should remove this branch entirely and accept that we're generating useless callsite tags sometimes.)
I have created PR #151378 to remove the broken DWARF for call [rax]
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(And once that PR lands, I'll have to rebase this change.)
LLVM currently stores heapallocsite information in CodeView debuginfo, but not in DWARF debuginfo. heapallocsite information is useful for profiling and memory analysis. The DWARF v6 working draft includes this feature as DW_AT_alloc_type; but since that is not yet an official standard, use LLVM-private ID space for now.
651cdff
to
87b4e34
Compare
LLVM currently stores heapallocsite information in CodeView debuginfo, but not in DWARF debuginfo. Plumb it into DWARF as an LLVM-specific extension.
heapallocsite debug information is useful when it is combined with allocator instrumentation that stores caller addresses; I've used a previous version of this patch for:
Other possible uses might be: