Apply my post-commit comment on D81995. The negative name misguided commit
d8a8e5d6240a1db809cd95106910358e69bbf299 (`[clang][cli] Remove marshalling from
Opt{In,Out}FFlag`) to:
* accidentally flip the option to not emit the xray_fn_idx section.
* change -fno-xray-function-index (instead of -fxray-function-index) to emit xray_fn_idx
This patch renames XRayOmitFunctionIndex and makes -fxray-function-index emit
xray_fn_idx, but the default remains -fno-xray-function-index .
Consider only targets where `MCAsmInfo::ExceptionsType == ExceptionHandling::None`
and that support CFI (when `MCAsmInfo::UsesCFIForDebug` is set to true):
currently, only AMDGPU.
This patch enables the emission of CFI information in the .eh_frame
section when the uwtable attribute is present on a function.
Before, we could generate CFI information for debugging puproses only.
This patch prepares AMDGPU to support collecting GPU stack traces in the future.
I did a first implementation (https://reviews.llvm.org/D139024)
but at the time I had not realized that no other platform used
`UsesCFIForDebug`.
Reviewed By: scott.linder
Differential Revision: https://reviews.llvm.org/D151806
This information helps to avoid considering cloning for blocks with indirect branches.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D150611
Currently we use RTTI objects to check type compatibility. To support non-unique
RTTI objects, commit 5745eccef54ddd3caca278d1d292a88b2281528b added a
`checkTypeInfoEquality` string matching to the runtime.
The scheme is inefficient.
```
_Z1fv:
.long 846595819 # jmp
.long .L__llvm_rtti_proxy-_Z3funv
...
main:
...
# Load the second word (pointer to the RTTI object) and dereference it.
movslq 4(%rsi), %rax
movq (%rax,%rsi), %rdx
# Is it the desired typeinfo object?
leaq _ZTIFvvE(%rip), %rax
# If not, call __ubsan_handle_function_type_mismatch_v1, which may recover if checkTypeInfoEquality allows
cmpq %rax, %rdx
jne .LBB1_2
...
.section .data.rel.ro,"aw",@progbits
.p2align 3, 0x0
.L__llvm_rtti_proxy:
.quad _ZTIFvvE
```
Let's replace the indirect `_ZTI` pointer with a type hash similar to
`-fsanitize=kcfi`.
```
_Z1fv:
.long 3238382334
.long 2772461324 # type hash
main:
...
# Load the second word (callee type hash) and check whether it is expected
cmpl $-1522505972, -4(%rax)
# If not, fail: call __ubsan_handle_function_type_mismatch
jne .LBB2_2
```
The RTTI object derives its name from `clang::MangleContext::mangleCXXRTTI`,
which uses `mangleType`. `mangleTypeName` uses `mangleType` as well. So the
type compatibility change is high-fidelity.
Since we no longer need RTTI pointers in
`__ubsan::__ubsan_handle_function_type_mismatch_v1`, let's switch it back to
version 0, the original signature before
e215996a2932ed7c472f4e94dc4345b30fd0c373 (2019).
`__ubsan::__ubsan_handle_function_type_mismatch_abort` is not
recoverable, so we can revert some changes from
e215996a2932ed7c472f4e94dc4345b30fd0c373.
Reviewed By: samitolvanen
Differential Revision: https://reviews.llvm.org/D148785
The current implementation of -fsanitize=function places two words (the prolog
signature and the RTTI proxy) at the function entry, which makes the feature
incompatible with Intel Indirect Branch Tracking (IBT) that needs an ENDBR instruction
at the function entry. To allow the combination, move the two words before the
function entry, similar to -fsanitize=kcfi.
Armv8.5 Branch Target Identification (BTI) has a similar requirement.
Note: for IBT and BTI, whether a function gets a marker instruction at the entry
generally cannot be assumed (it can be disabled by a function attribute or
stronger LTO optimizations).
It is extremely unlikely for two words preceding a function entry to be
inaccessible. One way to achieve this is by ensuring that a function is
aligned at a page boundary and making the preceding page unmapped or
unreadable. This is not reasonable for application or library code.
(Think: the first text section has crt* code not instrumented by
-fsanitize=function.)
We use 0xc105cafe for all targets. .long 0xc105cafe disassembles to invalid
instructions on all architectures I have tested, except Power where it is
`lfs 8, -13570(5)` (Load Floating-Point with a weird offset, unlikely to be used in real code).
---
For the removed function in AsmPrinter.cpp, remove an assert: `mdconst::extract`
already asserts non-nullness.
For compiler-rt/test/ubsan/TestCases/TypeCheck/Function/function.cpp,
when the function doesn't have prolog/epilog (-O1 and above), after moving the two words,
the address of the function equals the address of ret instruction,
so symbolizing the function will additionally get a non-zero column number.
Adjust the test to allow an optional column number.
```
.long 3238382334
.long .L__llvm_rtti_proxy-_Z1fv
_Z1fv: // symbolizing here retrieves the line table entry from the second .loc
.file 0 ...
.loc 0 1 0
.cfi_startproc
.loc 0 2 1 prologue_end
retq
```
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D148665
This patch encapsulates the encoding and decoding logic of basic block metadata into the Metadata struct, and also reduces the decoded size of `SHT_LLVM_BB_ADDR_MAP` section.
The patch would've looked more readable if we could use designated initializer, but that is a c++20 feature.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D148360
This change will allow to put code pointers in DWARF info fields that are larger than actual pointer size, e.g. 16-bit pointers into 32-bit fields.
The need for this came up while creating support for MSP430 in LLDB. MSP430-GCC already generates DWARF info with 32-bit fields, so this change is necessary for LLDB to maintain compatibility with both GCC and LLVM binaries. Moreover, right now in LLDB there is no support for having DWARF pointer size different from ELF header type, e.g. 16-bit DWARF info within ELF32, and it seems there is no such thing as ELF16.
Since other mainline targets are made to have the same pointer size in both MCAsmInfo and DataLayout, there is no need to change anything there.
Reviewed By: dblaikie
Differential Revision: https://reviews.llvm.org/D148042
The placement is currently wrong in the presence of function entry related
instrumentations (prefixdata, -fpatchable-function-entry=, -fsanitize=kcfi,
etc).
Currenty, setting the -mbb-profile-dump dumps a CSV file with blocks
inside an individual function identified by their MBB numbers. This
patch changes the MBBs to be identified by their ID which is set at MBB
creation and not changed afterwards, making it inherently stable
throughout the backend. This alleviates concerns with the MBB IDs
changing between the profile dump and what ends up in the final object
file. The MBBs inside the SHT_LLVM_BB_ADDR_MAP sections are also
identified using their MBB ID rather than number, so if we want to match
them up we need to identify the MBBs here by number.
Reviewed By: mtrofin, rahmanl
Differential Revision: https://reviews.llvm.org/D147366
For Big Endian, the function `emitGlobalConstantLargeInt` tries to right shift `Realigned` by an amount `ExtraBitSize` in place. However, if the constant to emit has a bit width less than 64 and the bit width is not a multiple of 8, the shift amount will be greater than the bit width of `Realigned`, which causes assertion error described in issue [[ https://github.com/llvm/llvm-project/issues/59055 | issue #59055 ]].
This patch fixes the issue by avoiding right shift when bit width is under 64 to avoid the assertion error.
Reviewed By: Peter
Differential Revision: https://reviews.llvm.org/D138246
This reverts commit db6a979ae82410e42430e47afa488936ba8e3025.
Reland D102817 without any change. The previous revert was a mistake.
Differential Revision: https://reviews.llvm.org/D102817
This patch adds a basic block profile dump option within the AsmPrinter
and dumps basic block profile information so that cost models can use
the data for downstream analysis.
Differential Revision: https://reviews.llvm.org/D143311
Emit all constant integers produced by SanitizerBinaryMetadata as
ULEB128 to further reduce binary space used. Increasing the version is
not necessary given this change depends on (and will land) along with
the bump to v2.
To support this, the !pcsections metadata format is extended to allow
for per-section options, encoded in the first MD operator which must
always be a string and contain the section: "<section>!<options>".
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D143484
The Language Reference says that aliases can have available_externally
linkage if their aliasee is an available_externally global value. Using
this kind of aliases resulted in crashes during code generation, filter
them out (the same that the AsmPrinter also filters out GlobalVariables
in emitSpecialLLVMGlobal(); Functions are discarded in the machine pass
infrastructure).
Differential Revision: https://reviews.llvm.org/D142352
Computing EH-related information was only relevant for analysis passes so far. Lifting it to IR will allow the IR Verifier to calculate EH funclet coloring and validate funclet operand bundles in a follow-up step.
Reviewed By: rnk, compnerd
Differential Revision: https://reviews.llvm.org/D138122
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.
####Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows.
- Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
- Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
- While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
- The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
- In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
- Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else.
####Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.
####Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
generation code from DWARFLinker. It adds command line option:
--build-accelerator [none,DWARF]
Build accelerator tables(default: none)
=none - Do not build accelerators
=DWARF - Build accelerator tables according to the resulting DWARF version
DWARFv4: .debug_pubnames and .debug_pubtypes
DWARFv5: .debug_names
Differential Revision: https://reviews.llvm.org/D139638
This is a follow up to D141317 which extends the common code to include a target independent pseudo instruction. This is an alternative to (subset of) D92842 which tries to be as close to NFC as possible.
A couple things to call out.
* The test change in X86 is because we loose the scheduling information on the instruction. However, I think this was actually a bug in x86 since no instruction was emitted for a MEMBARRIER. Concluding that a meta instruction has latency just seems wrong?
* I intentionally left some parts of D92842 out. Specifically, several of the changes in the X86 code (data independence and outlining) appear functional, and likely worthy of their own review. Additionally, I'm not handling ARM/AArch64 at all. Those targets need the ordering whereas none of the others do. I want to get this in and tested before retrofitting in ordering to support those targets.
Differential Revision: https://reviews.llvm.org/D141408
Following support from the previous patches in this stack being added for
variadic DBG_INSTR_REFs to exist, this patch modifies LiveDebugValues to
handle those instructions. Support already exists for DBG_VALUE_LISTs, which
covers most of the work needed to handle these instructions; this patch only
modifies the transferDebugInstrRef function to correctly track them.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D133927
Prior to this patch, variadic DIExpressions (i.e. ones that contain
DW_OP_LLVM_arg) could only be created by salvaging debug values to create
stack value expressions, resulting in a DBG_VALUE_LIST being created. As of
the previous patch in this patch stack, DBG_INSTR_REF's syntax has been
changed to match DBG_VALUE_LIST in preparation for supporting variadic
expressions. This patch adds some minor changes needed to allow variadic
expressions that aren't stack values to exist, and allows variadic expressions
that are trivially reduceable to non-variadic expressions to be handled
similarly to non-variadic expressions.
Reviewed by: jmorse
Differential Revision: https://reviews.llvm.org/D133926
The field is currently `void*`, which was originlly chosen in 2010 to not need to include `DenseMap`. Since then, `DenseMap` has been included in the header file anyways, so there is no more need to for the indirection via `void*` and the cruft around it can be removed.
Differential Revision: https://reviews.llvm.org/D140758
`DEBUG_VALUE` comments are printed before an instruction, so they are
not printed with `AddComment` method as other comments are, but printed
using `emitRawComment` method. But currently `emitDebugValueComment`
calls `emitRawComment` twice for target-index-based `DBG_VALUE`s: once
in the `switch`-`case`,
d77ae7f251/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (L1192-L1193)
and again at the end of the method:
d77ae7f251/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp (L1227-L1228)
This makes them printed twice. I think this happened through multiple
commits modifying and refactoring this method.
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D139579
Next patch after D139548 and D139439. Same expectations, the change seems safe with as far as llvm goes, we cannot check downstream implementations.
Differential Revision: https://reviews.llvm.org/D139614
In the same vein as D139439, the patch is not NFC as there is no way to check all downstream implementations but the patch seems pretty safe.
Differential Revision: https://reviews.llvm.org/D139548
Before performing this change, I checked that `ByteAlignment` was never `0` inside `MCAsmStreamer:emitZeroFill` and `MCAsmStreamer::emitLocalCommonSymbol`.
I believe it is NFC as `0` values are illegal in `emitZeroFill` anyways, `Log2(ByteAlignment)` would be undefined.
And currently, all calls to `emitLocalCommonSymbol` are provably `>0`.
Differential Revision: https://reviews.llvm.org/D139439
This breaks Windows bots with
`warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)`
Some shift operators are lacking a proper literal unit ('1ULL' instead of
'1'). Will reland once fixed.
This reverts commit c621c1a8e81856e6bf2be79714767d80466e9ede.
Before performing this change, I checked that `ByteAlignment` was never `0` inside `MCAsmStreamer:emitZeroFill` and `MCAsmStreamer::emitLocalCommonSymbol`.
I believe it is NFC as `0` values are illegal in `emitZeroFill` anyways, `Log2(ByteAlignment)` would be undefined.
And currently, all calls to `emitLocalCommonSymbol` are provably `>0`.
Differential Revision: https://reviews.llvm.org/D139439
Let Propeller use specialized IDs for basic blocks, instead of MBB number.
This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.
####Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows.
- Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
- Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
- While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
- The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
- In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
- Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline. Hence, MBB numbers are not suitable and we need something else.
####Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.
To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.
The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.
####Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D100808
With D135642 ignoring unregistered symbols, isTransitiveUsedByMetadataOnly added
by D101512 is no longer needed (the operation is potentially slow). There is a
`.addrsig_sym` directive for an only-used-by-metadata symbol but it does not
emit an entry.
Reviewed By: rnk
Differential Revision: https://reviews.llvm.org/D138362
This patch is the Part-2 (BE LLVM) implementation of HW Exception handling.
Part-1 (FE Clang) was committed in 797ad701522988e212495285dade8efac41a24d4.
This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).
Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.
NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.
The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules:
First, no exception can move in or out of _try region., i.e., no "potential faulty
instruction can be moved across _try boundary.
Second, the order of exceptions for instructions 'directly' under a _try must be preserved
(not applied to those in callees).
Finally, global states (local/global/heap variables) that can be read outside of _try region
must be updated in memory (not just in register) before the subsequent exception occurs.
The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding process.
Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.
This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.
One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:
A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others. If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.
Implementation:
Part-1: Clang implementation (already in):
Please see commit 797ad701522988e212495285dade8efac41a24d4).
Part-2 : LLVM implementation described below.
For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).
For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.
The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.
Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.htmlhttps://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html
Differential Revision: https://reviews.llvm.org/D102817/new/
Extends the Asm reader/writer to support reading and writing the
'.memtag' directive (including allowing it on internal global
variables). Also add some extra tooling support, including objdump and
yaml2obj/obj2yaml.
Test that the sanitize_memtag IR attribute produces the expected asm
directive.
Uses the new Aarch64 MemtagABI specification
(https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst)
to identify symbols as tagged in object files. This is done using a
R_AARCH64_NONE relocation that identifies each tagged symbol, and these
relocations are tagged in a special SHT_AARCH64_MEMTAG_GLOBALS_STATIC
section. This signals to the linker that the global variable should be
tagged.
Reviewed By: fmayer, MaskRay, peter.smith
Differential Revision: https://reviews.llvm.org/D128958
This patch makes code less readable but it will clean itself after all functions are converted.
Differential Revision: https://reviews.llvm.org/D138665
The existing behaviors and callbacks were overlapping and had very
confusing semantics: beginBasicBlock/endBasicBlock were not always
called, beginFragment/endFragment seemed like they were meant to mean
the same thing, but were slightly different, etc. This resulted in
confusing semantics, virtual method overloads, and control flow.
Remove the above, and replace with new beginBasicBlockSection and
endBasicBlockSection callbacks. And document them.
These are always called before the first and after the last blocks in
a function, even when basic-block-sections are disabled.
basic block section cases
MachineBlockPlacement pass sets an alignment attribute to the loop
header MBB and this attribute will lead to an alignment directive during
emitting asm. In the case of the basic block section, the alignment
directive is put before the section label, and thus the alignment is set
to the predecessor of the loop header, which is not what we expect and
increases the code size (both inserting nop and set section alignment).
Reviewed By: rahmanl
Differential Revision: https://reviews.llvm.org/D137535
Interpret MD_pcsections in AsmPrinter emitting the requested metadata to
the associated sections. Functions and normal instructions are handled.
Differential Revision: https://reviews.llvm.org/D130879
This patch is essentially an alternative to https://reviews.llvm.org/D75836 and was mentioned by @lhames in a comment.
The gist of the issue is that Mach-O has restrictions on which kind of sections are allowed after debug info has been emitted, which is also properly asserted within LLVM. Problem is that stack maps are currently emitted as one of the last sections in each target-specific AsmPrinter so far, which would cause the assertion to trigger. The current approach of special casing for the `__LLVM_STACKMAPS` section is not viable either, as downstream users can overwrite the stackmap format using plugins, which may want to use different sections.
This patch fixes the issue by emitting the stack map earlier, right before debug info is emitted. The way this is implemented is by taking the choice when to emit the StackMap away from the target AsmPrinter and doing so in the base class. The only disadvantage of this approach is that the `StackMaps` member is now part of the base class, even for targets that do not support them. This is functionaly not a problem however, as emitting an empty `StackMaps` is a no-op.
Differential Revision: https://reviews.llvm.org/D132708
Warn if `.size` is specified for a function symbol. The size of a
function symbol is determined solely by its content.
I noticed this simplification was possible while debugging #57427, but
this change doesn't fix that specific issue.
Differential Revision: https://reviews.llvm.org/D132929