36747 Commits

Jeremy Morse
b468ed494a Reapply ccddb6ffad1, "Emit a worst-case prologue_end"
In 39b2979a4, Pavel kindly refined the implementation of a test so that
it no longer trips over this patch -- the test wants to exercise LLDB's
presentation of line-0 locations, rather than to always step on line zero
on entry to artificial_location.c. As that was what was tripping up this
change, reapply.

Original commit message follows.

[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)

prologue_end usually indicates where the end of the function initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source location, which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location: the
first instruction that appears to do meaningful computation. It'll be given
the function-scope line number, which should run on from the start of the
function anyway. This means that if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.

This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable, but it does mean we need to do more work to
detect and support those situations.
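
As a hand-written illustration (not part of the patch), consider IR where
optimisation has left the first instruction without any source location:

```ll
define i32 @square_plus_one(i32 %a) !dbg !4 {
entry:
  ; hoisted computation with no source location attached
  %mul = mul i32 %a, %a
  ; previously the first available location, so prologue_end landed here
  %add = add i32 %mul, 1, !dbg !7
  ret i32 %add, !dbg !7
}

!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!2, !3}
!0 = distinct !DICompileUnit(language: DW_LANG_C99, file: !1, emissionKind: FullDebug)
!1 = !DIFile(filename: "t.c", directory: "")
!2 = !{i32 7, !"Dwarf Version", i32 5}
!3 = !{i32 2, !"Debug Info Version", i32 3}
!4 = distinct !DISubprogram(name: "square_plus_one", file: !1, line: 1, scopeLine: 1, type: !5, spFlags: DISPFlagDefinition, unit: !0)
!5 = !DISubroutineType(types: !6)
!6 = !{null}
!7 = !DILocation(line: 3, scope: !4)
```

With the worst-case fallback, `%mul` can instead be flagged prologue_end
with the function-scope line (scopeLine 1), so a breakpoint on the function
lands at its start.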
2024-11-14 10:30:17 +00:00
Ricardo Jesus
e52238b59f
[AArch64] Add @llvm.experimental.vector.match (#101974)
This patch introduces an experimental intrinsic for matching the
elements of one vector against the elements of another.

For AArch64 targets that support SVE2, the intrinsic lowers to a MATCH
instruction for supported fixed-length and scalable vector types.
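
A minimal IR sketch of the new intrinsic (the overload mangling below is an
assumption for illustration):

```ll
declare <vscale x 16 x i1> @llvm.experimental.vector.match.nxv16i8.v16i8(<vscale x 16 x i8>, <16 x i8>, <vscale x 16 x i1>)

define <vscale x 16 x i1> @any_match(<vscale x 16 x i8> %haystack, <16 x i8> %needles, <vscale x 16 x i1> %mask) {
  ; set each active lane of the result where the corresponding element
  ; of %haystack equals any element of %needles
  %m = call <vscale x 16 x i1> @llvm.experimental.vector.match.nxv16i8.v16i8(<vscale x 16 x i8> %haystack, <16 x i8> %needles, <vscale x 16 x i1> %mask)
  ret <vscale x 16 x i1> %m
}
```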
2024-11-14 09:00:19 +00:00
Kyungwoo Lee
5a2888ddbd Revert "[CGData] Refactor Global Merge Functions (#115750)"
This reverts commit d3da78863c7021fa2447a168dc03ad791db69dc6.
2024-11-13 21:23:16 -08:00
Kyungwoo Lee
d3da78863c
[CGData] Refactor Global Merge Functions (#115750)
This is a follow-up PR to refactor the initial global merge function
pass implemented in #112671.

It first collects stable functions relevant to the current module and
iterates over those only, instead of iterating through all stable
functions in the stable function map.

This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-13 21:15:19 -08:00
alx32
f407dff50c
[DebugInfo][DWARF] Emit Per-Function Line Table Offsets and End Sequences (#110192)
**Summary**

This patch introduces a new compiler option `-mllvm
-emit-func-debug-line-table-offsets` that enables the emission of
per-function line table offsets and end sequences in DWARF debug
information. This enhancement allows tools and debuggers to accurately
attribute line number information to their corresponding functions, even
in scenarios where functions are merged or share the same address range
due to optimizations like Identical Code Folding (ICF) in the linker.

**Background**
RFC: [New DWARF Attribute for Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)

Previous similar PR:
[#93137](https://github.com/llvm/llvm-project/pull/93137) -- that PR was
very similar to this one, but at the time the assembler had no support
for emitting labels within the line table. That support was added in PR
[#99710](https://github.com/llvm/llvm-project/pull/99710), and this PR
builds on some of the support added there.

In the current implementation, Clang generates line information in the
`debug_line` section without directly associating line entries with
their originating `DW_TAG_subprogram` DIEs. This can lead to issues when
post-compilation optimizations merge functions, resulting in overlapping
address ranges and ambiguous line information.

For example, when functions are merged by ICF in LLD, multiple functions
may end up sharing the same address range. Without explicit linkage
between functions and their line entries, tools cannot accurately
attribute line information to the correct function, adversely affecting
debugging and call stack resolution.


**Implementation Details**
To address the above issue, the patch makes the following key changes:

**`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific
attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE.
This attribute holds a label pointing to the offset in the line table
where the function's line entries begin.

**End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after
each function's line entries in the line table. This marks the end of
the line information for that function, ensuring that line entries are
correctly delimited.

**Assembler and Streamer Modifications**: Modifies the MCStreamer and
related classes to support emitting the necessary labels and tracking
the current function's line entries. A new flag
GenerateFuncLineTableOffsets is added to control this behavior.

**Compiler Option**: Introduces the `-mllvm
-emit-func-debug-line-table-offsets` option to enable this
functionality, allowing users to opt-in as needed.
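
Schematically, with an invented line-table offset, a subprogram DIE would
carry something like:

```
DW_TAG_subprogram
    DW_AT_name	("foo")
    // hypothetical offset of foo's row block in .debug_line
    DW_AT_LLVM_stmt_sequence	(0x00000030)
    // other attributes here
```

The rows starting at that offset end with an explicit DW_LNE_end_sequence,
so a consumer can recover exactly foo's line entries even when an
ICF-folded sibling shares the same code addresses.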
2024-11-13 18:51:34 -08:00
Kyungwoo Lee
d23c5c2d65
[CGData] Global Merge Functions (#112671)
This implements a global function merging pass. Unlike traditional
function merging passes that use IR comparators, this pass employs a
structurally stable hash to identify similar functions while ignoring
certain constant operands. These ignored constants are tracked and
encoded into a stable function summary. When merging, instead of
explicitly folding similar functions and their call sites, we form a
merging instance by supplying different parameters via thunks. The
actual size reduction occurs when identically created merging instances
are folded by the linker.

Currently, this pass is wired in as a pre-codegen pass, enabled by the
`-enable-global-merge-func` flag.
In a local merging mode, the analysis and merging steps occur
sequentially within a module:
- `analyze`: Collects stable function hashes and tracks locations of
ignored constant operands.
- `finalize`: Identifies merge candidates with matching hashes and
computes the set of parameters that point to different constants.
- `merge`: Uses the stable function map to optimistically create a
merged function.

We can enable a global merging mode similar to the global function
outliner
(https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753/),
which will perform the above steps separately.
- `-codegen-data-generate`: During the first round of code generation,
we analyze local merging instances and publish their summaries.
- Offline using `llvm-cgdata` or at link-time, we can finalize all these
merging summaries that are combined to determine parameters.
- `-codegen-data-use`: During the second round of code generation, we
optimistically create merging instances within each module, and finally,
the linker folds identically created merging instances.

Depends on #112664
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
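
Conceptually (a hand-written sketch, not the pass's actual output or
naming), two functions that differ only in a constant:

```ll
define i32 @f(i32 %x) {
  %r = add i32 %x, 1
  ret i32 %r
}

define i32 @g(i32 %x) {
  %r = add i32 %x, 2
  ret i32 %r
}
```

become a parameterized merged body plus thunks:

```ll
; the differing constant is lifted into a parameter
define internal i32 @f.g.merged(i32 %x, i32 %c) {
  %r = add i32 %x, %c
  ret i32 %r
}

define i32 @f(i32 %x) {
  %r = tail call i32 @f.g.merged(i32 %x, i32 1)
  ret i32 %r
}

define i32 @g(i32 %x) {
  %r = tail call i32 @f.g.merged(i32 %x, i32 2)
  ret i32 %r
}
```

The size win materializes when identically created merged bodies from
different modules are folded by the linker.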
2024-11-13 17:34:07 -08:00
Alex MacLean
7a8fe0f83c
[SelectionDAG] Fixup type usage of CondCodeAction table (#116082)
Ensure that all uses of the CondCodeAction table check the compared
types, not the produced type. This is a prerequisite to landing #115035
2024-11-13 13:20:16 -08:00
Matt Arsenault
04d450fd8d
AtomicExpand: Preserve metadata when bitcasting fp atomicrmw xchg (#115240) 2024-11-13 12:51:18 -08:00
Augusto Noronha
67fb2686fb
[DebugInfo] Add a specification attribute to LLVM DebugInfo (#115362)
Add a specification attribute to LLVM DebugInfo, which is analogous
to DWARF's DW_AT_specification. According to the DWARF spec:
"A debugging information entry that represents a declaration that
completes another (earlier) non-defining declaration may have a
DW_AT_specification attribute whose value is a reference to the
debugging information entry representing the non-defining declaration."

This patch allows types to be specifications of other types. This is
used by Swift to represent generic types. For example, given this Swift
program:

```
struct MyStruct<T> {
    let t: T
}

let variable = MyStruct<Int>(t: 43)
```

The Swift compiler emits (roughly) an unsubstituted type for MyStruct<T>:
```
DW_TAG_structure_type
    DW_AT_name	("MyStruct")
    // "$s1w8MyStructVyxGD" is a Swift mangled name roughly equivalent to 
    // MyStruct<T>
    DW_AT_linkage_name	("$s1w8MyStructVyxGD")
    // other attributes here
```
And a specification for MyStruct<Int>:
```
DW_TAG_structure_type
    DW_AT_specification	(<link to "MyStruct">)
    // "$s1w8MyStructVySiGD" is a Swift mangled name equivalent to
    // MyStruct<Int>
    DW_AT_linkage_name	("$s1w8MyStructVySiGD")
    DW_AT_byte_size	(0x08)
    // other attributes here
```
2024-11-13 09:55:37 -08:00
Jay Foad
a33ae1b7df
[LiveRangeCalc] Fix isJointlyDominated (#116020)
Check that every path from the entry block to the use block passes
through at least one def block. Previously we only checked that at least
one path passed through a def block.
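
For example, in this hand-written CFG only one of the two entry-to-use
paths passes through the def block, so the use is not jointly dominated:

```ll
define void @diamond(i1 %c) {
entry:
  br i1 %c, label %def, label %skip
def:                              ; the only def block
  br label %use
skip:                             ; bypasses every def block
  br label %use
use:
  ret void
}
```

The old check accepted this because the path entry -> def -> use hits a
def block; the fixed check rejects it because entry -> skip -> use does
not.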
2024-11-13 13:36:48 +00:00
Kazu Hirata
735ab61ac8
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-11-12 23:15:06 -08:00
Thorsten Schütt
0e97b4d05a
[GlobalISel] Combine G_MERGE_VALUES of x and undef (#113616)
into anyext x

; CHECK-NEXT: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s32),
[[DEF]](s32)

Please continue padding merge values.

//   %bits_8_15:_(s8) = G_IMPLICIT_DEF
//   %0:_(s16) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8)

%bits_8_15 is defined by undef. Its value is undefined and we can pick
an arbitrary value. For optimization, we pick anyext, which plays well
with the undefinedness.

//   %0:_(s16) = G_ANYEXT %bits_0_7:(s8)

The upper bits of %0 are undefined and the lower bits come from
%bits_0_7.
2024-11-12 23:23:32 +01:00
Benjamin Maxwell
014455a587
[SDAG] Limit sincos/frexp stack slot folding to stores chained to entry (#115906)
When the chain is not the entry node, there is a risk that the stores are
within a (CALLSEQ_START, CALLSEQ_END) pair, which, when the node is
expanded, will lead to nested call sequences.

It should be possible to check for this and allow more cases, but for
now, let's limit this to cases where it's definitely safe.

Fixes #115323
2024-11-12 20:48:41 +00:00
Zaara Syeda
aaa37d6755
[PPC] Replace PPCMergeStringPool with GlobalMerge for Linux (#114850)
Enable merging all constants in GlobalMerge by default, without looking
at their uses, to replace the PPCMergeStringPool pass on Linux.
2024-11-12 14:02:01 -05:00
Kazu Hirata
4048c64306
[llvm] Remove redundant control flow statements (NFC) (#115831)
Identified with readability-redundant-control-flow.
2024-11-12 10:09:42 -08:00
Ellis Hoag
57c33acac8
[MachineSink] Sink into consistent blocks for optsize funcs (#115367)
Do not consider profile data when choosing a successor block to sink
into for optsize functions. This should result in more consistent
instruction sequences, which will improve outlining and ICF. We've
observed a slight codesize improvement in a large binary. The reasoning
is similar to https://github.com/llvm/llvm-project/pull/114607.

Using profile data to select a block to sink into was originally added in
d04f7596e7.
2024-11-12 09:53:27 -08:00
Ellis Hoag
b8d6659bff
[CodeLayout] Do not flip branch condition when using optsize (#114607)
* Do not use profile data to flip a branch condition when optimizing for
size. This should improve outlining and ICF due to more uniform
instruction sequences.
* Refactor `optimizeBranches()` to use early `continue`s.
* Use the correct debug location for `insertBranch()`.
2024-11-12 09:50:29 -08:00
Jeremy Morse
ccddb6ffad Revert "[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)"
This reverts commit bf483ddb42065405e345393e022dc72357ec5a3a.

See the PR: there's a test checking for this behaviour (possibly
adaptable), and a duplicate line entry too.
2024-11-12 17:07:56 +00:00
Nikita Popov
63fb980d50
[IR] Add helper for comparing KnownBits with IR predicate (NFC) (#115878)
Add `ICmpInst::compare()` overload accepting `KnownBits`, similar to the
existing one accepting `APInt`. This is not directly part of KnownBits
(or APInt) for layering reasons.
2024-11-12 17:41:08 +01:00
Stephen Tozer
fe18ab983d
[DebugInfo] Don't apply is_stmt on MBB branches that preserve lines (#108251)
This patch follows on from the changes made in #105524, by adding an
additional heuristic that prevents us from applying the start-of-MBB
is_stmt flag when we can see that, for all direct branches to the MBB,
the last line stepped on before the branch is the same as the first line
of the MBB. This is mainly to prevent certain pathological cases, such
as macros that expand to multiple basic blocks that all have the same
source location, from giving us repeated steps on the same line. This
approach is not comprehensive, since it relies on analyzeBranch to read
edges, but the default fallback of applying is_stmt may lead only to
useless steps in some cases, rather than skipping useful steps
altogether.
2024-11-12 16:23:35 +00:00
Jeremy Morse
bf483ddb42
[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)
prologue_end usually indicates where the end of the function initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source location, which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location: the
first instruction that appears to do meaningful computation. It'll be given
the function-scope line number, which should run on from the start of the
function anyway. This means that if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.

This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable, but it does mean we need to do more work to
detect and support those situations.
2024-11-12 15:09:40 +00:00
Pengcheng Wang
5a1f239df5
[MISched] Add a hook to override PostRA scheduling policy (#115455)
PostRA scheduling now supports different directions, but the direction
can only be specified via command-line options.

This patch adds a new hook `overridePostRASchedPolicy` for targets
to override the PostRA scheduling policy.

Note that some options like tracking register pressure won't take
effect in PostRA scheduling.
2024-11-12 18:14:57 +08:00
Feng Zou
28e4aad45a
[X86][BF16] Add libcall for FP128 -> BF16 (#115825)
This is to fix #115710.
2024-11-12 15:54:09 +08:00
Thorsten Schütt
e399322d5e
[GlobalISel] Import llvm.stepvector (#115721) 2024-11-11 21:35:22 +01:00
Michael Maitland
2b58458225
[MIRLexer][RISCV] Eat a space after the Machine comment (#115365)
The MIRPrinter emits ` :: ` at the start of an MMO. The MIRLexer eats all
the whitespace after the operand and before the `::` when there is no
comment. We need to eat the space after the comment too, to allow the
MIRLexer to parse comments on an MMO.
2024-11-11 14:48:31 -05:00
Daniel Sanders
74003f11b3
[mc] Add CFI directive to emit val_offset() rules (#113971)
These specify that the value of the given register in the previous frame
is the CFA plus some offset. This isn't very common but can be necessary
if the original value is normally reconstructed from the stack/frame
pointer instead of being saved on the stack and reloaded from there.
2024-11-11 11:38:36 -08:00
David Green
b242ae32f5 [AArch64][GlobalISel] Protect against undef first element in CombineShuffleConcat.
In case the first element is undef, we need to look through to find a valid
type for the inputs.
2024-11-11 19:37:51 +00:00
Thorsten Schütt
a5d09f4ad9
[GlobalISel] Add G_STEP_VECTOR instruction (#115598)
aka the llvm.stepvector intrinsic.
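
A minimal use of the intrinsic that G_STEP_VECTOR represents (assuming
its current non-experimental name):

```ll
declare <vscale x 4 x i32> @llvm.stepvector.nxv4i32()

define <vscale x 4 x i32> @steps() {
  ; yields <0, 1, 2, 3, ...> out to the runtime vector length
  %v = call <vscale x 4 x i32> @llvm.stepvector.nxv4i32()
  ret <vscale x 4 x i32> %v
}
```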
2024-11-11 10:45:02 +01:00
David Green
a4e507df7a
[AArch64][GlobalISel] Do not create LIFETIME instructions in functions. (#115669)
For the same reason that we do not translate lifetime markers at -O0, we
should not translate them for optnone functions either.
2024-11-11 09:27:40 +00:00
David Sherwood
69b39e7cc7
[SelectionDAG] Add support for extending masked loads in computeKnownBits (#115450)
We already support computing known bits for extending loads, but not for
masked loads. For now I've only added support for zero-extends because
that's the only thing currently tested. Even when the passthru value is
poison we still know the top X bits are zero.
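
A hand-written sketch of the pattern this enables: once known bits are
computed through the zero-extending masked load, the `and` below is
provably redundant:

```ll
declare <4 x i16> @llvm.masked.load.v4i16.p0(ptr, i32 immarg, <4 x i1>, <4 x i16>)

define <4 x i32> @load_zext(ptr %p, <4 x i1> %m) {
  %v = call <4 x i16> @llvm.masked.load.v4i16.p0(ptr %p, i32 2, <4 x i1> %m, <4 x i16> zeroinitializer)
  %e = zext <4 x i16> %v to <4 x i32>
  ; the top 16 bits of %e are known zero, so this mask can fold away
  %r = and <4 x i32> %e, <i32 65535, i32 65535, i32 65535, i32 65535>
  ret <4 x i32> %r
}
```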
2024-11-11 09:17:49 +00:00
Kazu Hirata
5b19ed8bb4
[llvm] Migrate away from PointerUnion::{is,get,dyn_cast} (NFC) (#115626)
Note that PointerUnion::{is,get,dyn_cast} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>
2024-11-10 07:24:06 -08:00
Alex Bradbury
5a08acc1e7
[LegalizeTypes] Support softening FMINIMUM/FMAXIMUM (#115463)
Without this, you get an error "Do not know how to soften the result of
this operator!" when compiling for a soft float target.

The libcall names match those defined in glibc
<https://www.gnu.org/software/libc/manual/html_node/Misc-FP-Arithmetic.html>
and more recently added to LLVM's libc
<https://github.com/llvm/llvm-project/pull/86016>.
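
For example, IR like this hand-written sketch previously failed to
legalize when floats are illegal, and is now softened into the
corresponding libcall (`fminimumf` for f32, per the glibc naming):

```ll
declare float @llvm.minimum.f32(float, float)

define float @softmin(float %a, float %b) {
  ; on a soft-float target this becomes a libcall instead of an ISel error
  %r = call float @llvm.minimum.f32(float %a, float %b)
  ret float %r
}
```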
2024-11-09 10:02:06 +00:00
paperchalice
fe63669282
[Instrumentation] Support MachineFunction in OptNoneInstrumentation (#115471)
Support `MachineFunction` in `OptNoneInstrumentation`, and add
`isRequired` to all necessary passes.
2024-11-09 16:50:11 +08:00
Kazu Hirata
b83399eab6
[GlobalISel] Remove unused includes (NFC) (#115429)
Identified with misc-include-cleaner.
2024-11-08 22:28:47 -08:00
Mirko
5e02fd8d0b
[CodeGen][X86] LiveRangeShrink: fix increment after end (#115276)
This fixes the infinite loop discovered in #114195. 
Since we skip debug instructions at the start of the loop we do not need
to skip them again at the end of the loop.
2024-11-08 20:25:57 -08:00
David Green
e8ce76f1a6
[GlobalISel][AArch64] Allow vector ptr to int unmerges (#115228)
Vector pointer -> scalar integer unmerges are already legal. This
loosens the verifier check for vector-of-pointers -> vectors.
2024-11-08 19:45:08 +00:00
Craig Topper
17f3e00911 Recommit "[GISel][AArch64][AMDGPU][RISCV] Canonicalize (sub X, C) -> (add X, -C) (#114309)"
The increase in fallbacks that was previously reported was not caused
by this change.

Original description:

This matches InstCombine and DAGCombine.

RISC-V only has an ADDI instruction so without this we need additional
patterns to do the conversion.

Some of the AMDGPU tests look like possible regressions. Maybe some
patterns from isel aren't imported.
2024-11-08 10:21:46 -08:00
David Sherwood
b9dd60228c
[DAGCombiner] Remove a hasOneUse check in visitAND (#115142)
For some reason there was a hasOneUse check on the splat for the
second operand, and it's not obvious why. The check blocks
optimisations for lowering of nodes like AVGFLOORU and AVGCEILU.

In a follow-on patch I also plan to improve the generated code
for AVGCEILU further by teaching computeKnownBits about
zero-extending masked loads.
2024-11-08 08:20:31 +00:00
Thorsten Schütt
9061e6e58a
[GlobalISel][AArch64] Legalize G_EXTRACT_VECTOR_ELT for SVE (#115161)
AArch64InstrGISel.td defines:
def : GINodeEquiv<G_EXTRACT_VECTOR_ELT, vector_extract>;

There are many patterns for SVE. Let's exploit that fact.
2024-11-08 07:58:17 +01:00
Pengcheng Wang
ee1608dd8e
[CodeGen][MISched] Set DumpDirection after initPolicy (#115112)
Previously we set the dump direction according to command line
options, but we may override the scheduling direction in `initPolicy`,
and this results in a mismatch between the dump and the actual policy.

Here we simply set the dump direction after initializing the policy.
2024-11-08 11:45:36 +08:00
Andrei Safronov
3b1b1271fb
[Xtensa] Implement support for the BranchRelaxation. (#113450) 2024-11-08 01:50:42 +03:00
Valery Pykhtin
9470945b66
[CalcSpillWeights] Simplify copy hint register collection. NFC. (#114236)
The CopyHints set collected duplicates of a register with increasing
weight, which were then deduplicated against the HintedRegs set. Let's
stop collecting duplicates in the first place.
2024-11-07 12:52:08 +01:00
abhishek-kaushik22
d2aff182d3
Revert "TLS loads opimization (hoist)" (#114740)
This reverts commit c31014322c0b5ae596da129cbb844fb2198b4ef4.

Based on the discussions in #112772, this pass is not needed after the
introduction of the `llvm.threadlocal.address` intrinsic.

Fixes https://github.com/llvm/llvm-project/issues/112771.
2024-11-07 10:10:28 +01:00
Konstantin Schwarz
cbfe87c253
[GlobalISel] Remove references to rhs of shufflevector if rhs is undef (#115076) 2024-11-06 16:36:13 -08:00
Augusto Noronha
f6617d65e4
[DebugInfo] Add num_extra_inhabitants to debug info (#112590)
An extra inhabitant is a bit pattern that does not represent a valid
value for instances of a given type. The number of extra inhabitants is
the number of those bit configurations.

This is used by Swift to save space when composing types. For example,
because Bool only needs 2 bit patterns to represent all of its values
(true and false), an Optional<Bool> only occupies 1 byte in memory by
using a bit configuration that is unused by Bool. Which bit patterns are
unused is part of the ABI of the language.

Since Swift generics are not monomorphized, by using dynamic libraries
you can have generic types whose size, alignment, etc, are known only
at runtime (which is why this feature is needed).

This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF
as an Apple extension.
2024-11-06 15:48:04 -08:00
Craig Topper
cff2199e0f Revert "[GISel][AArch64][AMDGPU][RISCV] Canonicalize (sub X, C) -> (add X, -C) (#114309)"
This reverts commit 999dfb2067eb75609b735944af876279025ac171.

I received a report that this may have increased fallbacks on AArch64.
2024-11-06 10:45:23 -08:00
Yingwei Zheng
f74aed7938
[DAGCombiner] Add basic support for trunc nsw/nuw (#113808)
This patch adds basic support for `trunc nsw/nuw` in SDAG. It will allow
DAGCombiner to further eliminate in-reg `zext/sext` instructions.
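
For example (a hand-written sketch): `nsw` promises the truncated value
fits in the narrower type as a signed integer, so sign-extending it back
is a no-op the combiner can now remove:

```ll
define i64 @roundtrip(i64 %x) {
  ; with nsw, sext(trunc nsw x) == x, so %s can fold to %x
  %t = trunc nsw i64 %x to i32
  %s = sext i32 %t to i64
  ret i64 %s
}
```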
2024-11-07 00:23:53 +08:00
Benjamin Maxwell
ea6b8fa4b9
[SDAG] Merge multiple-result libcall expansion into DAG.expandMultipleResultFPLibCall() (#114792)
This merges the logic for expanding both FFREXP and FSINCOS into one
method `DAG.expandMultipleResultFPLibCall()`. This reduces duplication
and also allows FFREXP to benefit from the stack slot elimination
implemented for FSINCOS. This method will also be used in future to
implement more multiple-result intrinsics (such as modf and sincospi).
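
For instance, a hand-written `llvm.frexp` call like the one below expands
to a single `frexp` libcall whose integer result is returned through a
stack slot; the shared expansion can now elide that slot in the same way
as the FSINCOS path:

```ll
declare { float, i32 } @llvm.frexp.f32.i32(float)

define { float, i32 } @split(float %x) {
  ; lowers to a call to frexpf(float, int*) on most targets
  %r = call { float, i32 } @llvm.frexp.f32.i32(float %x)
  ret { float, i32 } %r
}
```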
2024-11-06 11:06:06 +00:00
Heejin Ahn
492812f613
[WebAssembly] Fix rethrow's index calculation (#114693)
So far we have assumed that we only rethrow the exception caught in the
innermost EH pad. This is true in code we directly generate, but after
inlining this may not be the case. For example, consider this code:
```ll
ehcleanup:
  %0 = cleanuppad ...
  call @destructor
  cleanupret from %0 unwind label %catch.dispatch
```

If `destructor` gets inlined into this function, the code can look like this:
```ll
ehcleanup:
  %0 = cleanuppad ...
  invoke @throwing_func
    to label %unreachable unwind label %catch.dispatch.i

catch.dispatch.i:
  catchswitch ... [ label %catch.start.i ]

catch.start.i:
  %1 = catchpad ...
  invoke @some_function
    to label %invoke.cont.i unwind label %terminate.i

invoke.cont.i:
  catchret from %1 to label %destructor.exit

destructor.exit:
  cleanupret from %0 unwind label %catch.dispatch
```

We lower a `cleanupret` into `rethrow`, which assumes it rethrows the
exception caught by the nearest dominating EH pad. But after the
inlining, the nearest dominating EH pad is not `ehcleanup` but
`catch.start.i`.

The problem exists in the same manner in the new (exnref) EH, because it
assumes the exception comes from the nearest EH pad and saves an exnref
from that EH pad and rethrows it (using `throw_ref`).

This problem can be fixed easily if `cleanupret` carries the basic block
where its matching `cleanuppad` is. The bitcode instruction `cleanupret`
sort of has that info (it has a token from the `cleanuppad`), but that
info is lost when we enter ISel, because `TargetSelectionDAG.td`'s
`cleanupret` node does not have any arguments:

5091a359d9/llvm/include/llvm/Target/TargetSelectionDAG.td (L700)
Note that `catchret` already has two basic block arguments, even though
neither of them means `catchpad`'s BB.

This PR adds the `cleanuppad`'s BB as an argument to the `cleanupret`
node in ISel and uses it in the Wasm backend. Because this node is also
used in the X86 backend, we need to account for the new argument there
too, but nothing more needs to change as long as X86 doesn't use it.

---

- Details about changes in the Wasm backend:

After this PR, our pseudo `RETHROW` instruction takes a BB, which means
the EH pad whose exception it needs to rethrow. There are currently two
ways to generate a `RETHROW`: one is from the `llvm.wasm.rethrow`
intrinsic and the other is from the `CLEANUPRET` discussed above. In case of
`llvm.wasm.rethrow`, we add a '0' as a placeholder argument when it is
lowered to a `RETHROW`, and change it to a BB in LateEHPrepare. As
written in the comments, this PR doesn't change how this BB is computed.
The BB argument will be converted to an immediate argument as with other
control flow instructions in CFGStackify.

In case of `CLEANUPRET`, it already has a BB argument pointing to an EH
pad, so it is just converted to a `RETHROW` with the same BB argument in
LateEHPrepare. This will also be lowered to an immediate in CFGStackify
with other control flow instructions.

---

Fixes #114600.
2024-11-05 21:45:13 -08:00
Jon Roelofs
4c3e1e3c4a
[llvm][AsmPrinter] Add an option to print instruction latencies (#113243)
... matching what we have in the disassembler. This isn't turned on by
default since several of the scheduling models are not completely
accurate, and we don't want to be misleading.
2024-11-05 17:28:52 -08:00