36785 Commits

Author SHA1 Message Date
Ellis Hoag
e72209db35
[MachineSink] Fix stable sort comparator (#116705)
Fix the comparator in `stable_sort()` to satisfy the strict weak
ordering requirement.

In https://github.com/llvm/llvm-project/pull/115367 this comparator was
changed to use `getCycleDepth()` when `shouldOptimizeForSize()` is true.
However, I mistakenly changed to logic so that we use `LHSFreq <
RHSFreq` if **either** of them are zero. This causes us to fail the last
requirment (https://en.cppreference.com/w/cpp/named_req/Compare).

> if comp(a, b) == true and comp(b, c) == true then comp(a, c) == true
2024-11-19 16:15:35 -08:00
Yashas Andaluri
b28eebf926
[RDF] Fix cover check when linking refs to defs (#113888)
During RDF graph construction, linkRefUp method links a register ref to
its upward reaching defs until all RegUnits of the ref have been covered
by defs.
However, when a sub-register def covers some, but not all, of the
RegUnits of a previous super-register def, a super-register ref is not
linked to the super-register def.
This can result in certain super register defs being dead code
eliminated.

This patch fixes the cover check for a register ref. A def must be
skipped only when all RegUnits of that def have already been covered by
a previously seen def.
2024-11-19 12:38:36 -06:00
Zaara Syeda
8e4423eb08
[AsmPrinter] Fix handling in emitGlobalConstantImpl for AIX (#116255)
When GlobalMerge creates a MergedGlobal of statics all initialized to
zero, emitGlobalConstantImpl sees a ConstantAggregateZero. This results
in just emitting zeros followed by labels for the aliases. We need to
handle it more like how emitGlobalConstantStruct does by emitting each
global inside the aggregate.

---------

Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
2024-11-19 09:58:25 -05:00
Sander de Smalen
3093b29b59
[RegisterCoalescer] Fix up subreg lanemasks after rematerializing. (#116191)
In a situation like the following:

```
   undef %2.subreg = INST %1         ; DefMI (rematerializable),
                                     ; DefSubIdx = subreg
   %3 = COPY %2                      ; SrcIdx = DstIdx = 0
   .... = SOMEINSTR %3, %2
```
there are no subranges for `%3` because the entire register is copied,
but after rematerialization the subrange of the rematerialized value
must be fixed up with the appropriate subranges for `.subreg`.

(To me this issue seemed a bit similar to the issue fixed by #96839, but
then related to rematerialization)
2024-11-19 08:46:55 +00:00
Shubham Sandeep Rastogi
e914d97327 Revert "[NFC] Move DroppedVariableStats to its own file and redesign it to be extensible. (#115563)"
This reverts commit 2de78815604e9027efd93cac27c517bf732587d2.

Reverted due to buildbot failure:

unittests/IR/CMakeFiles/IRTests.dir/DroppedVariableStatsIRTest.cpp.o:DroppedVariableStatsIRTest.cpp:function llvm::DroppedVariableStatsIR::runAfterPass(llvm::StringRef, llvm::Any): error: undefined reference to 'llvm::DroppedVariableStatsIR::runOnModule(llvm::Module const*, bool)'
2024-11-18 16:05:09 -08:00
Shubham Sandeep Rastogi
81924ac1fb Revert "Add a pass to collect dropped var stats for MIR. (#115566)"
This reverts commit 6e2b77d4696d4a672635c0ba1ead4824e2158a7d.

Reverting due to buildbot failure:

unittests/IR/CMakeFiles/IRTests.dir/DroppedVariableStatsIRTest.cpp.o:DroppedVariableStatsIRTest.cpp:function llvm::DroppedVariableStatsIR::runAfterPass(llvm::StringRef, llvm::Any): error: undefined reference to 'llvm::DroppedVariableStatsIR::runOnModule(llvm::Module const*, bool)'
2024-11-18 16:05:09 -08:00
Shubham Sandeep Rastogi
6e2b77d469
Add a pass to collect dropped var stats for MIR. (#115566)
This patch uses the DroppedVariableStats class to add dropped variable
statistics for MIR passes.
2024-11-18 15:56:06 -08:00
Shubham Sandeep Rastogi
2de7881560
[NFC] Move DroppedVariableStats to its own file and redesign it to be extensible. (#115563)
Move DroppedVariableStats code to its own file and change the class to
have an extensible design so that we can use it to add dropped
statistics to MIR passes and the instruction selector.
2024-11-18 15:48:53 -08:00
Daniel Sanders
2310e3e3f2
[GlobalISel] Move DemandedElt's APInt size assert after isValid() check (#115979)
This prevents the assertion from wrongly triggering on invalid LLT's
2024-11-18 15:39:28 -08:00
Thorsten Schütt
f8d1905a24
[GlobalISel] Combine [S,U]SUBO (#116489)
We import the llvm.ssub.with.overflow.* Intrinsics, but the Legalizer
also builds them while legalizing other opcodes, see narrowScalarAddSub.
2024-11-18 22:39:23 +01:00
Lei Huang
ed8ebad6eb
[SelectionDAG] Support integer promotion for VP_LOAD and VP_STORE (#81299)
Add integer promotion support for for VP_LOAD and VP_STORE via legalization of extend
and truncate of each form.

Patch commandeered from: https://reviews.llvm.org/D109377
2024-11-18 13:32:58 -05:00
Ellis Hoag
c9260e21d0
[CodeLayout] Do not rebuild chains with -apply-ext-tsp-for-size (#115934)
https://github.com/llvm/llvm-project/pull/109711 disables
`buildCFGChains()` when `-apply-ext-tsp-for-size` is used to improve
codesize. Tail merging can change the layout and normally requires
`buildCFGChains()` to be called again, but we want to prevent this when
optimizing for codesize. We saw slight size improvement on large
binaries with this change. If `-apply-ext-tsp-for-size` is not used,
this should be a NFC.
2024-11-18 09:16:09 -08:00
Akshat Oke
3f9d02aae8
[CodeGen][NewPM] Port PeepholeOptimizer to NPM (#116326)
With this, all machine SSA optimization passes are available in the new codegen pipeline.
2024-11-18 11:02:01 +05:30
Akshat Oke
00aa08119a
[NFC] Clang format PeepholeOptimizer (#116325) 2024-11-18 10:58:48 +05:30
Brandon Wu
206ee71918
[RISCV] Change vector tuple type's TypeSize to scalable (#114329)
Vector tuple is basically multiple grouped vector, so its size is also
determined by vscale, we need not to model it as a vector type but its
size need to be scalable.
2024-11-17 18:52:49 +08:00
David Green
549413fa40 [AArch64][GlobalISel] Protect against folding loads across basic blocks.
isObviouslySafeToFold can look between a load and an instruction it can be
folded into, to check that no other memory operations prevents the fold. It
doesn't handle multiple basic blocks which we needs to guard against.
2024-11-16 19:52:44 +00:00
Simon Pilgrim
51809e4a26
[DAG] SimplifyDemandedVectorElts - add SimplifyMultipleUse handling to SEXT/ZEXT/TRUNC nodes (#116227)
Allows us to bypass multiple uses of a SEXT/ZEXT/TRUNC node operand
2024-11-16 12:40:42 +00:00
Thorsten Schütt
2906fcadb8
[GlobalISel] Combine G_MERGE_VALUES of x and zero (#116283)
into zext x

LegalizerHelper has two padding strategies: undef or zero.

see LegalizerHelper:273
see LegalizerHelper:315

This PR is about zero sugar and Coke Zero.

; CHECK-NEXT: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES %a(s32),
[[C]](s32)

Please continue padding merge values.

// %bits_8_15:(s8) = G_CONSTANT i8 0
// %0:(s16) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8)

%bits_8_15 is defined by zero. For optimization, we pick zext.

// %0:_(s16) = G_ZEXT %bits_0_7:(s8)

The upper bits of %0 are zero and the lower bits come from %bits_0_7.
2024-11-16 08:00:21 +01:00
Craig Topper
131d73ed34
[RegAlloc] Remove redundant prints of LiveInterval weight. (#116451)
LiveInterval::print has included the weight since early 2018. We don't
need to print again after we print the interval.
2024-11-15 16:43:30 -08:00
Kyungwoo Lee
816c975ea7
Fix crash from [CGData] Global Merge Functions (#112671) (#116241)
Module summary index is optional for this pass, and we shouldn't run it,
but import it as necessary.
2024-11-15 14:57:17 -08:00
Craig Topper
47a0e24a3b
[GISel][RISCV] Add G_SMIN/SMAX/UMIN/UMAX to GISelKnownBits::computeNumSignBits. (#116321) 2024-11-15 11:23:15 -08:00
Alex Bradbury
298127dcbe Reapply [IR] Initial introduction of llvm.experimental.memset_pattern (#97583)
Relands 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3 after regenerating the
test case.

Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't require that the number of bytes to set is a compile-time
constant
* Define llvm.memset_pattern rather than llvm.memset_pattern.inline

As discussed in the [RFC
thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496),
the intent is that the intrinsic will be lowered to loops, a sequence of
stores, or libcalls depending on the expected cost and availability of
libcalls on the target. Right now, there's just a single lowering path
that aims to handle all cases. My intent would be to follow up with
additional PRs that add additional optimisations when possible (e.g.
when libcalls are available, when arguments are known to be constant
etc).
2024-11-15 15:21:39 +00:00
Alex Bradbury
0fb8fac5d6 Revert "[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)"
This reverts commit 7ff3a9acd84654c9ec2939f45ba27f162ae7fbc3.

Recent scheduling changes means tests need to be re-generated. Reverting
to green while I do that.
2024-11-15 14:48:32 +00:00
Alex Bradbury
7ff3a9acd8
[IR] Initial introduction of llvm.experimental.memset_pattern (#97583)
Supersedes the draft PR #94992, taking a different approach following
feedback:
* Lower in PreISelIntrinsicLowering
* Don't require that the number of bytes to set is a compile-time
constant
* Define llvm.memset_pattern rather than llvm.memset_pattern.inline

As discussed in the [RFC
thread](https://discourse.llvm.org/t/rfc-introducing-an-llvm-memset-pattern-inline-intrinsic/79496),
the intent is that the intrinsic will be lowered to loops, a sequence of
stores, or libcalls depending on the expected cost and availability of
libcalls on the target. Right now, there's just a single lowering path
that aims to handle all cases. My intent would be to follow up with
additional PRs that add additional optimisations when possible (e.g.
when libcalls are available, when arguments are known to be constant
etc).
2024-11-15 14:07:46 +00:00
Haojian Wu
878b03e0b9 Remove an unused Passes include from CodeGen/RegUsageInfoPropagate.cpp
CodeGen should not depend on Passes component.
2024-11-15 08:40:45 +01:00
Akshat Oke
47928ab16b
[CodeGen][NewPM] Port RegUsageInfoPropagation pass to NPM (#114010) 2024-11-15 12:06:02 +05:30
Akshat Oke
2de1e06736
[CodeGen][NewPM] Port RegUsageInfoCollector pass to NPM (#113874) 2024-11-15 12:00:09 +05:30
Akshat Oke
7b54976d11
[CodeGen][NewPM] Port RegisterUsageInfo to NPM (#113873)
And add to the codegen pipeline if ipra is enabled with a `RequireAnalysisPass` since this is a module pass.
2024-11-15 10:49:00 +05:30
Carlos Alberto Enciso
c2a9bba4a3
[DebugInfo] Add DW_AT_artificial for compiler generated static member. (#115851)
Consider the case when the compiler generates a static member. Any
consumer of the debug info generated for that case, would benefit if
that member has the DW_AT_artificial flag.
2024-11-15 05:16:12 +00:00
Kyungwoo Lee
b3134fa233 Reland [CGData] Refactor Global Merge Functions (#115750)
This is a follow-up PR to refactor the initial global merge function
pass implemented in #112671.

It first collects stable functions relevant to the current module and
iterates over those only, instead of iterating through all stable
functions in the stable function map.

This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-14 15:27:17 -08:00
Konstantin Schwarz
0f0e2fe97b
[GlobalISel] Turn shuffle a, b, mask -> shuffle undef, b, mask iff mask does not reference a (#115377) 2024-11-14 15:13:41 -08:00
Matin Raayai
bb3f5e1fed
Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234)
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasis its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.

cc @arsenm @aeubanks
2024-11-14 13:30:05 -08:00
Graham Hunter
ed5aaddd7b
[IR] Vector extract last active element intrinsic (#113587)
As discussed in #112738, it may be better to have an intrinsic to represent vector element extracts based on mask bits. This intrinsic is for the case of extracting the last active element, if any, or a default value if the mask is all-false.

The target-agnostic SelectionDAG lowering is similar to the IR in #106560.
2024-11-14 17:48:43 +00:00
Sander de Smalen
be15fd5085
[InitUndef] handleSubReg should skip artificial subregs. (#116248)
When enabling subreg liveness tracking for AArch64, this pass fails
because it tries to get the register class for the artificial subreg
`sub_32_hi` of a 64-bit GPR. It tries to create an INIT_UNDEF
instruction for the top 32-bits of the 64-bit GPR, which are not
directly addressable, so getSubRegisterClass() returns a nullptr,
crashing this pass.

It should instead just avoid trying to create the INIT_UNDEF
instruction.
2024-11-14 17:06:40 +00:00
Jay Foad
6cb1847815 Fix typo "necessarilly" 2024-11-14 17:01:17 +00:00
Jeremy Morse
251958f357 [DebugInfo] Don't pick prologue_end if there are no instructions
Add a filter to avoid picking prologue_end when a function is empty (it may
have blocks but no instructions). This saves us from pushing more
validity-checking into findPrologueEndLoc.
2024-11-14 13:41:50 +00:00
Akshat Oke
43bef75fd6
[NFC][CodeGen] Clang format MachineSink.cpp (#114027)
Preparing to port this pass to new pass manager.
2024-11-14 18:48:35 +05:30
Sam Elliott
862f42eedf
[TargetLowering] Use Correct VT for Multi-out Asm (#116024)
This was overlooked in 7d940432c46be83b8fcb5dbefee439585fa820cd - when
inline assembly has multiple outputs, they are returned as members of a
struct, and the `getAsmOperandType` needs to be called for each member
of struct. The difference between this and the single-output case is
that in the latter, there isn't a struct wrapping the outputs.

I noticed this when trying to use the same mechanism in the RISC-V
backend.

Committing two tests:
- One that shows a crash before this change, which is fixed by this
change.
- One (commented out) that shows a different crash with tied
inputs/outputs. This is commented as it is not fixed by this change and
needs more work in target-independent inline asm handling code.
2024-11-14 12:31:31 +00:00
Jeremy Morse
b468ed494a Reapply ccddb6ffad1, "Emit a worst-case prologue_end"
In 39b2979a4 Pavel has kindly refined the implementation of a test in such
a way that it doesn't trip up over this patch -- the test wishes to
stimulate LLDBs presentation of line0 locations, rather than wanting to
always step on line-zero on entry to artificial_location.c. As that's what
was tripping up this change, reapply.

Original commit message follows.

[DWARF] Emit a worst-case prologue_end flag for pathological inputs (#107849)

prologue_end usually indicates where the end of the function-initialization
lies, and is where debuggers usually choose to put the initial breakpoint
for a function. Our current algorithm piggy-backs it on the first available
source-location: which doesn't necessarily have anything to do with the
start of the function.

To avoid this in heavily-optimised code that lacks many useful source
locations, pick a worst-case "if all else fails" prologue_end location, of
the first instruction that appears to do meaningful computation. It'll be
given the function-scope line number, which should run-on from the start of
the function anyway. This means if your code is completely inverted by the
optimiser, you can at least put a breakpoint at the _start_ like you
expect, even if it's difficult to then step through.

This patch also attempts to preserve some good behaviour we have without
optimisations -- at O0, if the prologue immediately falls into a loop body
without any computation happening, then prologue_end lands at the start of
that loop. This is desirable; but does mean we need to do more work to
detect and support those situations.
2024-11-14 10:30:17 +00:00
Ricardo Jesus
e52238b59f
[AArch64] Add @llvm.experimental.vector.match (#101974)
This patch introduces an experimental intrinsic for matching the
elements of one vector against the elements of another.

For AArch64 targets that support SVE2, the intrinsic lowers to a MATCH
instruction for supported fixed and scalar vector types.
2024-11-14 09:00:19 +00:00
Kyungwoo Lee
5a2888ddbd Revert "[CGData] Refactor Global Merge Functions (#115750)"
This reverts commit d3da78863c7021fa2447a168dc03ad791db69dc6.
2024-11-13 21:23:16 -08:00
Kyungwoo Lee
d3da78863c
[CGData] Refactor Global Merge Functions (#115750)
This is a follow-up PR to refactor the initial global merge function
pass implemented in #112671.

It first collects stable functions relevant to the current module and
iterates over those only, instead of iterating through all stable
functions in the stable function map.

This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-13 21:15:19 -08:00
alx32
f407dff50c
[DebugInfo][DWARF] Emit Per-Function Line Table Offsets and End Sequences (#110192)
**Summary**

This patch introduces a new compiler option `-mllvm
-emit-func-debug-line-table-offsets` that enables the emission of
per-function line table offsets and end sequences in DWARF debug
information. This enhancement allows tools and debuggers to accurately
attribute line number information to their corresponding functions, even
in scenarios where functions are merged or share the same address space
due to optimizations like Identical Code Folding (ICF) in the linker.

**Background**
RFC: [New DWARF Attribute for Symbolication of Merged
Functions](https://discourse.llvm.org/t/rfc-new-dwarf-attribute-for-symbolication-of-merged-functions/79434)

Previous similar PR:
[#93137](https://github.com/llvm/llvm-project/pull/93137) – This PR was
very similar to the current one but at the time, the assembler had no
support for emitting labels within the line table. That support was
added in PR [#99710](https://github.com/llvm/llvm-project/pull/99710) -
and in this PR we use some of the support added in the assembler PR.

In the current implementation, Clang generates line information in the
`debug_line` section without directly associating line entries with
their originating `DW_TAG_subprogram` DIEs. This can lead to issues when
post-compilation optimizations merge functions, resulting in overlapping
address ranges and ambiguous line information.

For example, when functions are merged by ICF in LLD, multiple functions
may end up sharing the same address range. Without explicit linkage
between functions and their line entries, tools cannot accurately
attribute line information to the correct function, adversely affecting
debugging and call stack resolution.


**Implementation Details**
To address the above issue, the patch makes the following key changes:

**`DW_AT_LLVM_stmt_sequence` Attribute**: Introduces a new LLVM-specific
attribute `DW_AT_LLVM_stmt_sequence` to each `DW_TAG_subprogram` DIE.
This attribute holds a label pointing to the offset in the line table
where the function's line entries begin.

**End-of-Sequence Markers**: Emits an explicit DW_LNE_end_sequence after
each function's line entries in the line table. This marks the end of
the line information for that function, ensuring that line entries are
correctly delimited.

**Assembler and Streamer Modifications**: Modifies the MCStreamer and
related classes to support emitting the necessary labels and tracking
the current function's line entries. A new flag
GenerateFuncLineTableOffsets is added to control this behavior.

**Compiler Option**: Introduces the `-mllvm
-emit-func-debug-line-table-offsets` option to enable this
functionality, allowing users to opt-in as needed.
2024-11-13 18:51:34 -08:00
Kyungwoo Lee
d23c5c2d65
[CGData] Global Merge Functions (#112671)
This implements a global function merging pass. Unlike traditional
function merging passes that use IR comparators, this pass employs a
structurally stable hash to identify similar functions while ignoring
certain constant operands. These ignored constants are tracked and
encoded into a stable function summary. When merging, instead of
explicitly folding similar functions and their call sites, we form a
merging instance by supplying different parameters via thunks. The
actual size reduction occurs when identically created merging instances
are folded by the linker.

Currently, this pass is wired to a pre-codegen pass, enabled by the
`-enable-global-merge-func` flag.
In a local merging mode, the analysis and merging steps occur
sequentially within a module:
- `analyze`: Collects stable function hashes and tracks locations of
ignored constant operands.
- `finalize`: Identifies merge candidates with matching hashes and
computes the set of parameters that point to different constants.
- `merge`: Uses the stable function map to optimistically create a
merged function.

We can enable a global merging mode similar to the global function
outliner
(https://discourse.llvm.org/t/rfc-enhanced-machine-outliner-part-2-thinlto-nolto/78753/),
which will perform the above steps separately.
- `-codegen-data-generate`: During the first round of code generation,
we analyze local merging instances and publish their summaries.
- Offline using `llvm-cgdata` or at link-time, we can finalize all these
merging summaries that are combined to determine parameters.
- `-codegen-data-use`: During the second round of code generation, we
optimistically create merging instances within each module, and finally,
the linker folds identically created merging instances.

Depends on #112664
This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-13 17:34:07 -08:00
Alex MacLean
7a8fe0f83c
[SelectionDAG] Fixup type usage of CondCodeAction table (#116082)
Ensure that all uses of CondCodeAction table are checking the compared
types, not the produced type. This is a prerequisite to landing #115035
2024-11-13 13:20:16 -08:00
Matt Arsenault
04d450fd8d
AtomicExpand: Preserve metadata when bitcasting fp atomicrmw xchg (#115240) 2024-11-13 12:51:18 -08:00
Augusto Noronha
67fb2686fb
[DebugInfo] Add a specification attribute to LLVM DebugInfo (#115362)
Add a specification attribute to LLVM DebugInfo, which is analogous
to DWARF's DW_AT_specification. According to the DWARF spec:
"A debugging information entry that represents a declaration that
completes another (earlier) non-defining declaration may have a
DW_AT_specification attribute whose value is a reference to the
debugging information entry representing the non-defining declaration."

This patch allows types to be specifications of other types. This is
used by Swift to represent generic types. For example, given this Swift
program:

```
struct MyStruct<T> {
    let t: T
}

let variable = MyStruct<Int>(t: 43)
```

The Swift compiler emits (roughly) an unsubtituted type for MyStruct<T>:
```
DW_TAG_structure_type
    DW_AT_name	("MyStruct")
    // "$s1w8MyStructVyxGD" is a Swift mangled name roughly equivalent to 
    // MyStruct<T>
    DW_AT_linkage_name	("$s1w8MyStructVyxGD")
    // other attributes here
```
And a specification for MyStruct<Int>:
```
DW_TAG_structure_type
    DW_AT_specification	(<link to "MyStruct">)
    // "$s1w8MyStructVySiGD" is a Swift mangled name equivalent to
    // MyStruct<Int>
    DW_AT_linkage_name	("$s1w8MyStructVySiGD")
    DW_AT_byte_size	(0x08)
    // other attributes here
```
2024-11-13 09:55:37 -08:00
Jay Foad
a33ae1b7df
[LiveRangeCalc] Fix isJointlyDominated (#116020)
Check that every path from the entry block to the use block passes
through at least one def block. Previously we only checked that at least
one path passed through a def block.
2024-11-13 13:36:48 +00:00
Kazu Hirata
735ab61ac8
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-11-12 23:15:06 -08:00
Thorsten Schütt
0e97b4d05a
[GlobalISel] Combine G_MERGE_VALUES of x and undef (#113616)
into anyext x

; CHECK-NEXT: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s32),
[[DEF]](s32)

Please continue padding merge values.

//   %bits_8_15:_(s8) = G_IMPLICIT_DEF
//   %0:_(s16) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8)

%bits_8_15 is defined by undef. Its value is undefined and we can pick
an arbitrary value. For optimization, we pick anyext, which plays well
with the undefinedness.

//   %0:_(s16) = G_ANYEXT %bits_0_7:(s8)

The upper bits of %0 are undefined and the lower bits come from
%bits_0_7.
2024-11-12 23:23:32 +01:00