InstCombine can transform ADD+GEP into GEP+GEP. But those rewrites does
not currently trigger when the ADD is a disjoint OR (which happens to be
the canonical form for certain ADD operations). Add lit tests to show
that we are lacking such rewrites.
Also add a test case showing that we do not preserve "inbounds nuw",
"nusw nuw" and "nuw" when doing such transforms and the ADD/OR is
known to be NUW.
Reapplies #134068.
The first patch was missing a check to prevent attempts to pair SVE
fill/spill with other Neon load/store instructions, which could happen
specifically if the Neon instruction was unscaled.
Previously, `cc1as` did not consider the Features that can be included
from a target's FPU. This could lead to a situation where assembly files
could not compile as cc1as did not know if a feature was supported.
With this change, all the features for the FPU will be passed to `cc1as`
as `-target-feature` lines. By making this change, it will enable
`+nosimd` to be functional, worked on in #130623, and fix a regression
introduced in 8fa0f0efce5fb81eb422e6d7eec74c66dafef4a3 so
armv7s-apple-darwin targets can utilise VFPv4 correctly.
---------
Co-authored-by: Martin Storsjö <martin@martin.st>
This patch addresses three problems when promoting allocas to vectors:
- Element types with size < 1 byte in allocas with a vector type caused
divisions by zero.
- Element types whose size doesn't match their AllocSize hit an assertion.
- Access types whose size doesn't match their AllocSize hit an assertion.
With this patch, we do not attempt to promote affected allocas to vectors. In
principle, we could handle these cases in PromoteAlloca, e.g., by truncating
and extending elements from/to their allocation size. It's however unclear if
we ever encounter such cases in practice, so that doesn't seem worth the added
complexity.
For SWDEV-511252
With https://github.com/llvm/llvm-project/pull/112852, we claimed that
llvm.minnum and llvm.maxnum should treat +0.0>-0.0, while libc doesn't
require fmin(3)/fmax(3) for it.
To make llvm.minnum/llvm.maxnum easy to use, we define the builtin
functions for them, include
__builtin_elementwise_minnum
__builtin_elementwise_maxnum
All of them support _Float16, __bf16, float, double, long double.
We add comment markers and print enum names instead of numbers.
For required extensions, we print the feature list instead of raw
bits.
This recommits d0cf5cd which was reverted by 21ff45d.
This reverts commit d0cf5cd5f9790dc21396936d076389c3be1a9599.
Error: "declaration of ‘clang::RISCV::RequiredExtensions
{anonymous}::SemaRecord::RequiredExtensions’ changes meaning of
‘RequiredExtensions’ [-fpermissive]"
Among fixupNeedsRelaxationAdvanced (introduced by
https://reviews.llvm.org/D8217) targets, only Hexagon needs the
`MCRelaxableFragment` parameter (commit
86f218e7ec5d941b7785eaebcb8f4cad76db8a64) to get the instruction packet
(MCInst with sub-instruction operands).
As fixupNeedsRelaxationAdvanced follows mayNeedRelaxation, we can store
the MCInst in mayNeedRelaxation and eliminate the MCRelaxableFragment
parameter.
Follow-up to 7c83b7ef1796210451b839f4c58f2815f4aedfe5 that eliminates
the MCRelaxableFragment parameter from fixupNeedsRelaxation.
This commit improves the `EnumProp` class, causing it to wrap around an
`EnumInfo` just like` EnumAttr` does. This EnumProp also has logic for
converting to/from an integer attribute and for being read and written
as bitcode.
The following variants of `EnumProp` are provided:
- `EnumPropWithAttrForm` - an EnumProp that can be constructed from (and
will be converted to, if `storeInCustomAttribute` is true) a custom
attribute, like an `EnumAttr`, instead of a plain integer. This is meant
for backwards compatibility with code that uses enum attributes.
`NamedEnumProp` adds a "`mnemonic` `<` $enum `>`" syntax around the
enum, replicating a common pattern seen in MLIR printers and allowing
for reduced ambiguity.
`NamedEnumPropWithAttrForm` combines both of these extensions.
(Sadly, bytecode auto-upgrade is hampered by the lack of the ability to
optionally parse an attribute.)
Depends on #132148
If a hot callsite function is not inlined in the 1st build, inlining the
hot callsite in pre-link stage of SPGO 2nd build may lead to Function
Sample not found in profile file in link stage. It will miss some
profile info.
ThinLTO has already considered and dealed with it by setting
HotCallSiteThreshold to 0 to stop the inline. This patch just adds the
same processing for FullLTO.
Follow-up to commits 5710759eb390c0d5274c2a4d43967282d7df1993
and 634f9a981571eae000c1adc311014c5c64486187
- Integrate `evaluateFixup` into `recordRelocation` and inline code
within `MCAssembler::layout`, removing `handleFixup`.
- Update `fixupNeedsRelaxation` to bypass `shouldForceRelocation` when
calling `evaluateFixup`, eliminating the `WasForced` workaround for
RISC-V linker relaxation (https://reviews.llvm.org/D46350 ).
This prepares for the upcoming change to simplify relocation recording
in MCAssembler.
While both MCAssembler::fixupNeedsRelaxation and
MCAssembler::handleFixup call evaluateFixup and use
shouldForceRelocation, the shouldForceRelocation logic is not supposed
to be needed by MCAssembler::fixupNeedsRelaxation.
The ARM special cases for interworking branches
(https://reviews.llvm.org/D33436 and https://reviews.llvm.org/D33898)
break the assumption. Switch to fixupNeedsRelaxationAdvanced and
explicitly test the conditions.
When a frame is inlined, LLDB will display its name in backtraces as
follows:
```
* thread #1, queue = 'com.apple.main-thread', stop reason = breakpoint 1.3
* frame #0: 0x0000000100000398 a.out`func() [inlined] baz(x=10) at inline.cpp:1:42
frame #1: 0x0000000100000398 a.out`func() [inlined] bar() at inline.cpp:2:37
frame #2: 0x0000000100000398 a.out`func() at inline.cpp:4:15
frame #3: 0x00000001000003c0 a.out`main at inline.cpp:7:5
frame #4: 0x000000026eb29ab8 dyld`start + 6812
```
The longer the names get the more confusing this gets because the first
function name that appears is the parent frame. My assumption (which may
need some more surveying) is that for the majority of cases we only care
about the actual frame name (not the parent). So this patch removes all
the special logic that prints the parent frame.
Another quirk of the current format is that the inlined frame name does
not abide by the `${function.name-XXX}` format variables. We always just
print the raw demangled name. With this patch, we would format the
inlined frame name according to the `frame-format` setting (see the
test-cases).
If we really want to have the `parentFrame [inlined] inlinedFrame`
format, we could expose it through a new `frame-format` variable (e..g.,
`${function.inlined-at-name}` and let the user decide where to place
things.
This parameter eliminates a redundant computation for VK_ABS8 in X86 and
reduces reliance on shouldForceRelocation in relaxation decisions.
Note: `local: jmp local@plt` relaxes JMP. This behavior depends on
fixupNeedsRelaxation calling shouldForceRelocation, which might change
in the future.
Both the `CPlusPlusLanguage` plugins and the Swift language plugin
already assume the `sc != nullptr`. And all `FormatEntity` callsites of
`GetFunctionDisplayName` already check for nullptr before passing `sc`.
This patch makes this pre-condition explicit by changing the parameter
to `const SymbolContext &`. This will help with some upcoming changes in
this area.
Removed the calls to `sizeOp` after replacing `SliceOp`:
```
// Remove const_shape size op when it no longer has use point.
Operation *sizeConstShape = sliceOp.getSize().getDefiningOp();
```
Turns out as part of canonicalization, trivially dead ops are removed
anyway, so the above piece of code isn't actually needed.
SVE Operations such as predicated loads become canonicalized to LLVM
masked loads, and doing the same for ptrue(all) to splat(1) creates
further optimization opportunities from generic LLVM IR passes.