The PR removes the unnecessary division and modulo operations in the
one-word specialization `__bitset<1, _Size>`. The reason is that for the
one-word specialization, we have `__pos < __bits_per_word` (as
`__bitset<1, _Size>` is an implementation detail only used by the public
`bitset`). So `__pos / __bits_per_word == 0` and
`__pos % __bits_per_word == __pos`.
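A quick runnable sanity check of those identities (illustrative only,
not the libc++ source):
```
#include <cassert>
#include <cstddef>

int main() {
  // Stand-in for __bits_per_word in the one-word case.
  constexpr std::size_t bits_per_word = 64;
  // For every valid position in a single word, the division and modulo
  // are no-ops, so the specialization can drop them.
  for (std::size_t pos = 0; pos < bits_per_word; ++pos) {
    assert(pos / bits_per_word == 0);
    assert(pos % bits_per_word == pos);
  }
}
```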
Follow the naming of most other backends.
> "Relocation modifier" suggests adjustments happen during the linker's relocation step rather than the assembler's expression evaluation.
> "Relocation specifier" is clear, aligns with Arm and IBM AIX's documentation, and fits the assembler's role seamlessly.
In addition, rename `AArch64MCExpr::Kind` and `getKind`, which confusingly shadow the base class `Kind` and `getKind`.
`DarwinRefKind`, which still uses `MCSymbolRefExpr::VariantKind`, is not renamed.
Pull Request: https://github.com/llvm/llvm-project/pull/132595
This just moves the x86 implementation into generic code since it appears
to be suitable for any target. The heart of this transform is inside
foldMemoryOperand, so other targets won't actually kick in until they
implement said API. This just removes one piece that a target needs to
implement in the process of enabling foldMemoryOperand.
Reverts llvm/llvm-project#132587
This caused test failures on several of Linaro's buildbots: several
MLIR test failures and at least one test timing out.
I doubt it's the patch itself, but instead an issue it has uncovered.
Revert while we dig into that.
This PR fixes #91835.
A `DeclRefExpr` in a lambda's function body refers to the original
variable declaration in the AST rather than to the lambda class's
`FieldDecl`, so we need to find the corresponding `FieldDecl` and bind
the `DeclRefExpr`'s value to it.
This is already implemented for variables that are not part of a
structured binding, so I extracted that part of the code so that it can
be used in the structured binding case.
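A hypothetical reduction of the affected pattern (the names are
illustrative, not taken from the issue):
```
// Inside the lambda body, the DeclRefExpr for `a` refers to the original
// BindingDecl rather than to the lambda class's FieldDecl for the
// capture, so the analyzer has to map one to the other.
struct Pair { int a, b; };

int test() {
  auto [a, b] = Pair{1, 2};
  auto fn = [a, b] { return a + b; }; // captures structured bindings
  return fn();
}
```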
So that LoopStrengthReduce can work for MIPS.
The code is copied from RISC-V.
---------
Co-authored-by: qethu <190734095+qethu@users.noreply.github.com>
Add double overloads which cast the double to a float and call the float
builtin.
Makes these double overloads conditional on hlsl version 202x or
earlier.
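A minimal sketch of the pattern (identifiers hypothetical, not the
actual hlsl_intrinsics.h declarations):
```
// Existing float builtin (assumed for illustration).
float some_builtin(float v);

// Added double overload: narrow to float, then forward.
inline float some_builtin(double v) {
  return some_builtin(static_cast<float>(v));
}
```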
Add tests
Closes #128228
We can use *Set::insert_range to collapse:
for (auto Elem : Range)
  Set.insert(Elem);
down to:
Set.insert_range(Range);
In some cases, we can further fold that into the set declaration.
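A minimal sketch, using llvm::DenseSet for illustration (any of the *Set
containers that gained insert_range would do):
```
#include "llvm/ADT/DenseSet.h"
#include "llvm/ADT/SmallVector.h"

void example(const llvm::SmallVector<int> &Range) {
  llvm::DenseSet<int> Set;
  // One call replaces the element-by-element loop.
  Set.insert_range(Range);
}
```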
Let's use the `.space` directive to simplify AVR's branch relaxation
tests, as noticed by @MaskRay at
https://github.com/llvm/llvm-project/pull/118015#issuecomment-2728254164.
Having `.space 4100` instead of `.space 4096` is a little bit awkward, but
I'd rather avoid changing two things at once (i.e. current assertions
are fine).
Summary:
We conditionally allocate the implicit arguments, so they may be
null. The flang compiler seems to hit this case, even though it
shouldn't when it's supposed to conform to the HSA code object. For now,
guard this to fix the regression and cover a case in the future where
someone rolls a fully custom implementation.
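A hedged sketch of the shape of the guard (identifiers hypothetical, not
the actual offload plugin code):
```
#include <cstddef>
#include <cstring>

// The implicit-argument block is conditionally allocated, so it may
// legitimately be null; skip the setup instead of dereferencing it.
void setImplicitArgs(void *ImplicitArgsPtr, std::size_t Size) {
  if (!ImplicitArgsPtr)
    return;
  std::memset(ImplicitArgsPtr, 0, Size);
}
```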
Fixes: https://github.com/llvm/llvm-project/issues/132982
Using `blockaddress` should be more reliable than determining if an
operand comes from a jump table index.
Alternative: add the `MachineInstr::MIFlag::ComputedGoto` flag when
lowering `indirectbr`. But I don't think that approach is suitable for
backporting.
This is the second of two PRs that attempt to improve the IR
generated in the exit blocks of vectorised loops with uncountable
early exits. It follows on from PR #128880. In this PR I am
improving the generated code for users of induction variables in
early exit blocks.
This required using a newly added VPInstruction called
FirstActiveLane, which calculates the index of the first active
predicate in the mask operand.
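Conceptually (a scalar model for illustration, not the VPlan
implementation):
```
#include <cstddef>
#include <vector>

// Index of the first true lane in a mask, or the mask size if none is
// set.
std::size_t firstActiveLane(const std::vector<bool> &Mask) {
  for (std::size_t I = 0, E = Mask.size(); I != E; ++I)
    if (Mask[I])
      return I;
  return Mask.size();
}
```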
I have added a new function optimizeEarlyExitInductionUser that
is called from optimizeInductionExitUsers when handling users in
early exit blocks.
Clang is required to compile Flang. Instead of erroring out if Clang is
not enabled, for convenience implicitly add it to `LLVM_ENABLE_PROJECTS`,
consistent with how the MLIR dependency is handled.
This is motivated by the discussion on whether flang-rt should be enabled implicitly
(https://discourse.llvm.org/t/buildbot-changes-with-llvm-enable-runtimes-flang-rt/83571/2).
Since the answer was yes, clang would have been the only exception not
enabled implicitly. Fixed with this commit.
When revisiting a variable, we do that by simply calling visitDecl() for
it, which means it will end up with the same EvalID as the rest of the
evaluation - but this way we end up allowing reads from mutable
variables. Disallow that.
Fold EXTRACT_SUBVECTOR(INSERT_SUBVECTOR(SRC,SUB,C1),C2) ->
INSERT_SUBVECTOR(EXTRACT_SUBVECTOR(SRC,C2),SUB,C1-C2)
This extends the existing fold, which required the extract/insert
subvector indices to match; now it will always extract as long as the
original inserted subvector is entirely contained within the extraction.
For example, extracting 8 elements at index C2=2 after inserting a
4-element SUB at index C1=4 becomes an insert of SUB at index C1-C2=2
into the narrower extract, since the inserted elements [4,8) lie inside
the extracted range [2,10).
Helps avoid unnecessary use of 512-bit vectors, and improves the chance
of concatenation folds.
Due to `amdgcnspirv` piggybacking on the HIPAMDToolchain, it loses its
immediately apparent SPIR-Vness, which makes the Driver want to go all
the way to Assembly emission. This was problematic as we were trying to
`llvm-link` SPIR-V, before trying to translate it. This patch ensures
that we do the right thing and stop at bitcode emission if we are mixing
`amdgcnspirv` and concrete targets.
Relands #129502.
Previously when the framework encountered unsupported values (such as
enum classes), they were always treated as equal when comparing with
`==`, regardless of their actual values being different.
Now the two sides are only equal if there's a Value assigned to them.
Added handling for the special case of `nullptr == nullptr`, to preserve
the behavior of untyped `nullptr` having no value.
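A hypothetical illustration of the old unsoundness (not a test from the
patch):
```
// The framework models no Value for enum-class operands, so this
// comparison used to be treated as equal (true) during analysis even
// when A and B differ at runtime.
enum class Color { Red, Green };

bool same(Color A, Color B) { return A == B; }
```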
The link should refer to the section of the 'phi' Instruction in the
LangRef, but it referred to the subsection of the 'fcmp' Instruction.
Replace it with the appropriate one.
in ASTWriter
It is bad to iterate over Dense{Map,Set} in ASTWriter, since the
iteration order of Dense{Map,Set} is not stable: it may cause the
produced BMI to differ even if we run the compiler twice without
modifying anything else.
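A common remedy, sketched here with illustrative types rather than the
actual ASTWriter code, is to pull out the keys and sort them before
serializing:
```
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/SmallVector.h"

void emitStably(const llvm::DenseMap<unsigned, unsigned> &Map) {
  llvm::SmallVector<unsigned> Keys;
  for (const auto &Entry : Map)
    Keys.push_back(Entry.first);
  llvm::sort(Keys); // deterministic order across runs
  for (unsigned K : Keys)
    (void)Map.lookup(K); // serialize each entry in stable key order
}
```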
After #130665 these operations are scalarized to avoid
double-rounding. This updates the cost model to match.
In the future we might be able to use SVE instructions to help, but for
the moment the costs should be higher. CodeSize and Latency costs are
not yet expected to be accurate. The vector insert/extract will use the
cost of VectorInsertExtractBaseCost (2 by default).
…723)"
This reverts commit 1e2ad6793ac205607e7c809283cf69e1cc36a69a.
Fix the previous commit on big-endian hosts by _not_ falling through to
the `uint8_t` code path.
The condition of the assert is always true, so just remove it.
OptForMinSize means hasMinSize(), which is
hasFnAttribute(Attribute::MinSize).
hasOptSize() is hasFnAttribute(Attribute::OptimizeForSize) ||
hasMinSize().
So '!hasMinSize() || hasFnAttribute(Attribute::OptimizeForSize) ||
hasMinSize()' is always true, since it contains both hasMinSize() and
its negation.
---------------------------------
llvm/include/llvm/IR/Function.h
```
/// Optimize this function for minimum size (-Oz).
bool hasMinSize() const { return hasFnAttribute(Attribute::MinSize); }

/// Optimize this function for size (-Os) or minimum size (-Oz).
bool hasOptSize() const {
  return hasFnAttribute(Attribute::OptimizeForSize) || hasMinSize();
}
```