This is a rebase of #95112 with my own feedback apply as @MitalAshok has
been inactive for a while.
It's fairly important this makes clang 20 as it is a blocker for #107451
---
[CWG2813](https://cplusplus.github.io/CWG/issues/2813.html)
prvalue.member_fn(expression-list) now will not materialize a temporary
for prvalue if member_fn is an explicit object member function, and
prvalue will bind directly to the object parameter.
The E1 in E1.static_member is now a discarded-value expression, so if E1
was a call to a [[nodiscard]] function, there will now be a warning.
This also affects C++98 with [[gnu::warn_unused_result]] functions.
This should not affect C where TemporaryMaterializationConversion is a
no-op.
Closes#100314Fixes#100341
---------
Co-authored-by: Mital Ashok <mital@mitalashok.co.uk>
Without this patch ScalableVector frame index property is used before
assignment. More precisely, let's take a look at
RISCVFrameLowering::assignCalleeSavedSpillSlots. In this function we
divide callee saved registers on scalar and vector ones, based on
ScalableVector property of their frame indexes:
```
...
const auto &UnmanagedCSI = getUnmanagedCSI(*MF, CSI);
const auto &RVVCSI = getRVVCalleeSavedInfo(*MF, CSI);
...
```
But we assign ScalableVector property several lines below:
```
...
auto storeRegToStackSlot = [&](decltype(UnmanagedCSI) CSInfo) {
for (auto &CS : CSInfo) {
// Insert the spill to the stack frame.
Register Reg = CS.getReg();
const TargetRegisterClass *RC = TRI->getMinimalPhysRegClass(Reg);
TII.storeRegToStackSlot(MBB, MI, Reg, !MBB.isLiveIn(Reg),
CS.getFrameIdx(), RC, TRI, Register());
}
};
storeRegToStackSlot(UnmanagedCSI);
...
```
Due to it, list of RVV callee saved registers will always be empty.
Currently this problem doesn't appear, but if you slightly change the
code and, for example, put some instructions between scalar and vector
spills, the resulting code will be ill formed.
Fix#120190.
The hlfir.forall lowering code was not properly checking for forall
index reference in mask value computation before trying to hoist it: it
was only looking at the ops directly nested in the hlfir.forall_mask
region, but not the operation indirectly nested. This caused triggered
bogus hoisting in #120190 leading to undefined behavior (reference to
uinitialized data). The added regression test would die at compile time
with a dominance error.
Fix this by doing a deep walk of the region operation instead. Also
clean-up the region cloning to use without_terminator.
PR #112138 introduced initial support for dispatching to
multiple exit blocks via split middle blocks. This patch
fixes a few issues so that we can enable more tests to use
the new enable-early-exit-vectorization flag. Fixes are:
1. The code to bail out for any loop live-out values happens
too late. This is because collectUsersInExitBlocks ignores
induction variables, which get dealt with in fixupIVUsers.
I've moved the check much earlier in processLoop by looking
for outside users of loop-defined values.
2. We shouldn't yet be interleaving when vectorising loops
with uncountable early exits, since we've not added support
for this yet.
3. Similarly, we also shouldn't be creating vector epilogues.
4. Similarly, we shouldn't enable tail-folding.
5. The existing implementation doesn't yet support loops
that require scalar epilogues, although I plan to add that
as part of PR #88385.
6. The new split middle blocks weren't being added to the
parent loop.
This PR fixes https://github.com/llvm/llvm-project/issues/120078 and
improves/simplifies parsing of demangled function name that aims to
detect a tip for floating point decorations. The latter improvement
fixes also a complaint from `LLVM_USE_SANITIZER=Address`.
LLVM Premerge Checks is running on the new GCP cluster. Tracking its
metrics will allow us to determine the stability of the presubmit and
make sure the new infra is working as intended.
---------
Signed-off-by: Nathan Gauër <brioche@google.com>
Remove the 6th bullet point "Strive to improve security over time, for
example by adding additional testing, fuzzing and hardening after fixing
issues."
At the security group meeting on 2024-11-19 we discussed the role the
security group was performing in practice. We are in effect acting as a
security response group, dealing with issues raised via the process
given in the LLVM Security group page. We are not proactively adding
additional testing fuzzing and hardening. While this could be considered
an aspirational goal, it may give the implication that the LLVM Security
Group is handling or at worst guaranteeing security for the LLVM project
when in practice it is not.
Meeting notes:
https://discourse.llvm.org/t/llvm-security-group-public-sync-ups/62735/32
Rename LLVM Security Group to LLVM Security Response Group. Take the
opportunity to canonicalise security group and Security Group to LLVM
Security Response Group.
At the 2024-11-19 LLVM Security Group meeting [1] we discussed that in
practice the LLVM Security Group was performing an incident response
role, but it was not proactively adding additional testing, fuzzing and
hardening. We do not want projects that use LLVM to see the LLVM
Security Group as guaranteeing security for LLVM.
We decided that it would be useful to rename the group to LLVM Security
Response Group as that reflects the work that it is doing.
There may be a case for a proactive security group with a different
remit, but this is out of scope of this commit.
[1]
https://discourse.llvm.org/t/llvm-security-group-public-sync-ups/62735/32
This re-applies #117867 with a small fix that hopefully prevents build
bot failures. The fix is avoiding `dyn_cast` for the result of
`getOperation()`. Instead we can assign the result to `mlir::ModuleOp`
directly since the type of the operation is known statically (`OpT` in
`OperationPass`).
insertelt DestVec, (fneg (extractelt SrcVec, Index)), Index
-> shuffle DestVec, (shuffle (fneg SrcVec), poison, SrcMask), Mask
Original combining left the combine between vectors of different lengths as a TODO.
The makes the priority of the Zfa patterns of the pseudos explicit.
Previously the priority only worked because instructions with
usesCustomInserter=1 have lower priority.
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.
It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.
Thus we should use `RegisterClassInfo::getRegPressureSetLimit` and
remove replicated code.
Separate from https://github.com/llvm/llvm-project/pull/118787
`RegisterClassInfo::getRegPressureSetLimit` is a wrapper of
`TargetRegisterInfo::getRegPressureSetLimit` with some logics to
adjust the limit by removing reserved registers.
It seems that we shouldn't use
`TargetRegisterInfo::getRegPressureSetLimit`
directly, just like the comment "This limit must be adjusted
dynamically for reserved registers" said.
Separate from https://github.com/llvm/llvm-project/pull/118787
Depends on #114525
Support `R_AARCH64_AUTH_GOT_ADR_PREL_LO21` and `R_AARCH64_AUTH_GOT_LD_PREL19`
GOT-generating relocations. A corresponding `RE_AARCH64_AUTH_GOT_PC` member
of `RelExpr` is added, which is an AUTH-specific variant of `R_GOT_PC`.
Split UnsupportedSchedZfhmin from UnsupportedSchedZfh.
UnsupportedSchedZfhmin inherits from UnsupportedSchedZfh and should be
used when no F16 is supported. UnsupportedSchedZfh can be used direclty
for CPUs that support Zfhmin but not Zfh.
Make UnsupportedSchedF inherit from both UnsupportedSchedD and
UnsupportedSchedZfhmin so that CPUs with no FP only need to include
UnsupportedSchedF. This required some minor refactorings to
RISCVSchedSyntacoreSCR345.td. I've also switched to inheritance instead of
using defm.
Use `MO.getOperandNo() == 0` instead of `IsMODef` so naming is clear for the store, since the store should treat its operand 0 like that even though it is not a def.The load should treat its operand 0 def in the same way.
This is a starting PR to implicitly map allocatable record fields.
This PR contains the following changes:
1. Re-purposes some of the utils used in `Lower/OpenMP.cpp` so that
these utils work on the `mlir::Value` level rather than the
`semantics::Symbol` level. This takes one step towards to enabling
MLIR passes to more easily do some lowering themselves (e.g. creating
`omp.map.bounds` ops for implicitely caputured data like this PR
does).
2. Adds support for implicitely capturing and mapping allocatable fields
in record types.
There is quite some distant to still cover to have full support for
this. I added a number of todos to guide further development.
Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>
Co-authored-by: Andrew Gozillon <andrew.gozillon@amd.com>
This commit makes the following changes:
1. Previously certain pipeline options could cause the options parser to
get stuck in an an infinite loop. An example is:
```
mlir-opt %s -verify-each=false
-pass-pipeline='builtin.module(func.func(test-options-super-pass{list={list=1,2},{list=3,4}}))''
```
In this example, the 'list' option of the `test-options-super-pass`
is itself a pass options specification (this capability was added in
https://github.com/llvm/llvm-project/issues/101118).
However, while the textual format allows `ListOption<int>` to be given
as `list=1,2,3`, it did not allow the same format for
`ListOption<T>` when T is a subclass of `PassOptions` without extra
enclosing `{....}`. Lack of enclosing `{...}` would cause the infinite
looping in the parser.
This change resolves the parser bug and also allows omitting the
outer `{...}` for `ListOption`-of-options.
2. Previously, if you specified a default list value for your
`ListOption`, e.g. `ListOption<int> opt{*this, "list",
llvm:🆑:list_init({1,2,3})}`,
it would be impossible to override that default value of `{1,2,3}` with
an *empty* list on the command line, since `my-pass{list=}` was not
allowed.
This was not allowed because of ambiguous handling of lists-of-strings
(no literal marker is currently required).
This change makes it explicit in the ListOption construction that we
would like to treat all ListOption as having a default value of "empty"
unless otherwise specified (e.g. using `llvm::list_init`).
It removes the requirement that lists are not printed if empty. Instead,
lists are not printed if they do not have their default value.
It is now clarified that the textual format
`my-pass{string-list=""}` or `my-pass{string-list={}}`
is interpreted as "empty list". This makes it imposssible to specify
that ListOption `string-list` should be a size-1 list containing the
empty string. However, `my-pass{string-list={"",""}}` *does* specify
a size-2 list containing the empty string. This behavior seems
preferable
to allow for overriding non-empty defaults as described above.
This patch works around:
compiler-rt/lib/tysan/../sanitizer_common/sanitizer_platform_limits_posix.h:604:3:
error: anonymous structs are a GNU extension
[-Werror,-Wgnu-anonymous-struct]
This diagnoses comparisons like `ptr + unsigned_index < ptr` and `ptr +
unsigned_index >= ptr`, which are always false/true because addition of
a pointer and an unsigned index cannot wrap (or the behavior is
undefined).
This warning is intended to help find broken bounds checks (which must
be implemented in terms of uintptr_t instead).
Fixes https://github.com/llvm/llvm-project/issues/120214.
When building SPEC CPU 2017 with RISC-V and EVL tail folding, this
assertion in VPTypeAnalysis would trigger during the transformation to
EVL recipes:
d8a0709b10/llvm/lib/Transforms/Vectorize/VPlanAnalysis.cpp (L135-L142)
It was caused by this recipe:
```
WIDEN ir<%shr> = vp.or ir<%add33>, ir<0>, vp<%6>
```
Having its type inferred as i16, when ir<%add33> and ir<0> had inferred
types of i32 somehow.
The cause of this turned out to be because the VPTypeAnalysis cache was
getting clobbered: In this transform we were erasing recipes but keeping
around the same mapping from VPValue* to Type*. In the meantime, new
recipes would be created which would have the same address as the old
value. They would then incorrectly get the old erased VPValue*'s cached
type:
```
--- before ---
0x600001ec5030: WIDEN ir<%mul21.neg> = vp.mul vp<%11>, ir<0>, vp<%6>
0x600001ec5450: <badref> <- some value that was erased
--- after ---
0x600001ec5030: WIDEN ir<%mul21.neg> = vp.mul vp<%11>, ir<0>, vp<%6>
0x600001ec5450: WIDEN ir<%shr> = vp.or ir<%add33>, ir<0>, vp<%6> <- a new value that happens to have the same address
```
This fixes this by deferring the erasing of recipes till after the
transformation.
The test case might be a bit flakey since it just happens to have the
right conditions to recreate this. I tried to add an assert in
inferScalarType that every VPValue in the cache was valid, but couldn't
find a way of telling if a VPValue had been erased.
---------
Co-authored-by: Florian Hahn <flo@fhahn.com>
This fixes a crash that shows up when building SPEC CPU 2017 with EVL
tail folding on RISC-V.
A VPWidenCastRecipe doesn't always have an underlying value, and in the
case of this crash this happens whenever a widened cast is created via
truncateToMinimalBitwidths.
Fix this by just using the opcode stored in the recipe itself.
I think a similar issue exists with VPWidenIntrinsicRecipe and how it's
widened, but I haven't run into any crashes with it just yet.
There are cases (like in an upcoming patch to MLIR's `Property` class)
where the ? value is a useful null value. However, existing predicates
make ti difficult to test if the value in a record one is operating is ?
or not.
This commit adds the !initialized predicate, which is 1 on concrete,
non-? values and 0 on ?.
---------
Co-authored-by: Akshat Oke <Akshat.Oke@amd.com>
When Sel(Cmp) are in different integer type,
From: (K and N mean width, K < N; a and b are src operands.)
bN = Ext(bK)
cond = Cmp(aN, bN)
aK = Trunc aN
retK = Sel(cond, aK, bK)
To:
bN = Ext(bK)
cond = Cmp(aN, bN)
retN = Sel(cond, aN, bN)
retK = Trunc retN
Though Sel's operands width becomes larger, the benefit
of making type width in Sel the same as Cmp, is for combing
to max/min intrinsics, and also better performance for SIMD
instructions.
References of correctness: https://alive2.llvm.org/ce/z/Y4Kegmhttps://alive2.llvm.org/ce/z/qFtjtR
Reference of generated code comparision:
https://gcc.godbolt.org/z/o97svGvYMhttps://gcc.godbolt.org/z/59Ynj91ov
This moves the -f(no-)?sanitize-recover parsing into a generic
parseSanitizerArgs function, and then applies it to parse
-f(no-)?sanitize-recover and -f(no-)?sanitize-trap.
N.B. parseSanitizeTrapArgs does *not* remove non-TrappingSupported
arguments. This maintains the legacy behavior of '-fsanitize=undefined
-fsanitize-trap=undefined' (clang/test/Driver/fsanitize.c), which is
that vptr is not enabled at all (not even in recover mode) in order to
avoid the need for a ubsan runtime.
Optionally unconditionally hint allocations as cold or not cold during
the matching step if the percentage of bytes allocated is at least that
of the given threshold.