This makes Zicond and XVentanaCondOps use the same code path.
The instructions have identical semantics.
Reviewed By: wangpc
Differential Revision: https://reviews.llvm.org/D155391
This patch adds an interface, named AggregatedOpInterface, that decomposes
complex operations into simpler ones.
For now, make the interface specific to Linalg because although the concept
is general, the way to materialize it needs some maturing.
Use that interface with the softmax operator.
Differential Revision: https://reviews.llvm.org/D154363
Before falling back to CreateCoercedStore, detect a scalable vector
return being coerced to fixed vector. Handle it using a vector.extract
intrinsic without going through memory.
Reviewed By: c-rhodes
Differential Revision: https://reviews.llvm.org/D155495
Resolves #63917.
Also lets the compiler check for available GPR before hitting the stack.
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D155502
The vmv1r.v v8, v9 in the last block can be removed by late
copy propagation.
Reviewed By: wangpc
Differential Revision: https://reviews.llvm.org/D155527
In C++20, if Clang fails to perform constructor overload resolution on a
RecordType, then Clang will try to perform parenthesized aggregate
initialization. If that fails and the initialization was attempted as
part of a cast, then we should get the diagnostics from the failed
constructor overload resolution attempt. However, we don't attempt
constructor overload resolution for arrays, so previously, if we tried to
diagnose an overloaded cast for a parenthesized aggregate initialization
of an array, we crashed. To fix this, we now exit
tryDiagnoseOverloadedCast(...) for failed parenthesized list
initialization if the destination type is an array.
Fixes #63758
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D155523
This patch optimizes a pair of LDRSWpre and LDRSWui (or LDURSWi)
instructions into a single LDPSWpre instruction. This is a missing case
in D99272.
MIR test cases in D152564 are updated to verify the optimization.
Differential Revision: https://reviews.llvm.org/D152407
This patch adds MIR test cases that test merging an LDRSWpre-LDR
instruction pair into an LDPSWpre instruction. This optimization is
currently missing and will be added in a subsequent patch (D152407), so
none of the test cases merge for now.
Differential Revision: https://reviews.llvm.org/D152564
The following pattern fails on recent GCC versions with -std=c++20 flag passed
and succeeds with -std=c++17. Such behavior is not observed on Clang 16.0.
```c++
template <typename T>
struct Foo {
Foo<T>(int a) {}
};
```
This patch removes the template parameter from the constructor in two
occurrences to make the following command complete successfully:
bazel build -c fastbuild --cxxopt=-std=c++20 --host_cxxopt=-std=c++20 @llvm-project//llvm/...
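For illustration, a minimal sketch of the accepted spelling: the constructor is declared with the class name alone, without a template argument list. `Foo` mirrors the snippet above; the `static_assert` is only an added check for this sketch and is not part of the patch.

```c++
#include <type_traits>

// Same shape as the failing pattern, but the constructor uses the
// injected class name without <T>. This form is valid under both
// -std=c++17 and -std=c++20.
template <typename T>
struct Foo {
  Foo(int a) { (void)a; }  // (void)a only silences unused-parameter warnings
};

// Added check for this sketch: Foo<int> can be constructed from an int.
static_assert(std::is_constructible<Foo<int>, int>::value,
              "Foo<int> should be constructible from int");
```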
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D154782
Summary:
The changes in D155173 caused build failures on some configurations due
to the use of `Range` which conflicts with other definitions. Explicitly
use `clangd::Range`.
Currently we keep an internal buffer of device memory that is used to
indicate ownership of a port. Since we only use this as a single bit we
can simply turn this into a bitfield. I did this manually rather than
having a separate type as we need very special handling of the masks
used to interact with the locks.
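As a rough sketch of the idea, assuming one lock bit per port packed into 32-bit words: the helpers below compute which word and which bit belong to a given port. All names here (`lock_word`, `lock_mask`, `lock_words`, `is_locked`) are hypothetical and are not taken from the patch.

```c++
#include <cstdint>

// Hypothetical helpers for a manually managed bitfield of port locks.
constexpr std::uint32_t kBitsPerWord = 32;

// Index of the 32-bit word that holds the lock bit for `port`.
constexpr std::uint32_t lock_word(std::uint32_t port) {
  return port / kBitsPerWord;
}

// Mask selecting the lock bit for `port` inside its word.
constexpr std::uint32_t lock_mask(std::uint32_t port) {
  return std::uint32_t{1} << (port % kBitsPerWord);
}

// Number of 32-bit words needed to store one bit per port.
constexpr std::uint32_t lock_words(std::uint32_t num_ports) {
  return (num_ports + kBitsPerWord - 1) / kBitsPerWord;
}

// True if the bit for `port` is set in `word`, the word returned by
// lock_word(port).
constexpr bool is_locked(std::uint32_t word, std::uint32_t port) {
  return (word & lock_mask(port)) != 0;
}
```

A real lock would presumably combine these masks with atomic read-modify-write operations on the device buffer; the sketch leaves that part out and covers only the index and mask arithmetic.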
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D155511
We currently don't extract vector elements from multi-use build vectors unless TLI.aggressivelyPreferBuildVectorSources accepts them, which seems a little extreme for constant build vectors (especially as in some cases ComputeKnownBits will indirectly extract the data for us).
This is causing a few regressions in some upcoming SimplifyDemandedBits work I'm looking at, all of which just need to know that the element is zero, so I've tweaked the fold to accept zero elements as well, which will typically fold very easily.
Differential Revision: https://reviews.llvm.org/D155582
This functionality has been replaced by TypeCasters (see D151840).
Depends on D154468
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D154469
This patch sets 'polymorphic' attribute of hlfir::ExprType when
the value is created from a polymorphic entity.
Memoization of such ExprType involves creating a mutable descriptor
on the stack, which is initialized (as a null box) and passed to
AllocatableApplyMold with the mold being the entity from which
the ExprType value is being created.
This patch fixes "creating polymorphic temporary" TODO and also
several cases of "'fir.convert' op invalid type conversion" error.
Reviewed By: tblah
Differential Revision: https://reviews.llvm.org/D155541
A previous patch made this cause an error on the GPU. We have not yet
dedicated time towards an optimal implementation there but we do not
want it to cause an error. We simply use the fallback routines.
Differential Revision: https://reviews.llvm.org/D155615
The forcing of the linker for a new module was moved from file
clang-tidy/tools/ClangTidyModule.cpp to
clang-tidy/ClangTidyForceLinker.h.
Reviewed By: PiotrZSL
Differential Revision: https://reviews.llvm.org/D76477
The operator++, operator++(int), operator--, and operator--(int) need to
change the month to a valid value. The wording is specified in terms of
operator+(const month& x, const months& y) noexcept;
which has the correct behavior. The aforementioned operators instead
used ++/-- on the internal value directly, resulting in incorrect
behaviour.
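The difference can be sketched with a stand-in type. `Month`, `plus_months`, `next_month`, and `prev_month` below are hypothetical names for illustration, not the libc++ implementation; the arithmetic reduces modulo 12, which is the wrap-around behaviour the wording relies on.

```c++
// Hypothetical stand-in for std::chrono::month; valid values are 1..12.
class Month {
  unsigned m_;

public:
  constexpr explicit Month(unsigned m) : m_(m) {}
  constexpr unsigned value() const { return m_; }
  constexpr bool ok() const { return m_ >= 1 && m_ <= 12; }
};

// Month plus n months with wrap-around: shift to 0..11, add, reduce
// modulo 12 (kept non-negative), then shift back to 1..12.
constexpr Month plus_months(Month x, long n) {
  return Month(static_cast<unsigned>(
                   (((static_cast<long>(x.value()) - 1 + n) % 12) + 12) % 12) +
               1);
}

// Increment and decrement expressed through the wrapping addition, rather
// than through ++/-- on the raw value (which would turn 12 into 13).
constexpr Month next_month(Month x) { return plus_months(x, 1); }
constexpr Month prev_month(Month x) { return plus_months(x, -1); }
```

With this formulation, stepping forward from December yields January and stepping back from January yields December, so the result is always a valid month.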
As a drive-by, improve the unit tests:
- use the typical constexpr test method
- test whether the month is valid after the operations
- format the tests
Fixes: https://llvm.org/PR63912
Reviewed By: #libc, ldionne
Differential Revision: https://reviews.llvm.org/D155504
- No longer store the diagnostic fixits in the clangdLSPServer
- When propagating the fixit via the code action, we use the Diag
information stored in the ParsedAST (in clangdServer.cpp)
Differential Revision: https://reviews.llvm.org/D155173
This reverts commit b1d0bc0f4395c69097bc11b6ba8f821f621272a9.
Builds with expensive checks show that 'sp' isn't a valid register
in ADDXrr - an object file built without expensive checks enabled
disassembles as "add x15, xzr, x16", instead of the intended
"add x15, sp, x16".
A declare directive is used to specify the creation of a visible device
copy of a variable for the duration of the implicit data region as it
relates to the scope in which the variable is declared.
In order to support this, the following new operations were added:
1) `acc.global_ctor` and `acc.global_dtor`. These are used whenever the
declare directive applies to a global.
2) `acc.declare_enter` and `acc.declare_exit`. These operations are
modeled similarly to `acc.enter_data` and `acc.exit_data`. The reason
they are not modeled like `acc.data` is so that these operations can be
used both for globals and regions like functions.
3) `acc.declare_device_resident` and `acc.declare_link`. These
operations are modeled in a manner consistent with the previously defined
data entry operation model.
The `acc.getdeviceptr` operation was generalized so that it can be used
with `acc.declare_exit`.
Reviewed By: clementval, vzakhari
Differential Revision: https://reviews.llvm.org/D155322
Currently when inlining, any alias scope information previously attached to the call op is lost. This leads to a loss of information that could be used by alias analysis to determine that two memory access operations do not alias.
This patch fixes this issue by also taking any alias scopes of the call operation into account. These can then simply be appended onto any inlined operations.
This is analogous to the following code in LLVM: 1768c4597e/llvm/lib/Transforms/Utils/InlineFunction.cpp (L940)
Differential Revision: https://reviews.llvm.org/D155595
This patch implements domain and range restriction for PresburgerRelation.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D154798
Instead of asserting merely that the flow condition doesn't imply that a variable is true, make the stronger assertion that the flow condition implies that the variable is false.
Reviewed By: ymandel, xazax.hun
Differential Revision: https://reviews.llvm.org/D155067
Need to account for the reshuffling required for the reused elements in
the buildvector nodes, which are copies (perfect match) of other nodes but
include reused elements.
Differential Revision: https://reviews.llvm.org/D149966
In most places where TransferImpOps is currently used we just have one
machine instruction, so it's doing the same thing as copyImplicitOps
anyway. In those cases where we have more than one machine
instruction the destination is written to in each instruction so any
implicit defs should appear on all of them (and we shouldn't see any
implicit refs as these pseudo-instructions don't have any register
inputs), meaning the current use of TransferImpOps is incorrect and
we should be using copyImplicitOps on all of the generated
instructions.
Differential Revision: https://reviews.llvm.org/D155301
Replace lengthy `0b...` binary form with a unified 32-bit hexadecimal
representation for opcode. This reduces complexity when dealing with
opcode discontinuities.
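As a small illustration of the two spellings, the sketch below writes one made-up 32-bit opcode value in both forms and checks that they agree. The constant names and the value itself are hypothetical and do not come from the patch.

```c++
#include <cstdint>

// Hypothetical 32-bit opcode, first in the lengthy binary form...
constexpr std::uint32_t kOpcodeBin =
    0b0000'0000'0001'0000'1000'0000'0000'0000;

// ...and then in the 32-bit hexadecimal form.
constexpr std::uint32_t kOpcodeHex = 0x00108000;

// Both spellings denote the same value; the hex form is simply shorter.
static_assert(kOpcodeBin == kOpcodeHex, "binary and hex forms must match");
```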
Unfortunately we can't use the standard splat_vector and vnot PatFrags because
they are preprocessed to vmv.v.x's, so we need to define helpers to catch
those. We can't use SplatPat either because we need to nest another fragment
inside of it.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D155433
This transform looks for suitable vector transfers from global memory to shared memory and converts them to async device copies.
Differential Revision: https://reviews.llvm.org/D155569
This is a follow-up to D154800 and D154770 to make the code structure more principled and avoid too many nested #ifdef/#endif.
Reviewed By: courbet
Differential Revision: https://reviews.llvm.org/D155515