As added in #124274, CPUs in this range can suffer from performance
issues with ldapur. As the gain from ldar->ldapr is expected to be
greater than the minor gain from ldapr->ldapur, this opts to avoid the
instruction under the default -mcpu=generic when the -march is less than
armv8.8 / armv9.3.
I renamed AArch64Subtarget::Others to AArch64Subtarget::Generic to make
its meaning clearer.
(cherry picked from commit 6424abcd6c9c6aa8171c79d0fe0369d3a10da3d5)
The features FEAT_FAMINMAX, FEAT_LUT and FEAT_FP8 depend on
FEAT_NEON.
Update the dependencies of FEAT_FP8DOT4 and FEAT_FP8DOT2, which now
depend indirectly on FEAT_NEON through FEAT_FP8.
LLVM has started to emit AArch64 build attributes sections called
.ARM.attributes. LLD does not yet have support for these so they are
accumulating in the ELF output. As the first part of that support,
discard all the .ARM.attributes sections. This can be built upon by the
full implementation in LLD.
The build attributes specification only defines build attributes for
relocatable objects. The intention for LLD is that files of type ET_EXEC
and ET_DYN will not have build attributes in the output. A
relocatable link with -r will need merged build attributes, but until
the merge is implemented it is better to discard them.
(cherry picked from commit ba476d0b83dc8a4bbf066dc02a0f73ded27114f0)
Summary:
Pretty dumb mistake of me, forgot that this is run per-device and
per-plugin, which fell through the cracks with my testing because I have
two GPUs that use different plugins.
(cherry picked from commit 7a8779422dad058f11cd473d409f42e32859788d)
We can emit diagnostics while parsing warning-suppression-mapping, so
make sure command line flags take effect when emitting those.
(cherry picked from commit ecb016a87d89aed36b8f5d8102e15d8eb0e57108)
Per the discussions in #125324, most participants are opposed to this
optimization. So remove the combination to address the concerns.
Fixes #125324
(cherry picked from commit 8c222c122f1a8edb1be96e482511ad547f7db7b3)
Prior workflow runs were not being cancelled when the pull request was
closed, and I think this was why. Also, there is no advantage to having
the definitions at the job level.
(cherry picked from commit 6e5988863177e1d53e7a7abb7a3db2b85376f0f5)
This is an external tool, so I don't think there is an expectation that
it has to be in the LLVM tools bindir. It may also be in the default
system bindir (which is not necessarily the same).
(cherry picked from commit 26ecddb05d13c101ccd840a6710eb5f8b82de841)
Fixes an issue caused by 1930524bbde3cd26ff527bbdb5e1f937f484edd6:
unused variable UsesMask in LoopVectorize.cpp
(cherry picked from commit 3872e55758a5de035c032a975f244302c3ddacc3)
The legacy and vplan cost models did not agree because
VPWidenCallRecipe::computeCost only calculates the cost of the
call instruction, whereas
LoopVectorizationCostModel::setVectorizedCallDecision in some
cases adds on the cost of a synthesised mask argument. However,
this mask is always 'splat(i1 true)' which should be hoisted out
of the loop during codegen. In order to synchronise the two cost
models I have two options:
1) Also add the cost of the splat to the vplan model, or
2) Remove the cost of the splat from the legacy model.
I chose 2) because I feel this more closely represents what the
final code will look like. There is an argument that we should
take account of such broadcast costs in the preheader when
deciding if it's profitable to vectorise a loop, however there
isn't currently a mechanism to do this. We currently only take
account of the runtime checks when assessing profitability and
what the minimum trip count should be. However, I don't believe
this work needs doing as part of this PR.
(cherry picked from commit 1930524bbde3cd26ff527bbdb5e1f937f484edd6)
Also link with libexecinfo on FreeBSD, NetBSD, OpenBSD and DragonFly
for the backtrace functions.
(cherry picked from commit d1de75acea0da55316cd7827563e064105868f0f)
This gets rid of some extra IO from driver startup, and the possibility
of emitting warnings twice.
(cherry picked from commit df22bbe2beb57687c76402bc0cfdf7901a31cf29)
The ReferenceLocs are not enabled by default (they are used by the
tablegen lsp server), and as such always empty, but still allocate
inline storage for the SmallVector. Disabling it saves about 200MB on
RISCVGenGlobalISel.inc.
(The equivalent field in Record already disables inline storage.)
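As a rough illustration of why an always-empty vector still costs memory,
here is a simplified model of a vector with inline storage. This is not
llvm::SmallVector; `InlineVecModel`, `LocModel` and the inline count of 4
are made up for the sketch.

```cpp
#include <cstddef>

// Simplified model of a vector with N inline elements (NOT llvm::SmallVector).
template <typename T, std::size_t N> struct InlineVecModel {
  T *Begin = nullptr;
  unsigned Size = 0;
  unsigned Capacity = N;
  // Reserved in every object, even when the vector stays empty.
  alignas(T) unsigned char Inline[N * sizeof(T)];
};

// With zero inline elements only the header remains.
template <typename T> struct InlineVecModel<T, 0> {
  T *Begin = nullptr;
  unsigned Size = 0;
  unsigned Capacity = 0;
};

// Hypothetical stand-in for one source-location entry.
struct LocModel {
  const char *Start;
  const char *End;
};

// Bytes saved per object when the inline storage is disabled.
constexpr std::size_t bytesSavedPerObject() {
  return sizeof(InlineVecModel<LocModel, 4>) -
         sizeof(InlineVecModel<LocModel, 0>);
}
```

The saving is per object, so it adds up quickly when a generated file
creates a very large number of such objects.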
(cherry picked from commit c640f97ccf723e64ff24af225cb995c905538406)
MatchTableRecord stores a 64-bit RawValue. However, this field is only
needed by a small part of the code (jump table generation).
Create a separate RecordAndValue structure that is used in just the
necessary places.
Based on massif, this reduces memory usage on RISCVGenGlobalISel.inc by
about 100MB (to 2.15GB).
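A minimal sketch of the idea, with hypothetical field names (the real
MatchTableRecord has different members): the 64-bit value moves out of
the common record into a wrapper that only the code needing it uses.

```cpp
#include <cstdint>

// Hypothetical layout before the change: every record carries the value.
struct RecordBefore {
  unsigned LabelID;
  unsigned Flags;
  std::uint64_t RawValue;
};

// Hypothetical layout after the change: the common record is smaller.
struct RecordAfter {
  unsigned LabelID;
  unsigned Flags;
};

// Wrapper used only where the raw value is required.
struct RecordAndValueModel {
  RecordAfter Record;
  std::uint64_t RawValue;
};

static_assert(sizeof(RecordAfter) < sizeof(RecordBefore),
              "the common record shrinks once the value is split out");
```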
(cherry picked from commit e2301d674976b84ba505065a9702f3376e05bc43)
The previously tested TZDB did not contain %z for the rule letters. The
usage of %z in TZDB 2024b revealed a bug in the implementation. The
patch fixes it and has been locally tested with TZDB 2024b.
Fixes #108957
(cherry picked from commit a27f3b2bb137001735949549354aff89dbf227f4)
Add a CMake flag (LLVM_BUILD_TELEMETRY) to disable building the
telemetry framework. The flag being enabled does *not* mean that
telemetry is being collected, it merely means we're building the generic
telemetry framework. Hence the flag is enabled by default.
Motivated by this Discourse thread:
https://discourse.llvm.org/t/how-to-disable-building-llvm-clang-telemetry/84305
(cherry picked from commit bac62ee5b473e70981a6bd9759ec316315fca07d)
Consider the following pattern:
```
%cmp = fcmp <pred> double %x, 0.000000e+00
%negX = fneg <fmf> double %x
%sel = select i1 %cmp, double %x, double %negX
```
We cannot propagate ninf from fneg to select since `%negX` may not be
chosen. Similarly, we cannot propagate nnan unless `%negX` is guaranteed
to be selected when `%x` is NaN.
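To illustrate the ninf case, here is a toy C++ model (not LLVM code) that
tracks poison as a boolean tag and fixes `<pred>` to `ogt`. The names
`Val`, `fnegNinf`, `selectVal` and `applyNinf` are made up for this
sketch.

```cpp
#include <limits>

// Toy model of a value: either poison or a concrete double.
struct Val {
  bool Poison;
  double Num;
};

constexpr bool isInf(double D) {
  return D == std::numeric_limits<double>::infinity() ||
         D == -std::numeric_limits<double>::infinity();
}

// fneg with ninf: poison when the operand is +/-inf.
constexpr Val fnegNinf(Val X) {
  return (X.Poison || isInf(X.Num)) ? Val{true, 0.0} : Val{false, -X.Num};
}

// select: only the chosen operand contributes to the result.
constexpr Val selectVal(bool Cond, Val T, Val F) { return Cond ? T : F; }

// Effect of a ninf flag on a result: an infinite result becomes poison.
constexpr Val applyNinf(Val V) {
  return (V.Poison || isInf(V.Num)) ? Val{true, 0.0} : V;
}

// The original pattern: ninf only on the fneg.
constexpr Val original(double X) {
  return selectVal(X > 0.0, Val{false, X}, fnegNinf(Val{false, X}));
}

// The same pattern after wrongly copying ninf onto the select.
constexpr Val withNinfOnSelect(double X) { return applyNinf(original(X)); }
```

For `%x = +inf` the original pattern selects `%x` and yields a
well-defined infinity, while the variant with ninf on the select yields
poison, so copying the flag would make the result more poisonous.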
This patch also propagates nnan/ninf from fcmp to avoid regression in
`PhaseOrdering/generate-fabs.ll`.
Alive2: https://alive2.llvm.org/ce/z/t6U-tA
Closes https://github.com/llvm/llvm-project/issues/121430 and
https://github.com/llvm/llvm-project/issues/113989.
(cherry picked from commit 3ec6a6b85aed838b7d56bd6843cad52e822b9111)
Per the feedback we got, we’d like to switch m[no-]avx10.2 to be an alias
of the 512 bit options and disable m[no-]avx10.1, because they were
aliases of the 256 bit options.
We also change -mno-avx10.[1,2]-512 to be an alias of the 256 bit options,
so that it disables both 256 and 512 bit instructions.
Cherry-pick from
9ebfee9d68
The APInt constructor asserts if bits are set past the size of the APInt
unless it is signed. This currently fails on RV32 because more than XLen
bits are set.
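A standalone sketch of the kind of range check involved, using
hypothetical helpers `fitsUnsigned` and `fitsSigned` rather than the
actual APInt code. It assumes the offending value is a negative number
held sign-extended in 64 bits, which the text above does not state.

```cpp
#include <cstdint>

// True if no bit at or above position Bits is set.
constexpr bool fitsUnsigned(std::uint64_t Value, unsigned Bits) {
  return Bits >= 64 || (Value >> Bits) == 0;
}

// True if Value is the sign extension of its low Bits bits, i.e. the bits
// from position Bits-1 upwards are either all zero or all one.
constexpr bool fitsSigned(std::uint64_t Value, unsigned Bits) {
  return Bits >= 64 ||
         (Bits != 0 &&
          ((Value >> (Bits - 1)) == 0 ||
           (Value >> (Bits - 1)) == (~std::uint64_t(0) >> (Bits - 1))));
}
```

For example, -16 stored as 0xFFFFFFFFFFFFFFF0 fails the unsigned check
for a 32-bit width but passes the signed one.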
(cherry picked from commit 0d7ee520d3a9b8997adf8eaaa22b33db9659d94e)
This seems consistent with the documentation, which claims it:
```
/// Looks through the Decl's underlying type to extract a FunctionType
/// when possible. Will return null if the type underlying the Decl does not
/// have a FunctionType.
const FunctionType *getFunctionType(bool BlocksToo = true) const;
```
Note: This patch rewords this doc comment to clarify it includes various
function pointer types.
Without this, attaching attributes (which use `HasFunctionProto`) to
member function pointers errors with:
```
error: '<attr>' only applies to non-K&R-style functions
```
...which does not really make sense, since member functions are not K&R
functions.
With this change the Arm SME TypeAttrs work correctly on member function
pointers.
Note, however, that not all attributes work correctly when applied to
function pointers or member function pointers. For example,
`alloc_align` crashes when applied to a function pointer (on trunk):
https://godbolt.org/z/YvMhnhKfx (as it only expects a `FunctionDecl` not
a `ParmVarDecl`). The same crash applies to member function pointers
(for the same reason).
(cherry picked from commit 692c9b210728323ac499a402ee6eb901f35856f2)
Fixes building the backtrace support on FreeBSD/NetBSD/OpenBSD/DragonFly and musl
libc with libexecinfo.
(cherry picked from commit cb2598dda1aae5096a77bc8a9f6679ca1b350e5e)
The suppressions mechanism doesn't work reliably in optimized builds,
which turns out to be a known issue (see b87543c704724 / svn r308908).
Disable this test, as it is also testing a feature (alloc/dealloc
mismatch) that is disabled by default on Darwin anyway.
rdar://143830493
(cherry picked from commit 4985804c0608a83f6ab017137c3d3d4f02827774)
Add %env_asan_opts=alloc_dealloc_mismatch=1 since it is disabled by
default.
rdar://143830493
(cherry picked from commit f0d05b099dafda89df4c971b64b2051c33db5da1)
Use the test compiler ID to verify whether tests can be run rather than
the host compiler. This makes it possible to run tests (with Clang)
while the library itself was built with GCC.
(cherry picked from commit 689ef5fda0ab07dfc452cb16d3646d53e612cb75)
Use `gnu::format` attribute only when compiling with Clang, as using it
against variadic template functions is a Clang extension and is not
supported by GCC.
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77958
Fixes #119069
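A minimal sketch of such a guard. The macro `FORMAT_PRINTF` and the
function `formatMessage` are made-up names, not the ones used in the
patch, and the sketch assumes a Clang version that accepts the attribute
on variadic templates.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Apply the attribute only under Clang; expand to nothing elsewhere.
#if defined(__clang__)
#define FORMAT_PRINTF(FmtIdx, FirstArg)                                        \
  [[gnu::format(printf, FmtIdx, FirstArg)]]
#else
#define FORMAT_PRINTF(FmtIdx, FirstArg)
#endif

// Variadic template wrapper around snprintf (hypothetical example).
template <typename... Args>
FORMAT_PRINTF(1, 2)
std::string formatMessage(const char *Fmt, Args... Values) {
  assert(Fmt != nullptr && "format string must not be null");
  char Buffer[128];
  std::snprintf(Buffer, sizeof(Buffer), Fmt, Values...);
  return std::string(Buffer);
}
```

With the guard, Clang can still check format strings at call sites, while
other compilers simply see an unannotated template.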
(cherry picked from commit 359a9131704277bce0f806de31ac887e68a66902)
We don't want to allow partial reductions resulting in a vscale x 1 type
as we can't lower it in the backend.
(cherry picked from commit c7995a6905f2320f280013454676f992a8c6f89f)
If we have +sme but not +sve, we would not set vscale_range on
functions. It should be valid to apply it with the same range with just
+sme, which can help mitigate some performance regressions in cases such
as scalable vector bitcasts (https://godbolt.org/z/exhe4jd8d).
(cherry picked from commit 9f1c825fb62319b94ac9604f733afd59e9eb461b)
This way we don't need to duplicate the list of supported targets in the
release-tasks workflow.
(cherry picked from commit d194c6b9a7fdda7a61abcd6bfe39ab465bf0cc87)
This allows using the full 64 bit range for file offsets.
This should fix the issue reported downstream at
https://github.com/mstorsjo/llvm-mingw/issues/462.
(cherry picked from commit 86e20b00c313e96db3b69d440bfb2ca9063f08f0)
A C++ lambda does not inherit attributes from the parent function. So
the SME builtin diagnostics should look at the lambda's attributes, not
the parent function's.
The fix is very simple and just adds the missing "AllowLambda" flag to
the function decl lookups.
(cherry picked from commit 2b7509e9885c9a5656bb3c201421e146a21fb88e)
Microsoft allows the 'inline' specifier on a typedef of a function type
in C modes. This is used by a system header (ufxclient.h), so instead of
giving a hard error, we diagnose with a warning. C++ mode and non-
Microsoft compatibility modes are not impacted.
Fixes https://github.com/llvm/llvm-project/issues/124869
(cherry picked from commit ef91caec2cf313624829114802cff92ae682e550)
When using PAuthLR, the PAUTH_PROLOGUE expands into a sequence of
instructions which takes the address of one of those instructions, and
uses that address to compute the return address signature. If this is
duplicated, there will be two different addresses used in calculating
the signature, so the epilogue will only be correct for (at most) one of
them.
This change also restricts code generation when using v8.3-A return
address signing, without PAuthLR. This isn't strictly needed, as
duplicating the prologue there would be valid. We could fix this by
having two copies of PAUTH_PROLOGUE, with and without isNotDuplicable,
but I don't think it's worth adding the extra complexity to a security
feature for that.
(cherry picked from commit 36b3c43524c8ca86a5050496b8773f07c5ccddff)
It turns out we weren't handling one case: the value-initialization of a
field inside a struct.
I'm not sure why this falls under `IK_Direct` rather than `IK_Value` in
Clang, but it seems to work.
(cherry picked from commit 20fd7df0b847bb46aac2f0b5b71d242220027cbc)
This PR replaces the deleted ext with the promoted value in `AddrMode`.
Fixes #70938.
(cherry picked from commit 3c6aa04cf4dee65113e2a780b9f90b36bb4c4e04)
The commits were gathered using:
```sh
git log --reverse --oneline llvmorg-20-init..llvm/main \
clang/{lib/StaticAnalyzer,include/clang/StaticAnalyzer} | grep -v NFC | grep -v OpenACC | grep -v -i revert
```
After this I categorized the changes and dropped the less user-facing
commits.
FYI, I also ignored Webkit changes because I assume it's fairly specific
to them, and they likely already know what they ship xD.
I used the `LLVM_ENABLE_SPHINX=ON` and `LLVM_ENABLE_DOXYGEN=ON` cmake
options to enable the `docs-clang-html` build target, which generates
the html into `build/tools/clang/docs/html/ReleaseNotes.html`. I attached
screenshots of it so you can judge whether it looks good or not.
I also used Grammarly this time to check for blatant typos.
---------
Co-authored-by: Donát Nagy <donat.nagy@ericsson.com>