526954 Commits

Author SHA1 Message Date
Razvan Lupusoru
1c583c19bb
[acc][mlir] Add functionality for categorizing OpenACC variable types (#126167)
OpenACC specification describes the following type categories: scalar,
array, composite, and aggregate (which includes arrays, composites, and
others such as Fortran pointer/allocatable).

The decision on how to do implicit mapping depends on a variable's
category. Since the acc dialect's only means of distinguishing between
types is through the attached interfaces, add an API to get the type
category.

In addition to defining the new API, attempt to provide a base
implementation for memref which matches what OpenACC spec describes.
2025-02-10 08:03:38 -08:00
Nico Weber
308d28667c
[llvm][docs] Tweak backporting instructions a bit (#126519)
* Drop ".Z" in milestone name since we've been doing X.Y releases
instead of X.Y.Z releases since LLVM 18

* Add "LLVM" prefix since that's what release milestones are named

* Use a numbered list to make it clearer that there are two steps
needed, and add some more details to the first step
2025-02-10 10:58:16 -05:00
Nico Weber
783275eb7b
[clang] Handle f(no-)strict-overflow, f(no-)wrapv, f(no-)wrapv-pointer like gcc (#126524)
We now process all 6 options left-to-right and pick whatever is active
at the end.

Fixes #124868.
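
For illustration only, the "process left-to-right, last one wins" rule can be sketched as below. This is not clang's driver code: `OverflowFlags` and `resolveOverflowFlags` are hypothetical names, and the sketch assumes the GCC convention that `-fno-strict-overflow` acts like `-fwrapv -fwrapv-pointer` while `-fstrict-overflow` acts like `-fno-wrapv -fno-wrapv-pointer`.

```cpp
#include <string>
#include <vector>

// Hypothetical model of the final state; not clang's actual data structure.
struct OverflowFlags {
  bool WrapV = false;        // signed integer overflow wraps
  bool WrapVPointer = false; // pointer arithmetic overflow wraps
};

// Walk the options left to right; each option overwrites the state it
// controls, so whatever is set at the end is what stays active.
inline OverflowFlags resolveOverflowFlags(const std::vector<std::string> &Args) {
  OverflowFlags F;
  for (const std::string &A : Args) {
    if (A == "-fwrapv") {
      F.WrapV = true;
    } else if (A == "-fno-wrapv") {
      F.WrapV = false;
    } else if (A == "-fwrapv-pointer") {
      F.WrapVPointer = true;
    } else if (A == "-fno-wrapv-pointer") {
      F.WrapVPointer = false;
    } else if (A == "-fno-strict-overflow") {
      // Assumed GCC-style meaning: enables both wrapping modes.
      F.WrapV = true;
      F.WrapVPointer = true;
    } else if (A == "-fstrict-overflow") {
      // Assumed GCC-style meaning: disables both wrapping modes.
      F.WrapV = false;
      F.WrapVPointer = false;
    }
  }
  return F;
}
```

Under these assumptions, `-fwrapv -fstrict-overflow` ends with wrapping off, while `-fstrict-overflow -fwrapv` ends with wrapping on.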
2025-02-10 10:57:22 -05:00
Kazu Hirata
6228379a6c
[llvm-profgen] Avoid repeated hash lookups (NFC) (#126467) 2025-02-10 07:50:57 -08:00
Kazu Hirata
2f88672414
[Coroutines] Avoid repeated hash lookups (NFC) (#126466) 2025-02-10 07:50:32 -08:00
Kazu Hirata
de563951b7
[Analysis] Avoid repeated hash lookups (NFC) (#126465) 2025-02-10 07:50:12 -08:00
Kazu Hirata
ba9810e803
[TableGen] Avoid repeated hash lookups (NFC) (#126464) 2025-02-10 07:49:42 -08:00
Kazu Hirata
eaedfc0e52
[Lex] Avoid repeated hash lookups (NFC) (#126462) 2025-02-10 07:49:17 -08:00
Kazu Hirata
280d2a3035
[AST] Avoid repeated hash lookups (NFC) (#126461) 2025-02-10 07:48:57 -08:00
Luke Lau
36530414e3
[RISCV][VLOPT] Add support for Vector Fixed-Point Arithmetic Instructions (#126483)
This patch adds the remaining support for fixed-point arithmetic
instructions (we previously had support for averaging adds and
subtracts).

For saturating adds/subs/multiplies/clips, we can't change `vl` if
`vxsat` is used, since changing `vl` may change its value. So this patch
checks to see if it's dead before considering it a candidate.
2025-02-10 23:43:16 +08:00
Ramkumar Ramachandra
3019e49ebf
SCEV: thread samesign in isBasicBlockEntryGuardedByCond (NFC) (#125840)
isBasicBlockEntryGuardedByCond inadvertently drops samesign information
when calling ICmpInst::getNonStrictPredicate. Fix this.
2025-02-10 14:47:13 +00:00
zhijian lin
ec60e1d8e2
[XCOFF][llvm-readobj] Print symbol value kind when dumping symbols (#125861)
Make llvm-readobj print the symbol value name for the XCOFF symbol table.

reference doc:
https://www.ibm.com/docs/en/aix/7.2?topic=formats-xcoff-object-file-format#XCOFF__yaa3i18fjbau
2025-02-10 09:37:04 -05:00
Timm Baeder
1aa48af1f8
[clang][bytecode][NFC] Discard all CastExprs uniformly (#126511) 2025-02-10 15:11:01 +01:00
Haojian Wu
4d2a1bf563
[clang] CTAD alias: Respect explicit deduction guides defined after the first use of the alias template. (#125478)
Fixes #103016

This is the last missing piece for the C++20 CTAD alias feature. No
release note being added in this PR yet, I will send out a follow-up
patch to mark this feature done.

(Since the release 20 branch has been cut, I think we should target
clang 21.)
2025-02-10 15:05:49 +01:00
Nico Weber
71adb05402
[clang] Expose -f(no-)strict-overflow as a clang-cl option (#126512)
Also move the -fno-strict-overflow option definition next to the
-fstrict-overflow one while here.

Also add test coverage for f(no-)wrapv-pointer being a clang-cl option.
2025-02-10 09:00:31 -05:00
Ramkumar Ramachandra
c6b13a2871
Revert "SCEV: teach isImpliedViaOperations about samesign" (#126506)
Commit f5d24e6c is buggy, and the following miscompiles have been
reported: #126409 and
https://github.com/llvm/llvm-project/pull/124270#issuecomment-2647222903

Revert it while we investigate.
2025-02-10 13:31:18 +00:00
Luke Lau
af2a228e0b
[RISCV][VLOPT] Fix passthru operand info for mixed-width instructions (#126504)
After #124066 we started allowing users that are passthrus. However for
widening/narrowing instructions we were returning the wrong operand info
for passthru operands since it originally assumed the operand would
never be a passthru. This fixes it by handling it in IsMODef.
2025-02-10 21:30:05 +08:00
Timm Baeder
199c791a1d
[clang][bytecode] Support partial initializers for CXXNewExprs (#126494)
For `new A[N]{1,2,3}`, we need to allocate N elements of type A, and
initialize the first three with the given InitListExpr elements.
However, if N is larger than 3, we need to initialize the remaining
elements with the InitListExpr array filler.

Similarly, for `new A[N];`, we need to initialize all fields with the
constructor of A. The initializer type is a CXXConstructExpr of
IncompleteArrayType in this case, which we can't generally handle.
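
A small sketch of the language behaviour described in the first paragraph, using `int` instead of a class type `A` for brevity. `makePartiallyInitialized` is a hypothetical helper, not part of the patch, and it assumes N is at least 3.

```cpp
#include <cstddef>
#include <memory>

// Allocates N ints (N must be at least 3). The first three elements come
// from the braced list; the remaining N - 3 elements are value-initialized,
// which for int means zero. That remainder is what the array filler covers.
inline std::unique_ptr<int[]> makePartiallyInitialized(std::size_t N) {
  return std::unique_ptr<int[]>(new int[N]{1, 2, 3});
}
```

With N = 5 the array holds 1, 2, 3, 0, 0.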
2025-02-10 14:28:40 +01:00
Shilei Tian
bde8ce6a5c
[AMDGPU] Only run AMDGPUPrintfRuntimeBindingPass at non-prelink phase (#125162) 2025-02-10 08:24:50 -05:00
Simon Pilgrim
121e6abefd
[X86] IsElementEquivalent - pull out repeated getValueType calls. NFC. 2025-02-10 13:23:01 +00:00
Mikhail R. Gadelha
83fa117f76
[RISCV] Add cost model for fma (#126076)
This change builds on PR #125683, which added a cost model for fmuladd.

To ensure completeness, this patch extends the cost model to also cover fma, using the same costing approach as fmuladd.

I plan to send a follow-up patch that includes the cost model for vp_fma and vp_fmuladd, and their tests.
2025-02-10 10:11:28 -03:00
Donát Nagy
729416e586
[analyzer][NFC] Remove "V2" from ArrayBoundCheckerV2.cpp (#126094)
Previously commit 6e17ed9b04e5523cc910bf171c3122dcc64b86db deleted the
obsolete checker `alpha.security.ArrayBound` which was implemented in
`ArrayBoundChecker.cpp` and renamed the checker
`alpha.security.ArrayBoundV2` to `security.ArrayBound`.

This commit concludes that consolidation by renaming the source file
`ArrayBoundCheckerV2.cpp` to `ArrayBoundChecker.cpp` (which was "freed
up" by the previous commit).
2025-02-10 13:25:07 +01:00
Rolf Morel
f796bc622a
[MLIR][Linalg] Expose linalg.matmul and linalg.contract via Python API (#126377)
Now that linalg.matmul is in tablegen, "hand write" the Python wrapper
that OpDSL used to derive. Similarly, add a Python wrapper for the new
linalg.contract op.

Required the following misc. fixes:
1) make linalg.matmul's parsing and printing consistent w.r.t. whether
indexing_maps occurs before or after operands, i.e. per the test cases
it comes _before_.
2) tablegen for linalg.contract did not state it accepted an optional
cast attr.
3) In ODS's C++-generating code, expand partial support for `$_builder`
access in `Attr::defaultValue` to full support. This enables access to
the current `MlirContext` when constructing the default value (as is
required when the default value consists of affine maps).
2025-02-10 12:05:13 +00:00
Luke Lau
771f6b9f43
[RISCV][VLOPT] Add support for Widening Floating-Point Fused Multiply-Add Instructions (#126485)
We already had getOperandInfo support, so this marks the instructions as
supported in isCandidate. It also adds support for vfwmaccbf16.v{v,f}
from zvfbfwma
2025-02-10 19:55:22 +08:00
Luke Lau
71ee257a1d
[RISCV][VLOPT] Precommit tests for opt info on passthrus. NFC
Currently we are returning the wrong operand info for passthru
operands.
2025-02-10 19:47:04 +08:00
ZhaoQi
0b5c318127
[LoongArch] Merge base and offset for tls-le code sequence (#122999)
Adapt the merge base offset pass to optimize the tls-le code sequence.
2025-02-10 19:44:24 +08:00
Ramkumar Ramachandra
738cf5acc6
InstSimplify: improve computePointerICmp (NFC) (#126255)
The comment about inbounds protecting only against unsigned wrapping is
incorrect: it also protects against signed wrapping, but the issue is
that it could cross the sign boundary.
2025-02-10 11:42:06 +00:00
wldfngrs
7ee56b9afc
[libc][math][c23] Add asinf16() function (#124212)
Co-authored-by: OverMighty <its.overmighty@gmail.com>
2025-02-10 12:38:55 +01:00
Simon Pilgrim
65a92544f7
[X86] canonicalizeShuffleWithOp - pull out repeated flag settings to IsMergeableWithShuffle lambda. NFC.
Prep work before tweaking the flags in a future patch.
2025-02-10 11:31:58 +00:00
Simon Pilgrim
d9183fd96e
[X86] LowerSelect - use BLENDV for scalar selection on all SSE41+ targets (#125853)
When we first began (2015) to lower f32/f64 selects to
X86ISD::BLENDV(scalar_to_vector(),scalar_to_vector(),scalar_to_vector()),
we limited it to AVX targets to avoid issues with SSE41's xmm0
constraint for the condition mask.

Since then we've seen general improvements in TwoAddressInstruction and
better handling of condition commutation for X86ISD::BLENDV nodes, which
should address many of the original concerns of using SSE41 BLENDVPD/S.
In most cases we will replace 3 logic instructions with the BLENDV node
and (up to 3) additional moves. Although the BLENDV is often more
expensive on original SSE41 targets, this should still be an improvement
in a majority of cases.

We also have no equivalent restrictions for SSE41 for v2f64/v4f32 vector
selection.

Fixes #105807
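
As a rough scalar model of what the blend replaces: the commit message does not name the three logic instructions, so this sketch assumes they are the usual AND / AND-NOT / OR mask select. `selectViaLogic` is a hypothetical name and operates on plain float bit patterns rather than real SSE registers.

```cpp
#include <cstdint>
#include <cstring>

// Bitwise select on float bit patterns using AND, AND-NOT and OR.
// Mask must be all ones (pick A) or all zeros (pick B), which is the
// shape of mask an SSE floating-point compare produces.
inline float selectViaLogic(std::uint32_t Mask, float A, float B) {
  std::uint32_t ABits, BBits;
  std::memcpy(&ABits, &A, sizeof ABits);
  std::memcpy(&BBits, &B, sizeof BBits);
  std::uint32_t RBits = (Mask & ABits) | (~Mask & BBits);
  float R;
  std::memcpy(&R, &RBits, sizeof R);
  return R;
}
```

A BLENDV-style instruction performs the same per-element choice in a single operation, at the cost of the extra moves mentioned above.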
2025-02-10 11:24:04 +00:00
David Spickett
f845497f3b
[llvm][Docs] Explain how to handle excessive formatting changes (#126239)
Based on some feedback in Discord about a PR where a reviewer asked the
author to move the formatting changes to a new PR, which appears to
contradict the current form of this document.

I've added an explanation here, before the point where the author would
be committing any of the formatting changes.

There are other ways this can go, for example some projects don't want
the churn of formatting, or you can pre-emptively send a formatting PR,
but I don't think enumerating them all here will help the audience for
this text.

So I've recommended one path that will start them off well, and can
branch off if the reviewers make requests.
2025-02-10 10:32:45 +00:00
Aniket Lal
cab893ab8e
[Clang][Driver][HIP] Do not specify explicit target cpu in host compilation run line (#126488)
This PR fixes the post-merge check failures from PR
https://github.com/llvm/llvm-project/pull/125646

Co-authored-by: anikelal <anikelal@amd.com>
2025-02-10 10:24:13 +00:00
Fraser Cormack
4dec3909e9
[libclc] Have all targets build all CLC functions (#124779)
This removes all remaining SPIR-V workarounds for CLC functions, in an
effort to streamline the CLC implementation and prevent further issues
that #124614 had to fix. This commit fixes the same issue for the SPIR-V
targets.

Target-specific CLC implementations can and will exist, but for now
they're all identical and so the target-specific SOURCES files have been
removed. Target implementations now always include the 'generic' CLC
directory, meaning we can avoid unnecessary duplication of SOURCES
listings.
2025-02-10 10:19:22 +00:00
Jan Patrick Lehr
6fd99de318
Revert "[LinkerWrapper] Clean up options after proper forwarding" (#126495)
Reverts llvm/llvm-project#126297

Broken buildbots
https://lab.llvm.org/staging/#/builders/105/builds/15554
https://lab.llvm.org/buildbot/#/builders/30/builds/15490

Error is
```
# .---command stderr------------
# | FileCheck error: '/work/janplehr/git/llvm-project/bot-tester-builds/cmakecachebuild/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/bug51781.c.tmp.custom' is empty.
# | FileCheck command line:  /home/janplehr/git/llvm-project/bot-tester-builds/cmakecachebuild/./bin/FileCheck /work/janplehr/git/llvm-project/offload/test/offloading/bug51781.c -check-prefix=CUSTOM -input-file=/work/janplehr/git/llvm-project/bot-tester-builds/cmakecachebuild/runtimes/runtimes-bins/offload/test/amdgcn-amd-amdhsa/offloading/Output/bug51781.c.tmp.custom
```
The file is empty, while the `CUSTOM` check-target expects to find
```
// CUSTOM: Rewriting generic-mode kernel with a customized state machine.
```
2025-02-10 10:58:56 +01:00
Amir Bishara
7090dff6fe
[mlir][scf]: Add value bound for the computed upper bound of for loop (#126426)
Add additional bound for the induction variable of the `scf.for` such
that:
`%iv <= %lower_bound + (%trip_count - 1) * step`
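
A worked check of the bound in plain C++, assuming a positive step and a loop of the form `for (iv = lb; iv < ub; iv += step)`. `tripCount` and `maxInductionValue` are hypothetical helpers, not MLIR APIs.

```cpp
#include <cstdint>

// Number of iterations of `for (iv = lb; iv < ub; iv += step)`,
// assuming step > 0.
inline std::int64_t tripCount(std::int64_t lb, std::int64_t ub,
                              std::int64_t step) {
  return ub <= lb ? 0 : (ub - lb + step - 1) / step;
}

// Largest value the induction variable takes, assuming the loop runs at
// least once. This is the right-hand side of the bound above.
inline std::int64_t maxInductionValue(std::int64_t lb, std::int64_t ub,
                                      std::int64_t step) {
  return lb + (tripCount(lb, ub, step) - 1) * step;
}
```

For lb = 0, ub = 10, step = 3 the loop visits 0, 3, 6, 9: the trip count is 4 and the bound evaluates to 0 + (4 - 1) * 3 = 9, which is tighter than `ub - 1`.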
2025-02-10 11:35:02 +02:00
Nikita Popov
2d31a12dbe
[DSE] Don't use initializes on byval argument (#126259)
There are two ways we can fix this problem, depending on how the
semantics of byval and initializes should interact:

* Don't infer initializes on byval arguments. initializes on byval
refers to the original caller memory (or having both attributes is made
a verifier error).
* Infer initializes on byval, but don't use it in DSE. initializes on
byval refers to the callee copy. This matches the semantics of readonly
on byval. This is slightly more powerful, for example, we could do a
backend optimization where byval + initializes will allocate the full
size of byval on the stack but not copy over the parts covered by
initializes.

I went with the second variant here, skipping byval + initializes in DSE
(FunctionAttrs already doesn't propagate initializes past byval). I'm
open to going in the other direction though.

Fixes https://github.com/llvm/llvm-project/issues/126181.
2025-02-10 10:34:03 +01:00
Cullen Rhodes
317a644ae6
[SDAG] Precommit tests for #126207 (NFC) (#126208)
Add missing test coverage for codepaths touched by #126207.
2025-02-10 09:13:02 +00:00
David Green
b3e74e307f
[AArch64] Add SUBHN patterns for xor variant (#126100)
`xor x, -1` can be treated as `sub -1, x`, add patterns for generating
subhn as opposed to a not.

Fixes #123999
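
The identity behind the pattern, sketched on 16-bit scalars. `notViaSub` and `subhnLane` are hypothetical helpers that model a single lane narrowing from 16 to 8 bits; they are not the actual TableGen patterns.

```cpp
#include <cstdint>

// `xor x, -1` (bitwise NOT) written as the subtraction `sub -1, x`.
inline std::uint16_t notViaSub(std::uint16_t X) {
  return static_cast<std::uint16_t>(0xFFFFu - X);
}

// Scalar model of one SUBHN lane (16-bit to 8-bit): subtract, then keep
// only the high half of the result.
inline std::uint8_t subhnLane(std::uint16_t A, std::uint16_t B) {
  return static_cast<std::uint8_t>(static_cast<std::uint16_t>(A - B) >> 8);
}
```

Taking the high half of `~x` therefore gives the same value as `subhnLane(0xFFFF, x)`, which is why a separate not is unnecessary.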
2025-02-10 09:09:14 +00:00
Nikita Popov
7aed53eb19
[ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge() (#126236)
The code already guards against values coming from a previous iteration
using properlyDominates(). However, addrecs are considered to properly
dominate the loop they are defined in.

Handle this special case separately, by checking for expressions that
have computable loop evolution (this should cover cases like a zext of
an addrec as well).

I considered changing the definition of properlyDominates() instead, but
decided against it. The current definition is useful in other contexts,
e.g. when deciding whether an expression is safe to expand in a given
block.

Fixes https://github.com/llvm/llvm-project/issues/126012.
2025-02-10 10:07:21 +01:00
Brad Smith
52a02b6d1e
[openmp] Fix for 32-bit PowerPC (#126412) 2025-02-10 04:04:26 -05:00
ZhaoQi
91682da438
[LoongArch] Pre-commit tests for tls-le merge base offset. NFC (#122998)
Similar to tests in `merge-base-offset.ll`, except for tests of
blockaddress.

A later commit will optimize this.
2025-02-10 16:40:07 +08:00
Aniket Lal
d9cdf27834
[Driver][HIP] Do not pass -dependency-file flag for HIP Device offloading (#125646)
When we launch hipcc with multiple offload architectures along with -MF
dep_file flag, the clang compilation invocations for host and device
offloads write to the same dep_file, and can lead to collision during
file IO operations. This can typically happen during large workloads.
This commit provides a fix to generate dep_file only in host
compilation.

---------

Co-authored-by: anikelal <anikelal@amd.com>
2025-02-10 13:57:52 +05:30
Ricardo Jesus
5f84b6edd9
[AArch64] Add MATCH loops to LoopIdiomVectorizePass (#101976)
This patch adds a new loop to LoopIdiomVectorizePass, enabling it to
recognise and vectorise loops such as:
```cpp
template<class InputIt, class ForwardIt>
InputIt find_first_of(InputIt first, InputIt last,
                      ForwardIt s_first, ForwardIt s_last)
{
  for (; first != last; ++first)
    for (ForwardIt it = s_first; it != s_last; ++it)
      if (*first == *it)
        return first;
  return last;
}
```

These loops match the C++ standard library function `std::find_first_of`.
2025-02-10 08:23:34 +00:00
Mehdi Amini
67b7a2590f
Revert "[mlir] Python: Parse ModuleOp from file path" (#126482)
Reverts llvm/llvm-project#125736

The gcc7 Bot is broken at the moment.
2025-02-10 09:09:58 +01:00
David Stuttard
30e7c10146
[AMDGPU] - Fix non-deterministic compile issue (#126271)
4ce1f9079d4d3 [AMDGPU] Allow rematerialization of instructions with
virtual register uses (#124327) made changes that require an ordered
traversal of a DenseMap. Change it to MapVector, which respects
insertion order.
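
To illustrate why an insertion-ordered container gives deterministic traversal, here is a minimal standard-library sketch of the idea. `OrderedMap` is a hypothetical class written for this note; it is not `llvm::MapVector` and does not mirror its API.

```cpp
#include <cstddef>
#include <unordered_map>
#include <utility>
#include <vector>

// Minimal insertion-ordered map: a hash index for lookup plus a vector
// that records entries in the order they were first inserted.
template <typename K, typename V> class OrderedMap {
  std::unordered_map<K, std::size_t> Index; // key -> position in Entries
  std::vector<std::pair<K, V>> Entries;     // entries in insertion order

public:
  using const_iterator = typename std::vector<std::pair<K, V>>::const_iterator;

  // Returns the value for Key, appending a default-constructed one if absent.
  V &operator[](const K &Key) {
    auto It = Index.find(Key);
    if (It == Index.end()) {
      It = Index.emplace(Key, Entries.size()).first;
      Entries.emplace_back(Key, V());
    }
    return Entries[It->second].second;
  }

  // Iteration walks the vector, so the order never depends on hashing.
  const_iterator begin() const { return Entries.begin(); }
  const_iterator end() const { return Entries.end(); }
};
```

Iterating a plain hash map instead can visit keys in an order that depends on hash values (for pointer keys, on addresses), which is how the output can differ from run to run.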
2025-02-10 07:58:02 +00:00
Piotr Fusik
3a66ebae06
[BoundsSafety][doc] Fix a typo (#126247) 2025-02-10 15:55:27 +08:00
Sam Elliott
aebe6c5d7f
[RISCV] Improve Errors for X1/X5/X1X5 Reg Classes (#126184)
LLVM has functionality for producing a register-class-specific error
message in the assembly parser, rather than just emitting the generic
"invalid operand for instruction" error.

This starts the gradual adoption of this functionality for RISC-V, with
some lesser-used shadow-stack register classes:
- GPRX1 (only contains `ra`)
- GPRX5 (only contains `t0`)
- GPRX1X5 (only contains `ra` and `t0`)

LLVM is reasonably conservative about when these errors are used: in
particular, all the features for the relevant mnemonic must be enabled
before it will emit them, hence the test updates.

This also merges a pair of almost identical rv32/rv64 test files into a
single file with one run line.
2025-02-09 21:35:32 -08:00
Shilei Tian
70fdd9f0a2
[GlobalISel] Check whether G_CTLZ is legal in matchUMulHToLShr (#126457)
We need to check `G_CTLZ` because the combine uses `G_CTLZ` to get log
base 2, and it is not always legal on a target.

Fixes SWDEV-512440.
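
A scalar sketch of the arithmetic involved, assuming the combine rewrites an unsigned multiply-high by a power of two into a logical shift right. All function names here are hypothetical and 32-bit values are used for concreteness; this is not the GlobalISel code.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros for a nonzero 32-bit value.
inline unsigned countLeadingZeros32(std::uint32_t X) {
  assert(X != 0 && "ctlz of zero is not handled in this sketch");
  unsigned N = 0;
  for (std::uint32_t Bit = 0x80000000u; (X & Bit) == 0; Bit >>= 1)
    ++N;
  return N;
}

// log2 of a power of two, computed as (bit width - 1) - ctlz.
inline unsigned log2OfPowerOfTwo(std::uint32_t X) {
  return 31u - countLeadingZeros32(X);
}

// High 32 bits of the full 64-bit product.
inline std::uint32_t umulh(std::uint32_t X, std::uint32_t Y) {
  return static_cast<std::uint32_t>((static_cast<std::uint64_t>(X) * Y) >> 32);
}

// Same result via a shift, valid when PowerOfTwo is a power of two >= 2
// (a multiplier of 1 would need a shift by the full bit width).
inline std::uint32_t umulhViaShift(std::uint32_t X, std::uint32_t PowerOfTwo) {
  assert(PowerOfTwo >= 2 && "shift amount would equal the bit width");
  return X >> (32u - log2OfPowerOfTwo(PowerOfTwo));
}
```

If the target cannot legally lower the count-leading-zeros step, the shift amount cannot be computed this way, which is the situation the legality check guards against.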
2025-02-10 00:11:09 -05:00
Brad Smith
55632404bd
[benchmark] Sync a few commits from upstream to help with CPU count (#126410)
Try to use the _SC_NPROCESSORS_ONLN sysconf elsewhere
(cherry picked from commit edb1e76d8cb080a396c7c992e5d4023e1a777bd1)

Replace usage of deprecated sysctl on macOS
(cherry picked from commit faaa266d33ff203e28b31dd31be9f90c29f28d04)

Retrieve the number of online CPUs on OpenBSD and NetBSD
(cherry picked from commit 41e81b1ca4bbb41d234f2d0f2c56591db78ebb83)

Update error message now that /proc/cpuinfo is no longer in use
(cherry picked from commit c35af58b61daa111c93924e0e7b65022871fadac)

Fix runtime crash when parsing /proc/cpuinfo fails
(cherry picked from commit 39be87d3004ff9ff4cdf736651af80c3d15e2497)

another reversal of something that breaks on wasm
(cherry picked from commit 44507bc91ff9a23ad8ad4120cfc6b0d9bd27e2ca)
2025-02-10 00:06:25 -05:00
Mikołaj Piróg
161cfc6f39
[AVX10.2] Fix wrong intrinsic names after rename (#126390)
In my previous PR (#123656) to update the names of AVX10.2 intrinsics
and mnemonics, I erroneously deleted `_ph` from a few intrinsics.
This PR corrects this.
2025-02-10 12:48:02 +08:00