527289 Commits

Author SHA1 Message Date
Simon Pilgrim
085bdb1e4c
[X86] canonicalizeShuffleWithOp - don't bother trying to move shuffles across binops to fold the load. (#126894)
It's not currently used, but is likely to just introduce additional shuffles, resulting in higher Port5 pressure etc. in future patches.
2025-02-12 12:13:11 +00:00
Paul Walker
563d54569e [NFC][LLVM][LangRef] Fix typos within partial.reduce.add documentation. 2025-02-12 11:51:26 +00:00
Jonathan Thackray
036f543952
[AArch64] Pre-commit tests for #125686 (NFC) (#126643)
Update the `generate-tests.py` script to create new tests for `atomicrmw
{fadd,fmin,fmax}` and test these with `half`, `float`, `bfloat` and
`double`.

Generate fp auto-tests to check both with and without `+lsfe`, so that when
#125686 is merged, `+lsfe` will use a single atomic floating-point
instruction.
2025-02-12 11:48:11 +00:00
Frank Schlimbach
0fd50ec9a3
[MLIR][mesh] Mesh fixes (#124724)
A collection of fixes to the mesh dialect
- allow constants in sharding propagation/spmdization
- fixes to tensor replication (e.g. 0d tensors)
- improved canonicalization
- sharding propagation incorrectly generated too many ShardOps
New operation `mesh.GetShardOp` enables exchanging sharding information
(like on function boundaries)
2025-02-12 12:44:48 +01:00
Ivan Butygin
0e779ad499
Revert "[mlir] ArithToLLVM: fix memref bitcast lowering" (#126895)
Reverts llvm/llvm-project#125148

bot failures
2025-02-12 14:34:16 +03:00
Paul Walker
01afa8fc0b
[NFC][LLVM][LangRef] Improve documentation for partial.reduce.add. (#126728) 2025-02-12 11:33:24 +00:00
Ivan Butygin
79010e2e4d
[mlir] ArithToLLVM: fix memref bitcast lowering (#125148)
`arith.bitcast` is allowed on memrefs and such code can actually be
generated by IREE `ConvertBf16ArithToF32Pass`.
`LLVM::detail::vectorOneToOneRewrite` doesn't properly check its types
and will generate bitcast between structs which is illegal.

With the opaque pointers this is a no-op operation for memref so we can
just add type check in `LLVM::detail::vectorOneToOneRewrite` and add a
separate pattern which removes op if converted types are the same.
2025-02-12 14:19:13 +03:00
David Green
bf7af2d12e
[AArch64][DAG] Allow fptos/ui.sat to be scalarized. (#126799)
We were previously running into problems with fp128 types and certain
integer sizes.

Fixes an issue reported on #124984
2025-02-12 11:04:08 +00:00
Donát Nagy
edbc1fb228
[analyzer] Add option assume-at-least-one-iteration (#125494)
This commit adds the new analyzer option
`assume-at-least-one-iteration`, which is `false` by default, but can be
set to `true` to ensure that the analyzer always assumes at least one
iteration in loops.

In some situations this "loop is skipped" execution path is an important
corner case that may evade the notice of the developer and hide
significant bugs -- however, there are also many situations where it's
guaranteed that at least one iteration will happen (e.g. some data
structure is always nonempty), but the analyzer cannot realize this and
will produce false positives when it assumes that the loop is skipped.

This commit refactors some logic around the implementation of the new
feature, but the only functional change is introducing the new analyzer
option. If the new option is left in its default state (false), then the
analysis is functionally equivalent to an analysis done with a version
before this commit.
2025-02-12 11:56:02 +01:00
Mikhail Goncharov
d51750dba1 [bazel] port c03325cead2244ef0a89bb1cf365bddf16021daf 2025-02-12 11:33:49 +01:00
Mikhail Goncharov
5fe37ff75a Revert "[NVPTX] Cleanup/Refactoring in NVPTX AsmPrinter and RegisterInfo (NFC) (#126800)"
This reverts commit 215fa9e175c6ef9e2fa92f77fbd4015cd4c99a67.

getNameOrAsOperand is only defined under DEBUG
2025-02-12 11:13:16 +01:00
Simon Pilgrim
f73ed3d434
[X86] lowerShuffleAsBroadcast - use isShuffleEquivalent to search for a hidden broadcast pattern (#126517)
lowerShuffleAsBroadcast only matches a known-splat shuffle mask, but we
can use the isShuffleEquivalent/IsElementEquivalent helpers to attempt
to find a hidden broadcast-able shuffle pattern.

This requires an extension to IsElementEquivalent to peek through
bitcasts to match against wider shuffles - these typically appear during
shuffle lowering where we've widened a preceding shuffle, often to a
vector concatenation etc.

Amazingly I hit this while yak shaving #126033 .......
2025-02-12 10:12:20 +00:00
Kareem Ergawy
32faf43878
[flang][OpenMP] Handle fixed length characters in delayed privatization (#126704)
We currently handle sequences of fixed-length arrays properly by **not**
emitting length parameters for `embox` ops inside the `omp.private` op.
However, we do not handle the scalar case. This PR extends
`getLengthParameters` defined in `PrivateReductionUtils.cpp` to handle
such cases.

Fixes issue reported in #125732.
2025-02-12 11:04:26 +01:00
Pavel Labath
37f36cbffb
[lldb] Support disassembling discontinuous functions (#126505)
The command already supported disassembling multiple ranges, among other
reasons because inline functions can be discontinuous. The main thing
that was missing was being able to retrieve the function ranges from the
top level function object.

The output of the command for the case where the function entry point is
not its lowest address is somewhat confusing (we're showing negative
offsets), but it is correct.
2025-02-12 10:47:22 +01:00
Nikita Popov
73413bd6a3 [mlir] Add missing dependency
After #126745, we should also depend on the Analysis component.
2025-02-12 10:26:41 +01:00
Yeaseen
c174cc4840
[llvm] Remove br i1 undef in some llvm/test/CodeGen tests (#126811)
This PR replaces some instances of `br i1 undef` with a function argument
value in several tests under the `llvm/test/CodeGen/` directory.
2025-02-12 09:19:00 +00:00
Nikita Popov
c03325cead
[MLIR][LLVMIR] Use TargetFolder when creating globals (#126745)
The LLVM dialect lowers globals using IRBuilder, relying on it creating
constant expressions where possible. As we remove support for more
constant expressions (per
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179),
this can cause issues for cases where the constant expression is no
longer supported, and the operation cannot be constant folded without
DataLayout being available. In particular, I ran into this issue with
flang and the removal of mul constant expressions.

Address this by using TargetFolder when creating globals, which will
perform DL-aware constant folding. I think it would make sense to also
do this in general, but I'm starting with globals where not doing this
can result in translation failures.

Ideally, globals with these problematic expressions would never be
generated in the first place, but there has been little movement on
fixing this (https://github.com/llvm/llvm-project/issues/96047).
2025-02-12 10:14:00 +01:00
Simon Pilgrim
8359dbc8c0
[X86] combineEXTRACT_SUBVECTOR - fold extract_subvector(subv_broadcast_load(ptr),0) -> load(ptr) (#126523)
This is typically handled by SimplifyDemandedVectorElts, but this will
fail when there are multiple uses of the subv_broadcast_load node, but
if there's just one use of the load result (and the rest are uses of the
memory chain), we can still replace with a load and update the chain
accordingly.
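
As a rough illustration of why this fold is value-preserving, here is a toy model that treats vectors as Python lists. The helper names below are hypothetical stand-ins for the DAG nodes, not LLVM APIs, and the memory-chain bookkeeping mentioned above is not modelled.

```python
from typing import List


def load(memory: List[int], ptr: int, width: int) -> List[int]:
    """Plain load of `width` elements starting at `ptr`."""
    return memory[ptr:ptr + width]


def subv_broadcast_load(memory: List[int], ptr: int, width: int, copies: int) -> List[int]:
    """Load a subvector and repeat it `copies` times to fill a wider vector."""
    return load(memory, ptr, width) * copies


def extract_subvector(vec: List[int], index: int, width: int) -> List[int]:
    """Extract `width` elements of `vec` starting at element `index`."""
    return vec[index:index + width]


memory = list(range(16))
wide = subv_broadcast_load(memory, ptr=4, width=4, copies=2)
# Extracting the first subvector gives back exactly what a plain load would.
assert extract_subvector(wide, 0, 4) == load(memory, 4, 4)
```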

Noticed on #126517
2025-02-12 09:07:06 +00:00
Timm Baeder
20506a0a15
[clang][bytecode] Fix operator new source expression (#126870)
... for composite element types. Looks like I forgot this in
e6030d389571b3f1b0f0c5a35b7fa45937ed0f6c
2025-02-12 10:05:49 +01:00
Fraser Cormack
25c0554166
[libclc] Move conversion builtins to the CLC library (#124727)
This commit moves the implementations of conversion builtins to the CLC
library. It keeps the dichotomy of regular vs. clspv implementations of
the conversions. However, for the sake of a consistent interface all CLC
conversion routines are built, even the ones that clspv opts out of in
the user-facing OpenCL layer.

It simultaneously updates the python script to use f-strings for
formatting.
2025-02-12 08:55:02 +00:00
jeanPerier
65075a863b
[flang][FIR] handle argument attributes in fir.call (#126711)
Add pretty printer/parser for fir.call argument/result attributes and
propagate them to llvm.call.

This will allow implementing the TODO about ABI relevant argument
attribute in indirect calls.
2025-02-12 09:49:52 +01:00
Louis Dionne
39f0f0a21b
[libc++] Remove obsolete guards for join_view being experimental (#126697)
These TODOs were forgotten when join_view was made non-experimental. By
removing these checks, we slightly increase the coverage of the test
suite.
2025-02-12 09:44:49 +01:00
Nikita Popov
0abe058d7f
[BOLT] Use getMainExecutable() (#126698)
Use LLVM's getMainExecutable() helper instead of rolling our own. This
will result in standard behavior across platforms, such as making sure
that symlinks are always resolved.
2025-02-12 09:44:26 +01:00
Alex MacLean
215fa9e175
[NVPTX] Cleanup/Refactoring in NVPTX AsmPrinter and RegisterInfo (NFC) (#126800) 2025-02-12 00:23:36 -08:00
Adam Siemieniuk
0b9b014be7
[mlir][dlti] Query by strings (#126716)
Adds DLTI utility to query using strings directly as keys.
2025-02-12 09:13:43 +01:00
Amit Kumar Pandey
46f1bab793
Reapply "[Driver][ROCm][OpenMP] Fix default ockl linking for OpenMP."… (#126671)
- This reverts commit
0c6c4a9993.
  - Add '-mcode-object-version=5' to explicitly use code object
    version 5 to match with 'FAIL' diagnostic.
  - Add Requires directive to support lit test run on platforms
    registered with x86_64 and amdgpu.
2025-02-12 13:40:51 +05:30
Craig Topper
7dd82805d5
[SelectionDAGBuilder] Remove NodeMap updates from getValueImpl. NFC (#126849)
Both callers already put the result in NodeMap immediately after the
call.
2025-02-12 00:07:07 -08:00
Owen Pan
3ca9238cb0 [clang-format][NFC] Fix test case format 2025-02-11 23:58:53 -08:00
Vitaly Buka
be98428374
[NFC][Pipelines] Extract buildCoroConditionalWrapper (#126860)
Helper for #126168.

`Phase` will be used in followup patches.
2025-02-11 23:54:07 -08:00
Matt Arsenault
de968c8e1c
AMDGPU: Use range to implement getSubRegs (#126861)
Fixes #126781
2025-02-12 14:05:42 +07:00
Ethan Luis McDonough
52ee06d273
[PGO][Offload] Fix pgo1.c (#126864)
pgo1.c had outdated test checks
2025-02-12 00:54:31 -06:00
Haohai Wen
ec28e9b757
[MC] Replace MCContext::GenericSectionID with MCSection::NonUniqueID (#126202)
They have the same semantics. NonUniqueID is more friendly for isUnique
implementation in MCSectionELF.

History: 97837b7 added support for unique IDs in sections and added
GenericSectionID. Later, 1dc16c7 added NonUniqueID.
2025-02-12 14:28:37 +08:00
Sam Elliott
d222488007
[AsmParser] Remove OperandMatchResultTy (#126650)
This has been deprecated since a479be0f39a3301e9ca634d37cf6454b6d3865c6
from September 2023, before LLVM 18. Surely now enough release cycles
have happened that it can be removed upstream.
2025-02-11 21:59:05 -08:00
Vikram Hegde
9c725ef368
[AMDGPU][NewPM] Port "GCNRewritePartialRegUses" pass to NPM (#126024) 2025-02-12 11:21:40 +05:30
Ethan Luis McDonough
9e5c136d5a
[PGO][Offload] Profile profraw generation for GPU instrumentation #76587 (#93365)
This pull request is the second part of an ongoing effort to extend PGO
instrumentation to GPU device code and depends on #76587. This PR makes
the following changes:

- Introduces `__llvm_write_custom_profile` to PGO compiler-rt library.
This is an external function that can be used to write profiles with
custom data to target-specific files.
- Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so
that it can write the collected data to a profraw file.
- Adds `PGODump` debug flag and only displays dump when the
aforementioned flag is set
2025-02-11 23:30:54 -06:00
Kazu Hirata
84e3c6ff95 [RISCV] Fix a warning
This patch fixes:

  llvm/lib/Target/RISCV/RISCVVMV0Elimination.cpp:91:29: error: unused
  variable 'TRI' [-Werror,-Wunused-variable]
2025-02-11 20:37:48 -08:00
Abhishek Kaushik
df2dca7a73
[MC] Use std::move to avoid copy (#126700) 2025-02-12 10:01:30 +05:30
Jim Lin
31bfae35d2
[DAGCombiner] Add hasOneUse checks for folding (not (add X, -1)) to (neg X) (#126667)
To get better codegen for AArch with bic, x86 with andn and riscv
with andn.
2025-02-12 12:24:29 +08:00
LLVM GN Syncbot
caa9fae2e7 [gn build] Port cc7e83601d75 2025-02-12 04:14:13 +00:00
Hongtao Yu
4a63ff4330
Revert "[mlir] Enable LICM for ops with only read side effects in scf.for" (#126840)
Reverts llvm/llvm-project#120302
2025-02-11 20:07:21 -08:00
Luke Lau
cc7e83601d
[RISCV] Select mask operands as virtual registers and eliminate uses of vmv0 (#125026)
This is another attempt at #88496 to keep mask operands in SSA after
instruction selection.

Previously we selected the mask operands into vmv0, a singleton register
class with exactly one register, V0.

But the register allocator doesn't really support singleton register
classes and we ran into errors like "ran out of registers during
register allocation in function".

This patch avoids the problem by introducing a pass just before register
allocation that converts any use of vmv0 to a copy to $v0, i.e. what isel
does today.

That way the register allocator doesn't need to deal with the singleton
register class, but we get the benefits of having the mask registers in
SSA throughout the backend:

- This allows RISCVVLOptimizer to reduce the VLs of instructions that
define mask registers
- It enables CSE and code sinking in more places
- It removes the need to peek through mask copies in RISCVISelDAGToDAG
and keep track of V0 defs in RISCVVectorPeephole

This patch initially eliminates uses of vmv0s after RISCVVectorPeephole
to keep the diff to a minimum, and a follow up patch will move it past
the other MachineInstr SSA passes.

Note that it doesn't try to remove any defs of vmv0 as we shouldn't have
any instructions that have any vmv0 outputs.

As a further follow up, we can move the elimination pass to after phi
elimination and outside of SSA, which would unblock the pre-RA scheduler
around masked pseudos. This might also help the issue that
RISCVVectorMaskDAGMutation tries to solve.
2025-02-12 12:06:55 +08:00
Miguel A. Arroyo
acd34d90d3
[Clang][CMake][MSVC] Install PDBs alongside executables (#126675)
* Follows up on https://github.com/llvm/llvm-project/pull/120683
enabling PDBs for `clang`.
2025-02-11 19:41:34 -08:00
Jie Fu
a0fbc19ad6 [MemorySanitizer] Silence an unused-variable warning (NFC)
/llvm-project/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp:2622:22:
 error: unused variable 'ReturnType' [-Werror,-Wunused-variable]
    FixedVectorType *ReturnType = cast<FixedVectorType>(I.getType());
                     ^
1 error generated.
2025-02-12 11:32:51 +08:00
Thurston Dang
bfbe5319a8
[msan] Add handlePairwiseShadowOrIntrinsic and use it to handle Arm NEON pairwise add (#126008)
This patch adds a function, handlePairwiseShadowOrIntrinsic that ORs
pairs of adjacent shadow values; this is suitable for propagating shadow
for 1- or 2-vector intrinsics that combine adjacent fields. It then
applies handlePairwiseShadowOrIntrinsic to Arm NEON pairwise add:
llvm.aarch64.neon.{addhn, raddhn} (currently incorrectly handled) and
llvm.aarch64.neon.{saddlp, uaddlp} (currently suboptimally handled).
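
The pairwise OR described above can be sketched roughly as follows, modelling shadow vectors as Python lists of integers in which a nonzero bit marks an uninitialized bit. `pairwise_shadow_or` is a hypothetical stand-in, not the MemorySanitizer code, and the sketch assumes that in the 2-vector case the fields of the second operand simply follow those of the first.

```python
from typing import List, Optional


def pairwise_shadow_or(a: List[int], b: Optional[List[int]] = None) -> List[int]:
    """OR each pair of adjacent shadow fields of one or two input vectors."""
    fields = a + (b or [])     # assumption: the two operands are concatenated
    if len(fields) % 2 != 0:
        raise ValueError("expected an even number of shadow fields")
    return [fields[i] | fields[i + 1] for i in range(0, len(fields), 2)]


# One-vector case: a poisoned field taints the result it contributes to.
assert pairwise_shadow_or([0x00, 0xFF, 0x00, 0x00]) == [0xFF, 0x00]
# Two-vector case: the result has as many fields as one input vector.
assert pairwise_shadow_or([0x01, 0x02], [0x04, 0x00]) == [0x03, 0x04]
```

The point of ORing is that an output field is treated as uninitialized whenever either of the two adjacent inputs feeding it is.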

Updates the tests from https://github.com/llvm/llvm-project/pull/125820.
2025-02-11 19:13:18 -08:00
c8ef
6de4de8931
[libc] implement endian related macros (#126368)
Follow up of #125168.

This patch adds endian-related macros to `endian.h`. We utilize compiler
built-ins for byte swap functions, which are already included in our
minimal supported compiler version.
2025-02-12 10:17:09 +08:00
lonely eagle
82cbb02cbc
[mlir][vector][NFC] Fix typos in tests (#126662)
[mlir][vector] Fix typos in tests (nfc)

Fix typos in `{insert|extract}_scalar_from_vec_2d_f32_dynamic_idxs_compile_time_constant` - the intention was to use `f32` rather than `i32`.
2025-02-12 10:05:32 +08:00
donald chen
f15a6c99fa
[mlir] [DataFlow] Fix bug in int-range-analysis (#126708)
When querying the lower bound and upper bound of loop to update the
value range of a loop iteration variable, the program point to depend on
should be the block corresponding to the iteration variable rather than
the loop operation.
2025-02-12 09:58:56 +08:00
Christopher Ferris
9db0f91ceb
[scudo] Modify header corruption error message (#126812)
Update the error message to be explicit that this is likely due to
memory corruption.

In addition, check if the chunk header is all zero, which could mean
corruption or an attempt to free a pointer after the memory has been
released to the kernel. This case results in a slightly different error
message to also indicate this could still be a double free.
2025-02-11 17:41:15 -08:00
Krzysztof Drewniak
934c97dd16
[LowerBufferFatPointers] Fix support for GEP T, p7, <N x T> idxs (#126126)
The lowering for GEP didn't properly support the case where the pointer
argument was being implicitly broadcast by a vector of indices. Fix
that.

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-02-11 18:22:50 -06:00
Michael Jones
574ccc6d1b
[libc] fix get_epoch constexpr error (#126818)
get_epoch calls mktime_internal which isn't constexpr. For now, just
remove the constexpr from get_epoch.
2025-02-11 16:08:20 -08:00