517408 Commits

Author SHA1 Message Date
Yingwei Zheng
0b9f1cc024
[SCEV] Disallow simplifying phi(undef, X) to X (#115109)
See the following case:
```
@GlobIntONE = global i32 0, align 4

define ptr @src() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  %retval.0 = phi ptr [ %retval.2.peel, %entry.peel.newph ], [ %retval.2, %cleanup ]
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  %retval.2 = phi ptr [ %retval.0, %for.body ], [ @GlobIntONE, %cleanup.loopexit ]
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}

define ptr @tgt() {
entry:
  br label %for.body.peel.begin

for.body.peel.begin:                              ; preds = %entry
  br label %for.body.peel

for.body.peel:                                    ; preds = %for.body.peel.begin
  br i1 true, label %cleanup.peel, label %cleanup.loopexit.peel

cleanup.loopexit.peel:                            ; preds = %for.body.peel
  br label %cleanup.peel

cleanup.peel:                                     ; preds = %cleanup.loopexit.peel, %for.body.peel
  %retval.2.peel = phi ptr [ undef, %for.body.peel ], [ @GlobIntONE, %cleanup.loopexit.peel ]
  br i1 true, label %for.body.peel.next, label %cleanup7

for.body.peel.next:                               ; preds = %cleanup.peel
  br label %for.body.peel.next1

for.body.peel.next1:                              ; preds = %for.body.peel.next
  br label %entry.peel.newph

entry.peel.newph:                                 ; preds = %for.body.peel.next1
  br label %for.body

for.body:                                         ; preds = %cleanup, %entry.peel.newph
  br i1 false, label %cleanup, label %cleanup.loopexit

cleanup.loopexit:                                 ; preds = %for.body
  br label %cleanup

cleanup:                                          ; preds = %cleanup.loopexit, %for.body
  br i1 false, label %for.body, label %cleanup7.loopexit

cleanup7.loopexit:                                ; preds = %cleanup
  %retval.2.lcssa.ph = phi ptr [ %retval.2.peel, %cleanup ]
  br label %cleanup7

cleanup7:                                         ; preds = %cleanup7.loopexit, %cleanup.peel
  %retval.2.lcssa = phi ptr [ %retval.2.peel, %cleanup.peel ], [ %retval.2.lcssa.ph, %cleanup7.loopexit ]
  ret ptr %retval.2.lcssa
}
```
1. `simplifyInstruction(%retval.2.peel)` returns `@GlobIntONE`. Thus,
`ScalarEvolution::createNodeForPHI` returns SCEV expr `@GlobIntONE` for
`%retval.2.peel`.
2. `SimplifyIndvar::replaceIVUserWithLoopInvariant` tries to replace the
use of `%retval.2.peel` in `%retval.2.lcssa.ph` with `@GlobIntONE`.
3. `simplifyLoopAfterUnroll -> simplifyLoopIVs -> SCEVExpander::expand`
reuses `%retval.2.peel = phi ptr [ undef, %for.body.peel ], [
@GlobIntONE, %cleanup.loopexit.peel ]` to generate code for
`@GlobIntONE`. This is incorrect: the phi may evaluate to `undef`, so it
is not a valid substitute for `@GlobIntONE`.

This patch disallows simplifying `phi(undef, X)` to `X` by setting
`CanUseUndef` to false.
Closes https://github.com/llvm/llvm-project/issues/114879.
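
As a rough illustration of the `CanUseUndef` switch, here is a toy model
of the phi simplification rule. It is a sketch, not LLVM's code:
`simplifyPhi` and the string encoding of values are hypothetical, and an
empty result stands for "no simplification".

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy model: incoming phi values are strings and "undef" marks an undef
// incoming value. Returns the single value the phi folds to, or an empty
// string when the phi must be kept.
std::string simplifyPhi(const std::vector<std::string> &Incoming,
                        bool CanUseUndef) {
  assert(!Incoming.empty() && "a phi needs at least one incoming value");
  std::string Common;
  for (const std::string &V : Incoming) {
    // Only when undef may be exploited is it skipped as "matches anything".
    if (CanUseUndef && V == "undef")
      continue;
    if (!Common.empty() && Common != V)
      return ""; // two distinct values: keep the phi
    Common = V;
  }
  // Every incoming value was a skipped undef.
  return Common.empty() ? "undef" : Common;
}
```

With `CanUseUndef` set, `phi(undef, @GlobIntONE)` folds to `@GlobIntONE`
in this model; with it cleared, `undef` counts as a distinct value and
the phi is kept.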
2024-11-07 15:53:51 +08:00
Pengcheng Wang
3850801ca5
[RISCV] Add vcpop.m/vfirst.m to RISCVMaskedPseudosTable
We seem to have forgotten these two instructions.

Reviewers: preames, frasercrmck, lukel97, topperc

Reviewed By: lukel97

Pull Request: https://github.com/llvm/llvm-project/pull/115162
2024-11-07 15:41:46 +08:00
Younan Zhang
adb0d8ddce
[Clang] Distinguish expanding-pack-in-place cases for SubstTemplateTypeParmTypes (#114220)
In 50e5411e4, we preserved the pack substitution index within
SubstTemplateTypeParmType nodes and performed in-place expansions of
packs such that type constraints on a lambda that serve as a pattern of
a fold expression could be evaluated if the type constraints contain any
packs that are expanded by the fold expression.

However, we made an incorrect assumption of the condition under which
in-place expansion should occur. For example, a SizeOfPackExpr case
relies on SubstTemplateTypeParmType nodes being transformed to
SubstTemplateTypeParmPackTypes rather than expanding them immediately in
place.

This fixes that by adding a flag to SubstTemplateTypeParmType to
discriminate such in-place expansion situations.

Fixes https://github.com/llvm/llvm-project/issues/113518
2024-11-07 15:37:14 +08:00
Haojian Wu
9f796159f2
Add clang::lifetimebound annotation to llvm::function_ref (#115019)
This helps catch dangling llvm::function_ref references, see #114950,
#114949, #114808, #114789
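
A minimal sketch of how such an annotation can be attached to a
non-owning callable reference. `IntFnRef` and `SKETCH_LIFETIMEBOUND` are
hypothetical stand-ins, not the real `llvm::function_ref` or LLVM's
macro; the attribute itself is Clang's `[[clang::lifetimebound]]`.

```cpp
#include <cassert>

// Expands to Clang's attribute where available and to nothing elsewhere.
#if defined(__clang__)
#define SKETCH_LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define SKETCH_LIFETIMEBOUND
#endif

// Hypothetical, stripped-down stand-in for a function_ref<int(int)>.
// It stores a pointer to the callable but does not own it.
class IntFnRef {
  int (*Thunk)(const void *, int) = nullptr;
  const void *Callable = nullptr;

public:
  // The annotation tells Clang that the new object refers to Fn, so
  // binding a named IntFnRef to a temporary callable can be diagnosed.
  template <typename F>
  IntFnRef(const F &Fn SKETCH_LIFETIMEBOUND)
      : Thunk([](const void *C, int X) -> int {
          return (*static_cast<const F *>(C))(X);
        }),
        Callable(&Fn) {}

  int operator()(int X) const {
    assert(Thunk && "calling an empty reference");
    return Thunk(Callable, X);
  }
};

// Passing a temporary callable as an argument is fine: it outlives the call.
int applyTwice(IntFnRef Fn, int X) { return Fn(Fn(X)); }
```

Storing the reference in a named variable initialized from a temporary
lambda is the dangling pattern the annotation is meant to flag; passing
the lambda directly to `applyTwice` is not.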
2024-11-07 08:24:54 +01:00
Luke Lau
343a810725
[RISCV] Allow f16/bf16 with zvfhmin/zvfbfmin as legal strided access (#115264)
This is also split off from the zvfhmin/zvfbfmin
isLegalElementTypeForRVV work.

Enabling this will cause SLP and RISCVGatherScatterLowering to emit
@llvm.experimental.vp.strided.{load,store} intrinsics, and codegen
support for this was added in #109387 and #114750.
2024-11-07 14:40:15 +08:00
Fangrui Song
9b058bb42d [ELF] Replace errorOrWarn(...) with Err 2024-11-06 22:33:51 -08:00
Fangrui Song
f8bae3af74 [ELF] Replace warn(...) with Warn 2024-11-06 22:19:31 -08:00
Fangrui Song
09c2c5e1e9 [ELF] Replace error(...) with ErrAlways or Err
Most are migrated to ErrAlways mechanically.
In the future we should change most to Err.
2024-11-06 22:04:52 -08:00
Fangrui Song
63c6fe4a0b [ELF] Replace fatal(...) with Fatal or Err 2024-11-06 21:17:26 -08:00
vporpo
f7ef7b2ff7
[SandboxVec][Scheduler] Implement rescheduling (#115220)
This patch adds support for re-scheduling already scheduled
instructions. For now this will clear and rebuild the DAG, and will
reschedule the code using the new DAG.
2024-11-06 20:59:49 -08:00
Jeffrey Byrnes
ae6dbed594
[AMDGPU] Use correct DWord for v_dot4 S0 operand (#115224)
Fixes a copy-paste typo.

The typo resulted in producing bad v_perm based operands for the v_dot4
combine. When adding a corresponding byte pair to the v_dot byte pair
chains, we must take note of the byte position in the corresponding
source nodes. These byte positions are used to ensure we extract the
correct DWord from the ultimate source, and formulate a correct
perm_mask from the extracted DWord.

With the typo, the S0 byte would use the DWord offset for the
corresponding S1 byte. If this offset was not the same as the true DWord
offset for the S0 byte, we would extract and use the wrong byte for S0
in the v_dot.

Fixes https://github.com/llvm/llvm-project/issues/112941
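
The effect of the typo can be sketched with a toy model. This is not the
AMDGPU combine itself: the names are hypothetical, and bytes are assumed
to be numbered from the least significant byte of DWord 0.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Position of one byte inside a source value made of 32-bit DWords.
struct ByteLoc {
  unsigned DWordIdx;    // which DWord of the source holds the byte
  unsigned ByteInDWord; // 0 = least significant byte of that DWord
};

ByteLoc locateByte(unsigned BytePos) { return {BytePos / 4, BytePos % 4}; }

std::uint8_t extractByte(const std::vector<std::uint32_t> &Src,
                         unsigned DWordIdx, unsigned ByteInDWord) {
  assert(DWordIdx < Src.size() && ByteInDWord < 4);
  return static_cast<std::uint8_t>((Src[DWordIdx] >> (8 * ByteInDWord)) & 0xFFu);
}

// Picks the S0 byte. With WithTypo set, the DWord index is taken from the
// S1 byte's position instead of the S0 byte's own position.
std::uint8_t pickS0Byte(const std::vector<std::uint32_t> &Src,
                        unsigned S0BytePos, unsigned S1BytePos,
                        bool WithTypo) {
  ByteLoc S0 = locateByte(S0BytePos);
  ByteLoc S1 = locateByte(S1BytePos);
  unsigned DWordIdx = WithTypo ? S1.DWordIdx : S0.DWordIdx;
  return extractByte(Src, DWordIdx, S0.ByteInDWord);
}
```

In this model the result only differs when the S0 and S1 bytes live in
different DWords, which matches the condition described above.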
2024-11-06 20:48:20 -08:00
Luke Lau
f0e2301b7c
[RISCV] Allow f16/bf16 with zvfhmin/zvfbfmin as legal interleaved access (#115257)
This is another piece split off from the work to add zvfhmin/zvfbfmin to
isLegalElementTypeForRVV.
This is needed to get InterleavedAccessPass to lower [de]interleaves to
segment load/stores.
2024-11-07 12:35:59 +08:00
Luke Lau
481ff22b8b
[RISCV] Lower fixed-length vp_{gather,scatter} for zvfhmin/zvfbfmin (#115253)
This uses the same lowering as masked gathers and scatters.
2024-11-07 12:28:13 +08:00
Sergei Barannikov
3bdd71137e
[TableGen][GISel] Extract helper function for constraining operands (#115148)
As a side effect, this fixes COPY_TO_REGCLASS not being constrained
if it is not top-level (the reason for changes in tests).
2024-11-07 07:16:54 +03:00
Craig Topper
da032b7903
[RISCV][GISel] Use maskedValueIsZero in RISCVInstructionSelector::selectZExtBits. (#115244) 2024-11-06 20:14:24 -08:00
Han-Kuan Chen
c6091cdbed
[SLP][REVEC] Make shufflevector can be vectorized with ReorderIndices and ReuseShuffleIndices. (#114965) 2024-11-07 11:04:34 +08:00
Luke Lau
70bc12e77f [RISCV] Remove unnecessary scalar extensions from test. NFC
Now that f16 and bf16 aren't being scalarized we don't need
zfhmin/zfbfmin.
2024-11-07 10:54:02 +08:00
Richard Smith
de18fa1ace
Don't redundantly specify the default template argument to BumpPtrAllocatorImpl (#114857) 2024-11-06 18:45:27 -08:00
Luke Lau
05f87b2d65
[RISCV] Lower fixed-length mload/mstore for zvfhmin/zvfbfmin (#115145)
This is the same idea as #114945.
2024-11-07 10:41:03 +08:00
Luke Lau
7cb66772e2 [RISCV] Rework fixed-length masked load/store tests. NFC
Pass in the mask and vector directly as arguments, and add tests for
zvfhmin and zvfbfmin.
2024-11-07 10:38:21 +08:00
Diego Caballero
af5c471a4d
[mlir][Vector] Add vector.extract(vector.shuffle) folder (#115105)
This PR adds a folder for extracting an element from a vector shuffle.
It turns something like:

```
   %shuffle = vector.shuffle %a, %b [0, 8, 7, 15]
     : vector<8xf32>, vector<8xf32>
   %extract = vector.extract %shuffle[3] : f32 from vector<4xf32>
```

into:

```
   %extract = vector.extract %b[7] : f32 from vector<8xf32>
```
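
The index arithmetic behind the fold can be sketched as follows. This is
a standalone model with hypothetical names, not the MLIR folder, and it
assumes a 1-D shuffle whose mask entries index the concatenation of the
two operands.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class ShuffleSource { Lhs, Rhs };

// Result of looking through the shuffle: which operand to read, and where.
struct FoldedExtract {
  ShuffleSource Source;
  std::int64_t Index;
};

// Mask entries below LhsSize select from the first operand; the remaining
// entries select from the second operand at (entry - LhsSize).
FoldedExtract foldExtractOfShuffle(const std::vector<std::int64_t> &Mask,
                                   std::int64_t LhsSize,
                                   std::size_t ExtractPos) {
  assert(ExtractPos < Mask.size() && "extract position out of range");
  std::int64_t Selected = Mask[ExtractPos];
  assert(Selected >= 0 && "sketch does not model poison mask entries");
  if (Selected < LhsSize)
    return {ShuffleSource::Lhs, Selected};
  return {ShuffleSource::Rhs, Selected - LhsSize};
}
```

For the mask `[0, 8, 7, 15]` with 8-element operands, position 3 selects
entry 15, which maps to element 7 of the second operand, as in the
example above.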
2024-11-06 18:17:12 -08:00
Valentin Clement (バレンタイン クレメン)
30d80009e5
[flang][cuda] Allow SHARED actual to DEVICE dummy (#115215)
Update the compatibility rules to allow a SHARED actual argument to be
passed to a DEVICE dummy argument. Emit a warning in that case.
2024-11-06 17:45:58 -08:00
Matt Arsenault
29a5c054e6
ValueTracking: Allow getUnderlyingObject to look at vectors (#114311)
We can identify some easy vector of pointer cases, such as
a getelementptr with a scalar base.
2024-11-06 17:14:44 -08:00
Craig Topper
7c82875866
[GISel][RISCV][AMDGPU] Add G_SHL, G_LSHR, G_ASHR to binop_left_to_zero. (#115089)
Shifting 0 by any amount is still zero.
2024-11-06 17:03:04 -08:00
Konstantin Schwarz
cbfe87c253
[GlobalISel] Remove references to rhs of shufflevector if rhs is undef (#115076) 2024-11-06 16:36:13 -08:00
Kazu Hirata
5348a30a58
[ExecutionEngine] Simplify code with DenseMap::operator[] (NFC) (#115115) 2024-11-06 16:33:34 -08:00
Kazu Hirata
84745da74c [Analysis] Fix a warning (NFC)
This patch fixes:

  third-party/unittest/googletest/include/gtest/gtest.h:1379:11:
  error: comparison of integers of different signs: 'const unsigned
  int' and 'const int' [-Werror,-Wsign-compare]
2024-11-06 16:26:27 -08:00
Ryan Mansfield
bbc3af0577
[dsymutil] Add missing newlines in error messages. (#115191)
Errors like "cannot create bundle: Not a directory" or "error:
a.out.dSYM: Is a directory" were being emitted without a newline.
2024-11-06 15:54:47 -08:00
Yingwei Zheng
cacbe71af7
[Analysis] Avoid running transform passes that have just been run (#112092)
This patch adds a new analysis pass to track a set of passes and their
parameters to see if we can avoid running transform passes that have
just been run. The current implementation only skips redundant
InstCombine runs. I will add support for other passes in follow-up
patches.

RFC link:
https://discourse.llvm.org/t/rfc-pipeline-avoid-running-transform-passes-that-have-just-been-run/82467

Compile time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=76007138f4ffd4e0f510d12b5e8cad529c21f24d&to=64134cf07ea7eb39c60320087c0c5afdc16c3a2b&stat=instructions%3Au
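
The bookkeeping idea can be sketched with a toy tracker. This is not the
analysis added by the patch: `LastRunTracker` and `countExecuted` are
hypothetical, and the model assumes that a pass which has just run would
not change the IR again if rerun immediately with the same parameters.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Remembers which (pass, parameters) pairs have run since the IR last changed.
class LastRunTracker {
  std::set<std::pair<std::string, std::string>> RanSinceLastChange;

public:
  bool shouldSkip(const std::string &Pass, const std::string &Params) const {
    return RanSinceLastChange.count(std::make_pair(Pass, Params)) != 0;
  }

  void update(const std::string &Pass, const std::string &Params,
              bool ChangedIR) {
    if (ChangedIR)
      RanSinceLastChange.clear(); // earlier results no longer apply
    RanSinceLastChange.emplace(Pass, Params);
  }
};

// Runs a pipeline of parameterless passes and counts how many execute,
// assuming (for illustration only) that every executed pass changes the IR.
unsigned countExecuted(const std::vector<std::string> &Pipeline) {
  LastRunTracker Tracker;
  unsigned Executed = 0;
  for (const std::string &Pass : Pipeline) {
    if (Tracker.shouldSkip(Pass, ""))
      continue;
    ++Executed;
    Tracker.update(Pass, "", /*ChangedIR=*/true);
  }
  return Executed;
}
```

In this model a back-to-back repeat of the same pass is skipped, while
the same pass after an intervening IR-changing pass runs again.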
2024-11-07 07:52:14 +08:00
Augusto Noronha
f6617d65e4
[DebugInfo] Add num_extra_inhabitants to debug info (#112590)
An extra inhabitant is a bit pattern that does not represent a valid
value for instances of a given type. The number of extra inhabitants is
the number of those bit configurations.

This is used by Swift to save space when composing types. For example,
because Bool only needs 2 bit patterns to represent all of its values
(true and false), an Optional<Bool> only occupies 1 byte in memory by
using a bit configuration that is unused by Bool. Which bit patterns are
unused is part of the ABI of the language.

Since Swift generics are not monomorphized, by using dynamic libraries
you can have generic types whose size, alignment, etc, are known only
at runtime (which is why this feature is needed).

This patch adds num_extra_inhabitants to LLVM-IR debug info and in DWARF
as an Apple extension.
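
To make the idea concrete, here is a small sketch of an optional boolean
stored in a single byte. It is an illustration only: the choice of the
value 2 for "none" is an assumption of this sketch, not Swift's ABI, and
all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Number of bit patterns in StorageBits of storage that no valid value uses.
std::uint64_t numExtraInhabitants(unsigned StorageBits,
                                  std::uint64_t NumValidValues) {
  assert(StorageBits < 64 && "sketch only handles storage under 64 bits");
  std::uint64_t NumPatterns = std::uint64_t{1} << StorageBits;
  assert(NumValidValues <= NumPatterns);
  return NumPatterns - NumValidValues;
}

// Bit pattern reserved for "no value"; an arbitrary choice for this sketch.
constexpr std::uint8_t NoneBits = 2;

// An optional bool that needs no separate "has value" flag because it
// reuses a byte pattern that bool itself never produces.
class PackedOptionalBool {
  std::uint8_t Bits;

public:
  PackedOptionalBool() : Bits(NoneBits) {}
  explicit PackedOptionalBool(bool V) : Bits(V ? 1 : 0) {}

  bool hasValue() const { return Bits != NoneBits; }
  bool value() const {
    assert(hasValue() && "reading an empty optional");
    return Bits == 1;
  }
};

static_assert(sizeof(PackedOptionalBool) == 1, "fits in a single byte");
```

A debugger that knows how many patterns are unused can work out how a
wrapper such as this one is laid out even when the wrapped type is only
known at run time.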
2024-11-06 15:48:04 -08:00
Jonas Devlieghere
bd3a3959dc
[lldb] Fix deprecated defines in debugserver (XROS -> VISIONOS) (NFC) 2024-11-06 15:16:20 -08:00
Alexander Richardson
d08772b151
Revert "[libc++abi] Stop copying headers to the build directory" (#115232)
Reverts llvm/llvm-project#115086

2-stage sanitizer build is not happy:
https://lab.llvm.org/buildbot/#/builders/25/builds/3915
2024-11-06 15:07:30 -08:00
Min-Yih Hsu
7ef7c0d036
[RISCV] Refine vector division latencies in SiFive P600's scheduling model (#115038)
For both vector integer and floating point divisions.

Co-authored-by: Yeting Kuo <yeting.kuo@sifive.com>
2024-11-06 14:53:42 -08:00
Jan Svoboda
a6637ae2cc
[clang][deps] Share FileManager between modules (#115065)
The `FileManager` sharing between module-building `CompilerInstance`s
was disabled a while ago due to `FileEntry::getName()` being unreliable.
Now that we use `FileEntryRef::getNameAsRequested()` in places where it
matters, re-enabling `FileManager` sharing is sound and improves performance of
`clang-scan-deps` by ~6.2%.
2024-11-06 14:21:01 -08:00
Pranav Kant
df0a56cdd9
[bazel] Fix AMXDialect (#115221) 2024-11-06 14:19:16 -08:00
Martin Storsjö
87f4bc0aca
[compiler-rt] [fuzzer] Skip trying to set the thread name on MinGW (#115167)
Since b4130bee6bfd34d8045f02fc9f951bcb5db9d85c, we check for
_LIBCPP_HAS_THREAD_API_PTHREAD to decide between using
SetThreadDescription or pthread_setname_np for setting the thread name.

c6f3b7bcd0596d30f8dabecdfb9e44f9a07b6e4c changed how libcxx defines
their configuration macros - now they are always defined, but defined to
0 or 1, while they previously were either defined or undefined.

As these libcxx defines used to be defined to an empty string (rather
than expanding to 1) if enabled, we can't easily produce an expression
that works both with older and newer libcxx. Additionally, these defines
are libcxx internal config macros, a detail that isn't supported and
isn't meant to be relied upon.

Simply skip trying to set thread name on MinGW as we can't easily know
which kind of thread native handle we have. Setting the thread name is
only a nice-to-have quality-of-life improvement; things should work
the same even without it.

Additionally, libfuzzer isn't generally usable on MinGW targets yet
(Clang doesn't include it in the getSupportedSanitizers() method for the
MinGW target), so this shouldn't make any difference in practice anyway.
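
The difference between the two macro conventions can be sketched as
below. The macro names are hypothetical and are not real libcxx
configuration macros.

```cpp
#include <cassert>

// Old convention: the macro is defined (to nothing) when the feature is on.
#define SKETCH_OLD_STYLE_ENABLED
// New convention: the macro is always defined, to 0 when the feature is off.
#define SKETCH_NEW_STYLE_DISABLED 0

// An #ifdef check is right for the old convention...
#ifdef SKETCH_OLD_STYLE_ENABLED
constexpr bool OldStyleSeenByIfdef = true;
#else
constexpr bool OldStyleSeenByIfdef = false;
#endif

// ...but reports "enabled" for a new-style macro that is defined to 0.
#ifdef SKETCH_NEW_STYLE_DISABLED
constexpr bool NewStyleSeenByIfdef = true; // misleading result
#else
constexpr bool NewStyleSeenByIfdef = false;
#endif

// A value check is right for the new convention.
#if SKETCH_NEW_STYLE_DISABLED
constexpr bool NewStyleSeenByIf = true;
#else
constexpr bool NewStyleSeenByIf = false;
#endif

// Writing `#if SKETCH_OLD_STYLE_ENABLED` would not compile: the macro
// expands to nothing, which leaves `#if` without an expression.

void checkMacroConventions() {
  assert(OldStyleSeenByIfdef);
  assert(NewStyleSeenByIfdef);
  assert(!NewStyleSeenByIf);
}
```

Neither form of check gives the right answer for both conventions, which
is why the patch avoids depending on the macro at all.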
2024-11-07 00:18:57 +02:00
Craig Topper
21ded66dba [RISCV][GISel] Add zexti8 ComplexPattern. 2024-11-06 13:58:25 -08:00
Gang Chen
f85be26a67
[AMDGPU] fix build error unused-var (#115199) 2024-11-06 13:46:01 -08:00
Pranav Kant
ff533b94b7
[bazel] Add dep to BuiltinDialectTdFiles (#115217) 2024-11-06 13:33:13 -08:00
vporpo
5942a99f8b
[SandboxVec] Notify scheduler about new instructions (#115102)
This patch registers the "createInstr" callback that notifies the
scheduler about newly created instructions. This guarantees that all
newly created instructions have a corresponding DAG node associated with
them. Without this the pass crashes when the scheduler encounters the
newly created vector instructions.

This patch also changes the lifetime of the sandboxir Ctx variable in
the SandboxVectorizer pass. It needs to be destroyed after the passes
get destroyed. Without this change when components like the Scheduler
get destroyed Ctx will have already been freed, which is not legal.
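
The lifetime constraint can be sketched with plain C++ member ordering.
This is not the SandboxVectorizer code: all types here are hypothetical,
and the sketch only relies on the language rule that members are
destroyed in reverse order of declaration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the context; logs its own destruction.
struct Context {
  std::vector<std::string> &Log;
  explicit Context(std::vector<std::string> &L) : Log(L) {}
  ~Context() { Log.push_back("Context"); }
};

// Stand-in for a component that still uses the context while shutting down.
struct Scheduler {
  Context &Ctx;
  explicit Scheduler(Context &C) : Ctx(C) {}
  ~Scheduler() { Ctx.Log.push_back("Scheduler"); } // Ctx must still be alive
};

struct VectorizerLike {
  Context Ctx;     // declared first, so destroyed last
  Scheduler Sched; // declared last, so destroyed first
  explicit VectorizerLike(std::vector<std::string> &Log)
      : Ctx(Log), Sched(Ctx) {}
};

// Builds and destroys one VectorizerLike and reports the destruction order.
std::vector<std::string> destructionOrder() {
  std::vector<std::string> Log;
  {
    VectorizerLike V(Log);
    (void)V;
  }
  return Log;
}
```

Swapping the two member declarations would destroy the context first and
leave the scheduler's destructor touching a dead object, which is the
kind of ordering problem described above.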
2024-11-06 13:26:14 -08:00
Valentin Clement (バレンタイン クレメン)
a878dc8fb3
[flang][cuda] Do not emit warning for SHARED variable in device subprogram (#115195)
The SHARED attribute is explicitly meant to be used in device subprograms
(https://docs.nvidia.com/hpc-sdk/compilers/cuda-fortran-prog-guide/index.html#cfpg-var-qual-attr-shared).

Do not emit a warning.
2024-11-06 13:19:09 -08:00
Jan Svoboda
304c412173 [clang][serialization] Reduce ASTWriter::writeUnhashedControlBlock() scope 2024-11-06 12:54:01 -08:00
Thurston Dang
4a6d13bf4d Remove unused variable to fix '[AMDGPU] modify named barrier builtins and intrinsics (#114550)'
https://github.com/llvm/llvm-project/pull/114550 caused a buildbot breakage (https://lab.llvm.org/buildbot/#/builders/66/builds/5853) because of an unused variable. This patch attempts to fix forward:

/home/b/sanitizer-x86_64-linux/build/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUMachineFunction.cpp:106:24: error: variable 'TTy' set but not used [-Werror,-Wunused-but-set-variable]
106 |     if (TargetExtType *TTy = AMDGPU::isNamedBarrier(GV)) {
    |                        ^
2024-11-06 20:49:39 +00:00
Andrzej Warzyński
e9bafa35d2
[mlir][tensor] Generalize/restrict GeneralizeOuterUnitDimsPackOpPattern (#114315)
This PR *restricts* `GeneralizeOuterUnitDimsPackOpPattern` to follow its
intended purpose (as per the documentation), which is to:

  > require all outer dimensions of tensor.pack to be 1.

There was one in-tree test that violated this assumption (and happened
to work) – see `@simple_KCRS_to_KRSCsr` in
"generalize-tensor-pack.mlir". That test has been updated to satisfy the
new requirements of the pattern.

By enforcing the pattern to follow its intended design (i.e., making it
stricter), the calculation of shapes and sizes for various Ops that the
pattern generates (PadOp, ExtractSliceOp, EmptyOp, TensorOp, and
InsertSliceOp) becomes much simpler and easier to document. This also
helped *generalize* the pattern to support cases like the one below:

```mlir
func.func @simple_pad_and_pack_dynamic_tile_cst(
    %src: tensor<5x1xf32>,
    %dest: tensor<1x1x?x2xf32>,
    %pad: f32) -> tensor<1x1x?x2xf32> {

  %tile_dim_0 = arith.constant 8 : index
  %0 = tensor.pack %src
    padding_value(%pad : f32)
    inner_dims_pos = [0, 1]
    inner_tiles = [%tile_dim_0, 2]
    into %dest : tensor<5x1xf32> -> tensor<1x1x?x2xf32>

  return %0 : tensor<1x1x?x2xf32>
}
```

Note that the inner tile slice is dynamic but compile-time constant.
`getPackOpSourceOrPaddedSource`, which is used to generate PadOp,
detects this and generates a PadOp with static shapes. This is a good
optimization, but it means that all shapes/sizes for Ops generated by
`GeneralizeOuterUnitDimsPackOpPattern` also need to be updated to be
constant/static. By restricting the pattern and simplifying the
size/shape calculation, supporting the case above becomes much easier.

Notable implementation changes:

* PadOp processes the original source (no change in dimensions/rank).
  ExtractSliceOp extracts the tile to pack and may reduce the rank. All
  following ops work on the tile extracted by ExtractSliceOp (possibly
  rank-reduced).
* All shape/size calculations assume that trailing dimensions match
  inner_tiles from tensor.pack. All leading dimensions (i.e., outer
  dimensions) are assumed to be 1.
* Dynamic sizes for ops like ExtractSliceOp are taken from inner_tiles
  rather than computed as, for example, tensor.dim %dest, 2. It’s the
  responsibility of the "producers" of tensor.pack to ensure that
  dimensions in %dest match the specified tile sizes.
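
The shape arithmetic behind the unit-outer-dims requirement can be
sketched as below. This is a simplified model with hypothetical helper
names, not the pattern's implementation; it assumes static sizes, an
identity `inner_dims_pos` that tiles every source dimension, and no
outer-dimension permutation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

std::int64_t ceilDiv(std::int64_t A, std::int64_t B) {
  assert(A >= 0 && B > 0);
  return (A + B - 1) / B;
}

// Outer dimensions of the packed result: tiles needed along each dimension.
std::vector<std::int64_t> outerDims(const std::vector<std::int64_t> &Src,
                                    const std::vector<std::int64_t> &Tiles) {
  assert(Src.size() == Tiles.size());
  std::vector<std::int64_t> Outer;
  for (std::size_t I = 0; I < Src.size(); ++I)
    Outer.push_back(ceilDiv(Src[I], Tiles[I]));
  return Outer;
}

// The restricted pattern only applies when every outer dimension is 1.
bool hasAllUnitOuterDims(const std::vector<std::int64_t> &Src,
                         const std::vector<std::int64_t> &Tiles) {
  for (std::int64_t D : outerDims(Src, Tiles))
    if (D != 1)
      return false;
  return true;
}

// Source shape after padding each dimension up to a whole number of tiles.
std::vector<std::int64_t>
paddedSourceShape(const std::vector<std::int64_t> &Src,
                  const std::vector<std::int64_t> &Tiles) {
  std::vector<std::int64_t> Outer = outerDims(Src, Tiles);
  std::vector<std::int64_t> Padded;
  for (std::size_t I = 0; I < Src.size(); ++I)
    Padded.push_back(Outer[I] * Tiles[I]);
  return Padded;
}
```

For the `5x1` source with tiles `[8, 2]` from the example above, the
outer dimensions are `1x1` and the padded source is `8x2`; a `16x1`
source with the same tiles would have outer dimensions `2x1` and fall
outside the restricted pattern.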
2024-11-06 20:42:47 +00:00
Jan Svoboda
0276621f8f [clang][serialization] Reduce ASTWriter::WriteControlBlock() scope 2024-11-06 12:36:46 -08:00
Jan Svoboda
bcb64e1317 [clang][serialization] Reduce ASTWriter::WriteSourceManagerBlock() scope 2024-11-06 12:34:24 -08:00
Craig Topper
b57cbbcb6a [RISCV][GISel] Improve fptos/ui and s/uitofp handling and testing.
Replace clampScalar of the integer type with minScalar. We can't
narrow the integer type; we can only make it larger. If the type
is larger than XLen we need to use a 2*XLen libcall. If it's larger
than 2*XLen we can't handle it at all.
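
One coarse reading of these rules, written as a classification over the
integer width. This is a sketch with hypothetical names, not the
legalizer rules from the patch.

```cpp
#include <cassert>

enum class IntConvAction {
  WidenToXLen, // narrower than XLen: the integer type is made larger
  Native,      // exactly XLen: handled directly
  LibCall,     // wider than XLen, up to 2*XLen: lowered to a libcall
  Unsupported  // wider than 2*XLen: not handled
};

// Classifies the integer side of an int<->fp conversion by its bit width.
IntConvAction classifyIntWidth(unsigned Bits, unsigned XLen) {
  assert(Bits > 0 && XLen > 0);
  if (Bits < XLen)
    return IntConvAction::WidenToXLen;
  if (Bits == XLen)
    return IntConvAction::Native;
  if (Bits <= 2 * XLen)
    return IntConvAction::LibCall;
  return IntConvAction::Unsupported;
}
```

Under this reading a 64-bit integer is native on RV64 but needs a
libcall on RV32, and nothing is ever narrowed.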
2024-11-06 12:18:56 -08:00
Justin Fargnoli
375d1925db
Revert "[NVPTX] Emit prmt selection value in hex" (#115204)
Reverts llvm/llvm-project#115049
2024-11-06 12:10:49 -08:00
Kazu Hirata
ccf5d624f9 [AMDGPU] Fix a warning
This patch fixes:

  llvm/lib/Target/AMDGPU/AMDGPULowerModuleLDSPass.cpp:1031:17: error:
  unused variable 'F' [-Werror,-Wunused-variable]
2024-11-06 12:08:27 -08:00
Alexander Richardson
5be02d7a03
[libc++abi] Stop copying headers to the build directory
This was needed before https://github.com/llvm/llvm-project/pull/115077
since the compiler-rt test build made assumptions about the build
layout of libc++ and libc++abi, but now they link against a local
installation of these libraries so we no longer need this workaround.

Reviewed By: ldionne

Pull Request: https://github.com/llvm/llvm-project/pull/115086
2024-11-06 11:59:37 -08:00