519376 Commits

Author SHA1 Message Date
Joseph Huber
b4d49fb52e
[libc] Remove RPC server API and use the header directly (#117075)
Summary:
This patch removes much of the `llvmlibc_rpc_server` interface. This
pretty much deletes all of this code and just replaces it with including
`rpc.h` directly. We still maintain the file to let `libc` handle the
opcodes, since those depend on the `printf` implementation.

This will need to be cleaned up more, but I don't want to put too much
into a single patch.
2024-11-25 07:13:28 -06:00
Paschalis Mpeis
957c2ac4f1
[BOLT] Fix for bughunter.sh in offline mode (#116649)
In offline mode, the script sets 'PASS' variable and does not use it.
Surrounding code suggests using 'FAIL' variable instead.
2024-11-25 13:13:10 +00:00
Paschalis Mpeis
4b71b3782d
[BOLT] DataAggregator support for binaries with multiple text segments (#92815)
When a binary has multiple text segments, the Size is computed as the
difference of the last address of these segments from the BaseAddress.
The base addresses of all text segments must be the same.

Introduces flag 'perf-script-events' for testing. It allows passing perf events
without BOLT having to parse them using 'perf script'. The flag is used to
pass a mock perf profile that has two memory mappings for a mock binary
that has two text segments. The size of the mapping is updated because, with
this change, `parseMMapEvents` processes all text segments.
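
A minimal sketch of the size computation described above, under one reading of
"the last address": the highest end address among the text segments. The
`TextSegment` struct and `mappingSize` function are hypothetical names for
illustration and are not BOLT's actual types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record for one text segment; not BOLT's actual type.
struct TextSegment {
  std::uint64_t Address; // start address of the segment
  std::uint64_t Size;    // size of the segment in bytes
};

// Returns the distance from BaseAddress to the highest end address among
// the segments (this sketch's reading of "the last address").
std::uint64_t mappingSize(std::uint64_t BaseAddress,
                          const std::vector<TextSegment> &Segments) {
  std::uint64_t End = BaseAddress;
  for (const TextSegment &S : Segments) {
    assert(S.Address >= BaseAddress && "segment starts below the base");
    End = std::max(End, S.Address + S.Size);
  }
  return End - BaseAddress;
}
```

With a base of 0x400000 and segments at 0x400000 (size 0x1000) and 0x600000
(size 0x2000), the sketch yields 0x202000.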
2024-11-25 13:12:43 +00:00
Igor Kirillov
b5a11d378d
[SelectOpt] Refactor to prepare for support more select-like operations (#115745)
* Enables conversion of several select-like instructions within one
group
* Any number of auxiliary instructions depending on the same condition
can be in between select-like instructions
* After splitting the basic block, move select-like instructions into
the relevant basic blocks and optimise them
* Make it easier to add support for shift-based select-like instructions and
also any mixture of zext/sext/not instructions
2024-11-25 12:59:09 +00:00
Jefferson Le Quellec
8c5a3a97c0
[mlir][docs] Update MLIR's PatternRewriter documentation (#116183)
This PR adds the missing `const override` to the `rewrite` and
`matchAndRewrite` declaration in the Pattern Rewriter documentation as
described here:

5cfa8baef3/mlir/include/mlir/IR/PatternMatch.h (L237-L265)
2024-11-25 13:32:57 +01:00
Jay Foad
535247841d
[TableGen] Remove comments from generated validateOperandClass (#117352)
This generated comments like:

  // 'BoolReg' class
  case MCK_BoolReg: {

which seem redundant because the name is always repeated on the next
line as part of the MCK_ enumerator.
2024-11-25 12:11:01 +00:00
Timm Baeder
ceaf6e912a
[clang][bytecode] Support ImplicitValueInitExpr for multi-dim arrays (#117312)
The attached test case from
https://github.com/llvm/llvm-project/issues/117294 used to cause an
assertion because we called classifyPrim() on an array type.

The new result doesn't crash but isn't exactly perfect either. Since the
problem arises when evaluating an ImplicitValueInitExpr, we have no
proper source location to point to. Point to the caller instead.
2024-11-25 12:15:31 +01:00
Nikita Popov
e477989a05
[InstCombine] Handle trunc i1 pattern in eq-of-parts fold (#112704)
Equality/inequality of the low bit can be represented by `(trunc (xor x,
y) to i1)`, possibly with an extra not. We have to handle this in the
eq-of-parts fold now that we no longer canonicalize this to a masked
icmp.

Proofs: https://alive2.llvm.org/ce/z/qidkzq

Fixes https://github.com/llvm/llvm-project/issues/110919.
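
A minimal sketch of the low-bit identity this fold relies on, written in C++
rather than IR. The function names are illustrative and are not LLVM APIs; the
check is exhaustive over 8-bit values only.

```cpp
#include <cassert>
#include <cstdint>

// Models `trunc (xor x, y) to i1`: only bit 0 of x ^ y survives.
bool truncXorToI1(std::uint8_t X, std::uint8_t Y) {
  return ((X ^ Y) & 1u) != 0;
}

// Compares the low bits of the two operands directly.
bool lowBitsDiffer(std::uint8_t X, std::uint8_t Y) {
  return (X & 1u) != (Y & 1u);
}

// Checks every pair of 8-bit values: the truncated xor is the inequality
// of the low bits, and its negation (the "extra not") is the equality.
void checkLowBitEquivalence() {
  for (unsigned X = 0; X < 256; ++X) {
    for (unsigned Y = 0; Y < 256; ++Y) {
      auto A = static_cast<std::uint8_t>(X);
      auto B = static_cast<std::uint8_t>(Y);
      assert(truncXorToI1(A, B) == lowBitsDiffer(A, B));
      assert(!truncXorToI1(A, B) == ((A & 1u) == (B & 1u)));
    }
  }
}
```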
2024-11-25 11:49:00 +01:00
ShashwathiNavada
612f8ec7ac
seq_cst is allowed in Flush since OpenMP 5.1. (#114072)
This PR adds support for the seq_cst (sequential consistency) clause for the
flush directive in OpenMP. The seq_cst clause enforces a stricter memory
ordering, ensuring that all threads observe the memory effects of the
flush in the same order, improving consistency in memory operations
across threads.

---------

Co-authored-by: Shashwathi N <nshashwa@pe28vega.hpc.amslabs.hpecorp.net>
Co-authored-by: CHANDRA GHALE <chandra.nitdgp@gmail.com>
2024-11-25 16:08:39 +05:30
Alex Voicu
48ec59c234
[llvm][AMDGPU] Fold llvm.amdgcn.wavefrontsize early (#114481)
Fold `llvm.amdgcn.wavefrontsize` early, during InstCombine, so that its
concrete value is used throughout subsequent optimisation passes.
2024-11-25 10:29:50 +00:00
Luke Lau
15fadeb2aa
[RISCV] Add cost for @llvm.experimental.vp.splat (#117313)
This is split off from #115274. There doesn't seem to be an easy way to
share this with getShuffleCost since that requires passing in a real
insert_element operand to get it to recognise it's a scalar splat.

For i1 vectors we can't currently lower them so it returns an invalid
cost.

---------

Co-authored-by: Shih-Po Hung <shihpo.hung@sifive.com>
2024-11-25 11:28:46 +01:00
Nikita Popov
f81f47e3ff [InstCombine] Add fptrunc of max test (NFC)
To guard against regression from #117182.
2024-11-25 10:56:24 +01:00
David Green
c537c75278 [AArch64][GlobalISel] Scalarize i128 vector sadd_sat/uadd_sat/etc.
As with other operations we scalarize any vectors with larger types to let the
scalar legalization kick in.
2024-11-25 09:55:46 +00:00
David Spickett
84fec7757e [lldb][docs] Clarify unit for SVE P register size 2024-11-25 09:53:15 +00:00
LiqinWeng
db14010405
[RISCV][TTI] Implement cost of intrinsic abs with LMUL (#115813) 2024-11-25 17:35:58 +08:00
Nikita Popov
321fe74795 [InstCombine] Add extra test for eq of parts fold (NFC)
To guard against regression from #112704.
2024-11-25 10:27:35 +01:00
David Sherwood
22ec44f509
[DAGCombiner] Add support for scalarising extracts of a vector setcc (#116031)
For IR like this:

  %icmp = icmp ult <4 x i32> %a, splat (i32 5)
  %res = extractelement <4 x i1> %icmp, i32 1

where there is only one use of %icmp we can take a similar approach
to what we already do for binary ops such as add, sub, etc. and convert
this into

  %ext = extractelement <4 x i32> %a, i32 1
  %res = icmp ult i32 %ext, 5

For AArch64 targets at least the scalar boolean result will almost
certainly need to be in a GPR anyway, since it will probably be
used by branches for control flow. I've tried to reuse existing code
in scalarizeExtractedBinop to also work for setcc.

NOTE: The optimisations don't apply for tests such as
extract_icmp_v4i32_splat_rhs in the file

CodeGen/AArch64/extract-vector-cmp.ll

because scalarizeExtractedBinOp only works if one of the input
operands is a constant.
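
A minimal sketch, in C++ rather than IR, of why the two forms above agree: the
extracted lane of a lane-wise compare equals a scalar compare of the extracted
lane. `Vec4` and both function names are illustrative only, and the constant 5
mirrors the splat operand in the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Vec4 = std::array<std::uint32_t, 4>;

// Vector form: compare every lane against 5, then extract one result.
bool extractOfVectorCompare(const Vec4 &A, std::size_t Lane) {
  assert(Lane < A.size());
  std::array<bool, 4> Cmp{};
  for (std::size_t I = 0; I < A.size(); ++I)
    Cmp[I] = A[I] < 5; // unsigned less-than, like `icmp ult`
  return Cmp[Lane];
}

// Scalar form: extract the lane first, then do a single compare.
bool compareOfExtract(const Vec4 &A, std::size_t Lane) {
  assert(Lane < A.size());
  return A[Lane] < 5;
}
```

The scalar form performs one comparison instead of four, which is the saving
the combine is after when the compare has a single use.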
2024-11-25 09:25:01 +00:00
Nikita Popov
e5faeb69fb
[InstCombine] Support reassoc for foldLogicOfFCmps (#116065)
We currently support simple reassociation for foldAndOrOfICmps().
Support the same for foldLogicOfFCmps() by going through the common
foldBooleanAndOr() helper.

This will also resolve the regression on #112704, which is also due to
missing reassoc support.

I had to adjust one fold to add support for FMF flag preservation,
otherwise there would be test regressions. There is a separate fold
(reassociateFCmps) handling reassociation for *just* that specific case
and it preserves FMF. Unfortunately it's not rendered entirely redundant
by this patch, because it handles one more level of reassociation as
well.
2024-11-25 10:21:38 +01:00
Piyou Chen
7d8d51ed34
Recommit "[TargetVersion] Only enable on RISC-V and AArch64" (#117110)" (#117128)
Keep InheritableAttr to avoid the warning `TypePrinter.cpp:1953:10:
warning: enumeration value ‘TargetVersion’ not handled in switch`

Original message:

[TargetVersion] Only enable on RISC-V and AArch64 (#115991) Address
#115000.

This patch constrains the target_version feature to work only on RISC-V
and AArch64 to prevent crashes in Clang.

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2024-11-25 17:10:50 +08:00
David Green
2b5e2d74d3 [AArch64][GlobalISel] Extend arm64-vshift.ll test coverage. NFC 2024-11-25 09:03:50 +00:00
Markus Böck
d35098bfa8
[mlir][LLVM][NFC] Move LLVMStructType to ODS (#117485)
This PR extracts NFC changes out of
https://github.com/llvm/llvm-project/pull/116035 to reap as many of the
same benefits without any of the semantic changes.

More concretely, moving `LLVMStructType` to ODS has the benefits of
being able to generate much of the required boilerplate, such as
interface definitions, documentation and more, automatically.
Furthermore, `LLVMStructType` is then treated less specially and its
definition can be found at the same place where all other complex type
definitions are found in the LLVM dialect.

Future changes could leverage more automatically generated code from
TableGen such as `assemblyFormat`. As these are not as trivial, they
have been left for future PRs.

---------

Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-11-25 10:02:53 +01:00
Nikita Popov
866755f8da
[LLVM] Update backend maintainer (#116622)
We currently list Evan Cheng as the fallback maintainer for the LLVM
backend. However, their last contribution dates back to 2014.

I'd like to nominate arsenm instead, who is our most active backend
reviewer.
2024-11-25 09:52:36 +01:00
Pavel Labath
1bc98957c8
[lldb/DWARF] Remove duplicate type filtering (#116989)
In #108907, the index classes started filtering the DIEs according to
the full type query (instead of just the base name). This means that the
checks in SymbolFileDWARF are now redundant.

I've also moved the non-redundant checks so that now all checking is
done in the DWARFIndex class and the caller can expect to get the final
filtered list of types.
2024-11-25 09:52:19 +01:00
Oliver Stannard
6512e488f6
[LLD][ARM] Allow R_ARM_SBREL32 relocations in debug info (#116956)
The R_ARM_SBREL32 relocation is used in debug info for ARM RWPI
(read-write position independent) code. Compiler-generated DWARF info
will use an expression to add the relocated value to the actual value of
the static base (held in r9) at run-time, so it should be relocated as
if the static base is at address 0.
2024-11-25 08:51:27 +00:00
Hans
55f5d68c2d
[win/asan] Recognize mov QWORD PTR [rip + X], reg (#117335)
This comes up when intercepting clang-built `__sanitizer_cov` functions.
2024-11-25 09:50:08 +01:00
Nikita Popov
3317c9ceac
[AMDGPU] Use getSignedConstant() where necessary (#117328)
Create signed constant using getSignedConstant(), to avoid future
assertion failures when we disable implicit truncation in getConstant().

This also touches some generic legalization code, which apparently only
AMDGPU tests.
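
A minimal sketch of why signedness matters once implicit truncation is
disallowed. The helpers `fitsUnsigned` and `fitsSigned` are hypothetical and
are not the SelectionDAG API; they only model whether a 64-bit value is
representable in a narrower integer type.

```cpp
#include <cassert>
#include <cstdint>

// True when V is representable as an unsigned integer of width Bits.
bool fitsUnsigned(unsigned Bits, std::uint64_t V) {
  assert(Bits >= 1 && Bits <= 64);
  return Bits == 64 || V < (std::uint64_t(1) << Bits);
}

// True when V is representable as a signed integer of width Bits.
bool fitsSigned(unsigned Bits, std::int64_t V) {
  assert(Bits >= 1 && Bits <= 64);
  if (Bits == 64)
    return true;
  const std::int64_t Min = -(std::int64_t(1) << (Bits - 1));
  const std::int64_t Max = (std::int64_t(1) << (Bits - 1)) - 1;
  return V >= Min && V <= Max;
}
```

For a 32-bit constant, the immediate -1 passes `fitsSigned(32, -1)` but fails
`fitsUnsigned(32, static_cast<std::uint64_t>(-1))`, because the sign-extended
64-bit pattern has its upper bits set.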
2024-11-25 09:49:34 +01:00
Nikita Popov
815a1bb53a
[SystemZ] Use getSignedConstant() where necessary (#117181)
This will avoid assertion failures once we disable implicit truncation
in getConstant().

Inside adjustSubwordCmp() I ended up suppressing the issue with an
explicit cast, because this code deals with a mix of unsigned and signed
immediates.
2024-11-25 09:47:49 +01:00
Owen Pan
0fe12a7db3 [clang-format][NFC] Remove a pointer in ContinuationIndenter 2024-11-25 00:35:50 -08:00
Adrian Kuegel
404d0e9966 [mlir] Adjust code flagged by ClangTidyPerformance (NFC).
We can allocate the size of the vector in advance.
2024-11-25 08:17:09 +00:00
Younan Zhang
df335b09ea
[Clang] Preserve partially substituted pack indexing type/expressions (#116782)
Substituting into pack indexing types/expressions can still result in
unexpanded types/expressions, such as `PackIndexingType` or
`PackIndexingExpr`. To handle these cases correctly, we should defer the
pack size checks to the next round of transformation, when the patterns
can be fully expanded.

To that end, the `FullySubstituted` flag is now necessary for computing
the dependencies of `PackIndexingExprs`. Conveniently, this flag can
also represent the prior `ExpandsToEmpty` status with an additional
emptiness check. Therefore, I converted all stored flags to use
`FullySubstituted`.

Fixes https://github.com/llvm/llvm-project/issues/116105
2024-11-25 16:16:39 +08:00
Christian Sigg
2585b6e8fa [mlir][bazel] Fix layering check failure. 2024-11-25 09:10:14 +01:00
Christian Sigg
b0bdbf4288 [mlir][bazel] Port 7498eaa9ab 2024-11-25 08:31:16 +01:00
Phoebe Wang
2568e52a73
[X86,SimplifyCFG] Support hoisting load/store with conditional faulting (Part II) (#108812)
This is a follow-up to #96878 to support hoisting load/store from BBs
that have the same predecessor, if the load/store are the only instructions
and the branch is unpredictable, e.g.:

```
void test(int a, int *c, int *d) {
  if (a)
    *c = a;
  else
    *d = a;
}
```
2024-11-25 15:19:28 +08:00
Owen Pan
b9731a479c [clang-format][doc] Minor cleanup 2024-11-24 23:15:39 -08:00
LiqinWeng
73bebf96bc
[LangRef] Update the position of some parameters in the vp intrinsic of abs/cttz/ctlz (#117519) 2024-11-25 14:47:50 +08:00
Kazu Hirata
ff7b42c194
[memprof] Speed up llvm-profdata (#117446)
CallStackRadixTreeBuilder::build takes the parameter
MemProfFrameIndexes by value, involving copies:

  std::optional<const llvm::DenseMap<FrameIdTy, LinearFrameId>>
    MemProfFrameIndexes

Then "build" makes another copy of MemProfFrameIndexes and passes it to
encodeCallStack for every call stack, which is painfully slow.

This patch changes the type to a pointer so that we don't have to make
a copy every time we pass the argument.

Without this patch, it takes 553 seconds to run "llvm-profdata merge"
on a large MemProf raw profile.  This patch shortens that down to 67
seconds.
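
A minimal sketch of the cost difference, not the patch itself. `FrameIndexMap`
is a stand-in that counts its own copies (the real code uses `llvm::DenseMap`),
and the lookup behaviour of the two `encode` functions is an assumption made
for illustration.

```cpp
#include <cassert>
#include <map>

// Stand-in for the frame-index map; it counts how often it is copied.
struct FrameIndexMap {
  std::map<unsigned, unsigned> Data;
  static int Copies;
  FrameIndexMap() = default;
  FrameIndexMap(const FrameIndexMap &Other) : Data(Other.Data) { ++Copies; }
};
int FrameIndexMap::Copies = 0;

// By-value parameter: each call copies the whole map.
unsigned encodeByValue(FrameIndexMap Indexes, unsigned Frame) {
  auto It = Indexes.Data.find(Frame);
  return It == Indexes.Data.end() ? Frame : It->second;
}

// Pointer parameter: no copy; nullptr plays the role of an empty optional.
unsigned encodeByPointer(const FrameIndexMap *Indexes, unsigned Frame) {
  if (!Indexes)
    return Frame;
  auto It = Indexes->Data.find(Frame);
  return It == Indexes->Data.end() ? Frame : It->second;
}

// Calls each variant three times and checks how many copies were made.
void demoCopies() {
  FrameIndexMap M;
  M.Data[7] = 0;
  M.Data[9] = 1;

  FrameIndexMap::Copies = 0;
  for (int I = 0; I < 3; ++I)
    encodeByValue(M, 9);
  assert(FrameIndexMap::Copies == 3); // one copy per call

  FrameIndexMap::Copies = 0;
  for (int I = 0; I < 3; ++I)
    encodeByPointer(&M, 9);
  assert(FrameIndexMap::Copies == 0); // no copies at all
}
```

With one call per call stack, the by-value variant scales with the number of
call stacks times the size of the map, which matches the slowdown described
above.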
2024-11-24 21:08:54 -08:00
Kazu Hirata
9e3215ac16
[memprof] Add an assert to InstrProfWriter::addMemProfData (#117426)
This patch adds a quick validity check to
InstrProfWriter::addMemProfData.  Specifically, we check to see if we
have all (or none) of the MemProf profile components (frames, call
stacks, records).

The credit goes to Teresa Johnson for suggesting this assert.
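
A minimal sketch of an "all or none" validity check of this kind. The names
`hasAllOrNone` and `addProfileData` and the boolean parameters are
hypothetical; the actual assert in `InstrProfWriter::addMemProfData` may be
phrased differently.

```cpp
#include <cassert>

// True when the three components are either all present or all absent.
bool hasAllOrNone(bool HasFrames, bool HasCallStacks, bool HasRecords) {
  return HasFrames == HasCallStacks && HasCallStacks == HasRecords;
}

// Hypothetical entry point that rejects a partially populated profile.
void addProfileData(bool HasFrames, bool HasCallStacks, bool HasRecords) {
  assert(hasAllOrNone(HasFrames, HasCallStacks, HasRecords) &&
         "profile must have all of frames, call stacks, records, or none");
  // ... merge the data here ...
}
```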
2024-11-24 21:07:59 -08:00
Piyou Chen
87cc4b48c0 [NFC] Fix buildbot fail by add riscv64-registered-target 2024-11-24 20:54:40 -08:00
Piyou Chen
7317a6e990
[RISCV][MachineVerifier] Use RegUnit for register liveness checking (#115980)
For the RISC-V target, V14_V15 are not subregisters of v14m4, even
though they share some registers. Currently, the MachineVerifier reports
an error when checking register liveness for segment load/store
operations.

This patch adds additional register liveness checking, using RegUnit
instead of subregisters, to prevent this error.
2024-11-25 12:43:39 +08:00
hev
2523439021
[LoongArch] Add a test case for inline compatibility checks (#117144) 2024-11-25 12:34:46 +08:00
Craig Topper
3fb0bea859 [RISCV][GISel] Add register class to some isel output patterns so they can be imported.
This makes (fcopysign X, (fneg Y)) patterns work.
2024-11-24 19:29:52 -08:00
hev
e26af0938c
[llvm] Add BasicTTIImpl::areInlineCompatible for target feature subset checks (#117493)
This patch moves the `areInlineCompatible` implementation from multiple
subclasses (`AArch64TTIImpl`, `RISCVTTIImpl`, `WebAssemblyTTIImpl`) to
the base class `BasicTTIImpl`. The new implementation checks whether the
callee's target features are a subset of the caller's, enabling
consistent behavior across targets. Subclasses now simply delegate to
the base implementation, reducing code duplication and improving
maintainability.
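
A minimal sketch of the subset test described above, assuming features are
modelled as a small bitset. `FeatureSet`, `NumFeatures` and `isFeatureSubset`
are illustrative names and not the `BasicTTIImpl` interface.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical feature set; real targets track many more feature bits.
constexpr std::size_t NumFeatures = 8;
using FeatureSet = std::bitset<NumFeatures>;

// True when every feature enabled in Callee is also enabled in Caller.
bool isFeatureSubset(const FeatureSet &Caller, const FeatureSet &Callee) {
  return (Caller & Callee) == Callee;
}

// Small usage example with made-up feature bits.
void demoInlineCompat() {
  FeatureSet Caller("0111");
  assert(isFeatureSubset(Caller, FeatureSet("0101")));  // subset: compatible
  assert(!isFeatureSubset(Caller, FeatureSet("1001"))); // extra bit: not
}
```

The check is deliberately asymmetric: a caller with more features than the
callee is fine, while a callee that needs a feature the caller lacks is not.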
2024-11-25 11:22:49 +08:00
Dave Lee
0bfc951471
[lldb] Remove lldbutil.get_stack_frames (NFC) (#117505)
`SBThread.frames` can be used instead of `get_stack_frames`.
2024-11-24 19:02:47 -08:00
Matthias Springer
345ca6a692
[mlir][Transforms] Dialect conversion: extra signature conversion check (#117471)
This commit adds an extra assertion to `applySignatureConversion` to
prevent incorrect API usage: The same block cannot be converted multiple
times. That would mess with the underlying conversion value mapping.
(Mappings would be overwritten.) This is similar to op replacements: The
same op cannot be replaced multiple times.

To simplify the check, `BlockTypeConversionRewrite::block` now stores
the original block. The new block is stored in an extra field. (It used
to be the other way around.)

This commit is in preparation of adding 1:N support to the conversion
value mapping. Before making any further changes to the mapping
infrastructure, I'd like to make sure that the code base around it (that
uses the mapping) is robust.
2024-11-25 11:33:38 +09:00
Craig Topper
bb5bbe523d [RISCV][GISel] Support s32/s64 G_FSUB/FDIV/FNEG without F/D extensions.
Use libcalls for G_FSUB/FDIV. Use integer operations for G_FNEG.

Copy most of the IR tests for arithmetic from SelectionDAG.
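
A minimal sketch of negation done with integer operations only, assuming an
IEEE-754 binary32 `float`. The function name `fnegViaInteger` is illustrative;
it shows the sign-bit flip in C++ and is not the legalizer code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Negates X by flipping the sign bit of its bit pattern.
float fnegViaInteger(float X) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &X, sizeof(Bits)); // reinterpret the float as an integer
  Bits ^= 0x80000000u;                  // toggle the sign bit
  float Result;
  std::memcpy(&Result, &Bits, sizeof(Result));
  return Result;
}

// Small usage example.
void demoFneg() {
  assert(fnegViaInteger(1.5f) == -1.5f);
  assert(fnegViaInteger(-2.0f) == 2.0f);
}
```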
2024-11-24 18:22:12 -08:00
Sergei Barannikov
5f3eab9e45
[AVR] Remove extra ROL / ROR operands (#117510)
The nodes have one input, shift amount of 1 is implied.
2024-11-25 05:15:20 +03:00
LiqinWeng
02408d6b28
[VP] Refactoring some functions in ExpandVectorPredication.NFC (#115840)
Building vp intrinsic functions using a unified interface for
expandPredicationToIntCall/expandPredicationToFPCall/expandPredicationToCastIntrinsic
functions.
2024-11-25 10:05:29 +08:00
Weining Lu
e70f9e2096 [LoongArch] Remove the added in #116762 2024-11-25 09:33:55 +08:00
Fangrui Song
6aeffa18e9 [ELF] --reproduce: strip directories for --dependency-file=
CMake may generate build.ninja with DEP_FILE specifying a non-existent
directory in the reproduce tarball.
2024-11-24 17:23:52 -08:00
Wu Yingcong
5ed09d552d
[Support] Check zstd decompress result before msan unpoison (#117276)
We should check the zstd decompress result before doing the msan
unpoison. If the result indicates an error, it is a huge number, which
causes undesired msan unpoison behavior and makes it run for a long
time.
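
A minimal sketch of the ordering issue, using a self-contained model instead
of linking against zstd. The encoding in `makeErrorResult`, the bound
`MaxErrorCode` and the helper `bytesToUnpoison` are assumptions for
illustration; with the real library the error test would be `ZSTD_isError`.

```cpp
#include <cassert>
#include <cstddef>

// Assumed model: an error is a small code negated and stored in size_t,
// so it reads back as a value close to SIZE_MAX.
constexpr std::size_t MaxErrorCode = 120; // hypothetical bound

std::size_t makeErrorResult(std::size_t Code) {
  return static_cast<std::size_t>(0) - Code;
}

bool isErrorResult(std::size_t Res) {
  return Res >= static_cast<std::size_t>(0) - MaxErrorCode;
}

// Returns true and sets Bytes only when Res is a real size. When it
// returns false the caller should unpoison nothing.
bool bytesToUnpoison(std::size_t Res, std::size_t &Bytes) {
  if (isErrorResult(Res))
    return false;
  Bytes = Res;
  return true;
}

// Small usage example.
void demoUnpoisonCheck() {
  std::size_t Bytes = 0;
  assert(bytesToUnpoison(1024, Bytes) && Bytes == 1024);
  Bytes = 0;
  assert(!bytesToUnpoison(makeErrorResult(1), Bytes) && Bytes == 0);
}
```

Checking first means an error value is never interpreted as a byte count.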
2024-11-25 08:59:17 +08:00