492304 Commits

Author SHA1 Message Date
Michael Maitland
034cc2f5d0
[GISEL] Add G_INSERT_SUBVECTOR and G_EXTRACT_SUBVECTOR (#84538)
G_INSERT and G_EXTRACT are not sufficient to represent both
INSERT/EXTRACT on a subregister and INSERT/EXTRACT on a vector.

We would like to be able to INSERT/EXTRACT on vectors in cases where
INSERT/EXTRACT on vector subregisters is not sufficient, so we add
these opcodes.

I tried to do a patch where we treated G_EXTRACT as both
G_EXTRACT_SUBVECTOR and G_EXTRACT_SUBREG, but ran into an infinite loop
at this
[point](8b5b294ec2/llvm/lib/Target/RISCV/RISCVISelLowering.cpp (L9932))
in the SDAG equivalent code.
2024-03-11 13:47:30 -04:00
Bhuminjay Soni
8467457afc
Add new flag -Wreturn-mismatch (#82872)
This pull request fixes #72116 by introducing a new flag for
compatibility with GCC 14. The functionality of -Wreturn-type is
modified to split some of its behaviors into -Wreturn-mismatch.

Fixes #72116
2024-03-11 13:25:32 -04:00
annamthomas
866ac9a165
[LV] Address postcommit review for PR84782 (#84797)
This testcase was added to show miscompile in
https://github.com/llvm/llvm-project/issues/81872
2024-03-11 13:23:00 -04:00
Benjamin Kramer
36a2752923 [bazel] Grab correct version info after 81e20472a0c5a4a8edc5ec38dc345d580681af81
This is a bit awkward.
2024-03-11 18:21:42 +01:00
Jason Molenda
bdbad0d07b
Turn off instruction flow control annotations by default (#84607)
Walter Erquinigo added optional instruction annotations for x86
instructions in 2022 for the `thread trace dump instruction` command,
and code to DisassemblerLLVMC to add annotations for instructions that
change flow control, v. https://reviews.llvm.org/D128477

This was added as an option to `disassemble`, and the trace dump command
enables it by default, but several other instruction dumpers were
changed to display them by default as well. These are only implemented
for Intel instructions, so our disassembly on other targets ends up
looking like

```
(lldb) x/5i 0x1000086e4
0x1000086e4: 0xa9be6ffc   unknown     stp    x28, x27, [sp, #-0x20]!
0x1000086e8: 0xa9017bfd   unknown     stp    x29, x30, [sp, #0x10]
0x1000086ec: 0x910043fd   unknown     add    x29, sp, #0x10
0x1000086f0: 0xd11843ff   unknown     sub    sp, sp, #0x610
0x1000086f4: 0x910c63e8   unknown     add    x8, sp, #0x318
```

instead of `disassemble`'s output style of

```
lldb`main:
lldb[0x1000086e4] <+0>:  stp    x28, x27, [sp, #-0x20]!
lldb[0x1000086e8] <+4>:  stp    x29, x30, [sp, #0x10]
lldb[0x1000086ec] <+8>:  add    x29, sp, #0x10
lldb[0x1000086f0] <+12>: sub    sp, sp, #0x610
lldb[0x1000086f4] <+16>: add    x8, sp, #0x318
```

Adding symbolic annotations for assembly instructions is something I'm
interested in too, because we may have users investigating a crash or
apparent-incorrect behavior who must debug optimized assembly and they
may not be familiar with the ISA they're using, so short of flipping
through a many-thousand-page PDF to understand each instruction, they're
lost. They don't write assembly or work at that level, but to understand
a bug, they have to understand what the instructions are actually doing.

But the annotations that exist today don't move us forward much on that
front - I'd argue that the flow control instructions on Intel are not
hard to understand from their names, but that might just be my personal
bias. Much trickier instructions exist in any event.

Displaying this information by default for all targets when we only have
one class of instructions on one target is not a good default.

Also, in 2011 when Greg implemented the `memory read -f i` (aka `x/i`)
command
```
commit 5009f9d5010a7e34ae15f962dac8505ea11a8716
Author: Greg Clayton <gclayton@apple.com>
Date:   Thu Oct 27 17:55:14 2011 +0000
[...]
    eFormatInstruction will print out disassembly with bytes and it will use the
    current target's architecture. The format character for this is "i" (which
    used to be being used for the integer format, but the integer format also has
    "d", so we gave the "i" format to disassembly), the long format is
    "instruction".
```

he had DumpDataExtractor's DumpInstructions print the bytes of the
instruction -- that's the first field we see above for the `x/5i` after
the address -- and this is only useful for people who are debugging the
disassembler itself, I would argue. I don't want this displayed by
default either.

tl;dr this patch removes both fields from `memory read -f i` and I
think this is the right call today. While I'm really interested in
instruction annotation, I don't think `x/i` is the right place to have
it enabled by default unless it's really compelling on at least some of
our major targets.
2024-03-11 10:21:07 -07:00
Guillaume Chatelet
07d7b9c255
[libc] Fix forward arm32 buildbot (#84794)
Introduced by https://github.com/llvm/llvm-project/pull/83441.
2024-03-11 18:17:18 +01:00
karzan
501bc101c0
[lldb] Save the edited line before clearing it in Editline::PrintAsync (#84154)
If the `m_editor_status` is `EditorStatus::Editing`, PrintAsync clears
the currently edited line. In some situations, the edited line is not
saved. After the stream flushes, PrintAsync tries to display the unsaved
line, causing the loss of the edited line.

The issue arose while I was debugging REPRLRun in
[Fuzzilli](https://github.com/googleprojectzero/fuzzilli). I started
LLDB and attempted to set a breakpoint in libreprl-posix.c. I entered
`breakpoint set -f lib` and used the "tab" key for command completion.
After completion, the edited line was flushed, leaving a blank line.
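
The save-before-clear ordering described above can be sketched with a toy model. This is only an illustration: `ToyEditor` and its members are hypothetical names, not LLDB's `Editline` API, and the real code deals with terminal state that is omitted here.

```cpp
#include <string>

// Toy model of a line editor; all names are hypothetical, not LLDB's API.
class ToyEditor {
public:
  // Simulates the user typing into the current line.
  void type(const std::string &text) { m_line += text; }

  // Emits an asynchronous message without losing what the user typed.
  void printAsync(const std::string &message) {
    std::string saved = m_line; // save the edited line before clearing it
    m_line.clear();             // clear the line so the message starts clean
    m_output += message;        // write the asynchronous output
    m_line = saved;             // restore the edited line for redisplay
  }

  const std::string &line() const { return m_line; }
  const std::string &output() const { return m_output; }

private:
  std::string m_line;   // the line currently being edited
  std::string m_output; // everything printed so far
};
```

If the save step were skipped, `line()` would be empty after `printAsync`, which corresponds to the blank line reported above.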
2024-03-11 10:07:12 -07:00
Mark de Wever
9a9aa41dea
[LLDB][doc] Updates build instructions. (#84630)
Building libc++ now requires building libunwind too. This updates
the LLDB instructions.

I noticed this recently and it was separately filed as
https://github.com/llvm/llvm-project/issues/84053
2024-03-11 17:45:48 +01:00
Mark de Wever
81e20472a0
[cmake] Exposes LLVM version number in the runtimes. (#84641)
This allows sharing the LLVM version number in libc++.
2024-03-11 17:43:14 +01:00
Simon Pilgrim
6cd68c2f87 [X86] Add base SSE2 coverage to SRL/SRA combines tests 2024-03-11 16:25:05 +00:00
Simon Pilgrim
7dc4d5f6a0 [X86] Add AVX512 (x86-64-v4) coverage to generic shift combines tests 2024-03-11 16:22:47 +00:00
annamthomas
34acdb3ec2
Precommit testcase for pr81872 (#84782)
Testcase shows a miscompile when dropping the disjoint flag from a
disjoint `or` during vectorization.
2024-03-11 12:16:52 -04:00
Cyndy Ishida
2c93beccdf
[InstallAPI] Collect C++ Decls (#84403)
This includes capturing symbols for global variables, functions,
classes, and templated definitions. As pre-determining what symbols are
generated from C++ declarations can be non-trivial, InstallAPI only
parses select declarations for symbol generation when parsing C++.

For example, InstallAPI only looks at explicit template instantiations
or full template specializations, instead of general function or class
templates, for symbol emission.
2024-03-11 09:02:43 -07:00
Simon Pilgrim
ad8c828136 [X86] (V)MPSADBW instructions can run on Port1 or Port5 for one uop stage
When we copied the IceLake model from the SkylakeServer model we missed this diff

Confirmed with uops.info and Agner
2024-03-11 15:48:08 +00:00
Simon Pilgrim
0858c906db [X86] Add missing register qualifier to the VBLENDVPD/VBLENDVPS/VPBLENDVB instruction names
Matches the SSE variants (which have a 0 qualifier to indicate the xmm0 explicit dependency)
2024-03-11 15:48:07 +00:00
Marius Brehler
a924da6d4b
[mlir][IR] Add isInteger() (without width) (#84467)
Overloads exist for signless and signed integers so that the width
does not need to be specified as an argument. This adds the same for
integers without checking for signedness.
2024-03-11 08:47:06 -07:00
Jay Foad
575ca6744b [CodeGen] Remove unused MachineRegisterInfo methods 2024-03-11 15:42:24 +00:00
Matthias Gehre
818af71b72
[mlir][emitc] Add ArrayType (#83386)
This models a one or multi-dimensional C/C++ array.

The type implements the `ShapedTypeInterface` and prints similar to
memref/tensor:
```
  %arg0: !emitc.array<1xf32>,
  %arg1: !emitc.array<10x20x30xi32>,
  %arg2: !emitc.array<30x!emitc.ptr<i32>>,
  %arg3: !emitc.array<30x!emitc.opaque<"int">>
```

It can be translated to a C array type when used as function parameter
or as `emitc.variable` type.
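
As an illustration of the translation mentioned above, the four example types could correspond to C/C++ array types along these lines. The element spellings (`float`, `std::int32_t`), the alias names `Arg0` to `Arg3`, and the helper `flatIndex` are assumptions made for this sketch, not output of the EmitC translator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Assumed C/C++ spellings of the four array types listed above.
using Arg0 = float[1];                 // !emitc.array<1xf32>
using Arg1 = std::int32_t[10][20][30]; // !emitc.array<10x20x30xi32>
using Arg2 = std::int32_t *[30];       // !emitc.array<30x!emitc.ptr<i32>>
using Arg3 = int[30];                  // !emitc.array<30x!emitc.opaque<"int">>

// The shape is part of the type, so it can be checked at compile time.
static_assert(std::rank<Arg1>::value == 3, "three dimensions");
static_assert(std::extent<Arg1, 0>::value == 10, "outermost extent");
static_assert(std::extent<Arg1, 2>::value == 30, "innermost extent");
static_assert(std::extent<Arg2>::value == 30, "array of 30 pointers");

// Row-major offset of element [i][j][k] inside Arg1, as C lays arrays out.
std::size_t flatIndex(std::size_t i, std::size_t j, std::size_t k) {
  assert(i < 10 && j < 20 && k < 30 && "index out of bounds");
  return (i * 20 + j) * 30 + k;
}
```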
2024-03-11 16:40:57 +01:00
lntue
d99bb01422
[libc][NFC] Clean up test/src/math/differential_testing folder, renaming it to performance_testing. (#84646)
Removing all the diff tests.
2024-03-11 11:38:39 -04:00
Jay Foad
63a5dc4aed
[CodeGen] Do not pass MF into MachineRegisterInfo methods. NFC. (#84770)
MachineRegisterInfo already knows the MF so there is no need to pass it
in as an argument.
2024-03-11 15:35:05 +00:00
Krzysztof Parzyszek
cd5504637b
[flang][unittests] Use malloc when memory will be deallocated with free (#84380)
Runtime unit tests used `new[]` to allocate memory, which then was
released using `free`.

This was detected by address sanitizer.
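
A minimal sketch of the matching allocation pattern, assuming a helper that hands a buffer to code which later calls `free`. The function `makeFreeableCopy` is hypothetical and does not appear in the flang unit tests.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns a heap copy of `text` that the caller must
// release with std::free.
char *makeFreeableCopy(const char *text) {
  assert(text != nullptr && "caller must pass a valid C string");
  std::size_t size = std::strlen(text) + 1; // include the terminating NUL
  // Allocate with malloc because the buffer is released with free.
  // Pairing `new char[size]` with free() is undefined behavior, and
  // AddressSanitizer reports it as an alloc-dealloc mismatch.
  char *buffer = static_cast<char *>(std::malloc(size));
  if (buffer == nullptr)
    return nullptr; // allocation failed; nothing to release
  std::memcpy(buffer, text, size);
  return buffer;
}
```

The rule of thumb is to pair `malloc` with `free`, `new` with `delete`, and `new[]` with `delete[]`.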
2024-03-11 10:29:46 -05:00
Jonathan Peyton
9b1c496898
[OpenMP] Fixup while loops to avoid bad NULL check (#83302) 2024-03-11 10:28:12 -05:00
Jonathan Peyton
de4d7015d0
[OpenMP] Remove unnecessary check of ap (#83303) 2024-03-11 10:27:53 -05:00
Jonathan Peyton
1ed463d961
[OpenMP] Make sure ptr is used after NULL check (#83304) 2024-03-11 10:27:31 -05:00
Jonathan Peyton
b4e39ad117
[OpenMP] Remove dead code of checking int > INT_MAX (#83305) 2024-03-11 10:26:53 -05:00
Krzysztof Drewniak
b05c15259b
[mlir][AMDGPU] Improve amdgpu.lds_barrier, add warnings (#77942)
On some architectures (currently gfx90a, gfx94*, and gfx10**), we can
implement an LDS barrier using compiler intrinsics instead of inline
assembly, improving optimization possibilities and decreasing the
fragility of the underlying code.

Other AMDGPU chipsets continue to require inline assembly to implement
this barrier, as, by default, the LLVM backend will insert waits on
global memory (s_waitcnt vmcnt(0)) before barriers in order to ensure
memory watchpoints set by debuggers work correctly.

Use of amdgpu.lds_barrier, on these architectures, imposes a tradeoff
between debuggability and performance. The documentation, as well as the
generated inline assembly, has been updated to explicitly call
attention to this fact.

For chipsets that did not require the inline assembly hack, we move to
the s.waitcnt and s.barrier intrinsics, which have been added to the
ROCDL dialect. The magic constants used as an argument to the waitcnt
intrinsic can be derived from
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
2024-03-11 10:06:49 -05:00
itrofimow
63af8584fc
[libc++] Only forward-declare ABI-functions in exception_ptr.h if they are meant to be used (#84707)
This patch fixes the unconditional forward-declarations of ABI-functions
in exception_ptr.h, and makes it dependent on the availability macro, as
it should've been from the beginning.

The declarations being unconditional break the build with libcxxrt
before 045c52ce8 [1], now they are opt-out.

[1]: 045c52ce82
2024-03-11 10:59:05 -04:00
Pierre van Houtryve
63c77d8475
[AMDGPU] Make generic versioning docs easier to find (#84761) 2024-03-11 15:56:17 +01:00
A. Jiang
63ae5099b7
[libc++][test] Don't include test_format_context.h in parse.pass.cpp (#83734)
The `parse.pass.cpp` tests don't need to call
`test_format_context_create` to create a `basic_format_context`, so they
shouldn't include `test_format_context.h`.

The `to_address` mechanism works around the iterator debugging
mechanisms of MSVC STL. Related to
[LWG3989](https://cplusplus.github.io/LWG/issue3989).

Discovered when implementing `formatter<tuple>` in MSVC STL. With the
inclusion removed, `std/utilities/format/format.tuple/parse.pass.cpp`
passes when using enhanced MSVC STL (and `/utf-8` option for MSVC).
2024-03-11 10:55:16 -04:00
Philip Reames
f14224d92b
[RISCV] Rename schedule classes for vmv.s.x, vmv.x.s, vfmv.s.f, and vfmv.f.s [nfc] (#84563)
The prior naming scheme is incredibly hard to make sense out of. I
suspect the usage was actually backwards from intent - though that
didn't matter for any in tree schedule model.
2024-03-11 07:52:28 -07:00
Orlando Cazalet-Hyams
2953d9c8b0 Reapply "[RemoveDIs] Add additional debug-mode verifier checks" (#84757)
Test failures fixed in d0117b71193787ebfd92d96a4ecc261f0aaeac86
2024-03-10 20:54:40 +00:00
Emma Pilkington
538aeb180b
[AMDGPU] Use a consistent DwarfEH register flavour (#84513)
Previously, we always used the wave64 encodings for EH registers
regardless of whether we were compiling for wave32, which seems wrong.
We don't seem to use the EH registers, so this commit is mostly just
about papering over code that converts from non-EH dwarf registers to
LLVM registers while claiming they are EH dwarf registers. That kind of
code should be okay on any non-darwin target (since darwin is the only
target that uses a different encoding for EH registers).
2024-03-11 10:36:38 -04:00
Orlando Cazalet-Hyams
d0117b7119 [RemoveDIs] Copy debug mode to new functions in amdgpu-lower-buffer-fat-pointers
Fixes failing tests after https://github.com/llvm/llvm-project/pull/84308

LLVM :: CodeGen/AMDGPU/GlobalISel/irtranslator-non-integral-address-spaces-vectors.ll
LLVM :: CodeGen/AMDGPU/GlobalISel/irtranslator-non-integral-address-spaces.ll
LLVM :: CodeGen/AMDGPU/lower-buffer-fat-pointers-calls.ll
LLVM :: CodeGen/AMDGPU/lower-buffer-fat-pointers-constants.ll
LLVM :: CodeGen/AMDGPU/lower-buffer-fat-pointers-pointer-ops.ll
LLVM :: CodeGen/AMDGPU/pal-metadata-3.0.ll

Buildbots: https://lab.llvm.org/buildbot/#/builders/121/builds/39855
2024-03-10 20:51:39 +00:00
Krzysztof Drewniak
769eab4719
[NFC][AMDGPU] Fix redundant assignment from #77952 (#84586)
Someone pointed out a typo (Value* RsrcRes = RsrcRes = ...) in the
address space 7 lowering PR; this commit fixes it.
2024-03-11 09:34:50 -05:00
Sivan Shani
5e688f0dbd [llvm][arm] add T1 and T2 assembly options for vlldm and vlstm
Re-land 634b0243b8f7acc85af4f16b70e91d86ded4dc83.

T1 allows for an optional register list,
the register list must be {d0-d15}.
T2 defines a mandatory register list,
the register list must be {d0-d31}.

The requirements for T1/T2 are as follows:
                T1              T2
Require:        v8-M.Main,      v8.1-M.Main,
                secure state    secure state
16 D Regs       valid           valid
32 D Regs       UNDEFINED       valid
No D Regs       NOP             NOP
2024-03-11 14:27:28 +00:00
Joseph Huber
9bc294f9be
[libc] Build the GPU during the projects setup like libc-hdrgen (#84667)
Summary:
The libc build has a few utilities that need to be built before we can do
everything in the full build. The one requirement currently is the
`libc-hdrgen` binary. If we are doing a full build in runtimes mode we
first add `libc` to the projects list and then only use the `projects`
portion to build the `libc` portion. We also use utilities for the GPU
build, namely the loader utilities. Previously we would build these
tools on-demand inside of the cross-build, which took some hacky
workarounds for the dependency finding and target triple. This patch
instead just builds them similarly to libc-hdrgen and then passes them
in. We now either pass it manually if it was built, or just look it up
like we do with the other `clang` tools.

Depends on https://github.com/llvm/llvm-project/pull/84664
2024-03-11 09:18:47 -05:00
elhewaty
3f302eaca4
[InstCombine] Fold usub_sat((sub nuw C1, A), C2) to usub_sat(C1 - C2, A) or 0 (#82280)
- Fixes: https://github.com/llvm/llvm-project/issues/82177
- Alive2: https://alive2.llvm.org/ce/z/Q7mMC3
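
The fold in the title can be checked by hand at a small bit width. The sketch below is an illustration, not the InstCombine implementation: it models `usub_sat` on 8-bit values and compares both forms exhaustively, assuming `sub nuw C1, A` is only defined when `A <= C1`. All function names are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Unsigned saturating subtraction on 8-bit values: a - b, clamped at 0.
std::uint8_t usubSat(std::uint8_t a, std::uint8_t b) {
  return a > b ? static_cast<std::uint8_t>(a - b) : std::uint8_t{0};
}

// Original expression: usub_sat((sub nuw C1, A), C2).
std::uint8_t original(std::uint8_t c1, std::uint8_t a, std::uint8_t c2) {
  assert(a <= c1 && "sub nuw C1, A requires A <= C1");
  return usubSat(static_cast<std::uint8_t>(c1 - a), c2);
}

// Folded expression: usub_sat(C1 - C2, A) when C1 >= C2, otherwise 0.
std::uint8_t folded(std::uint8_t c1, std::uint8_t a, std::uint8_t c2) {
  if (c1 < c2)
    return 0; // C1 - A <= C1 < C2, so the original saturates to 0
  return usubSat(static_cast<std::uint8_t>(c1 - c2), a);
}

// Exhaustively compares both forms over every valid 8-bit input.
bool foldHoldsForAllI8() {
  for (unsigned c1 = 0; c1 < 256; ++c1)
    for (unsigned c2 = 0; c2 < 256; ++c2)
      for (unsigned a = 0; a <= c1; ++a)
        if (original(c1, a, c2) != folded(c1, a, c2))
          return false;
  return true;
}
```

Both sides compute max(C1 - A - C2, 0) when C1 >= C2, which is why the constants can be combined ahead of time.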
2024-03-11 15:10:40 +01:00
Mark Zhuang
b1be69f4db
[NFC] Remove duplicate 'see' in CMake.rst (#84680) 2024-03-11 15:06:30 +01:00
Joseph Huber
d27c1bed11
[libc] Only enable LLVM_FULL_BUILD_MODE by default for GPU targets (#84664)
Summary:
Currently we have a conditional that turns the full build on by default
if it is a default target. This used to work fine when the GPU was the
only target that was ever present. However, we've recently changed to
allow building multiple of these at the same time. That means we should
have the ability to build overlay mode in the CPU mode and full build in
the GPU mode. This patch makes some simple adjustments to pass the
arguments per-triple. This slightly extends the existing `-DRUNTIMES_`
argument support to also transform any extra CMake inputs rather than
just the passed CMake variables.
2024-03-11 09:05:49 -05:00
Louis Dionne
02e0b7d405
[libc++] Add missing include in test (#84579)
That test is using std::toupper.
2024-03-11 09:51:59 -04:00
Simon Pilgrim
1ec5b1f483 [X86] Add missing immediate qualifier to the (V)PCLMULQDQ instruction names 2024-03-11 13:39:25 +00:00
Zain Jaffal
0fae9a24b9
[Docs] Fix llvm-remarkutil docs (#84661)
Code blocks and option points weren't rendered correctly
2024-03-11 13:36:52 +00:00
LLVM GN Syncbot
fcd0dd3793 [gn build] Port 2a38551457cb 2024-03-11 13:21:12 +00:00
Joseph Huber
9d30f11b88 [libc] Remove use of __builtin_modf in GPU math
Summary:
This function was not actually supported, see
https://godbolt.org/z/MP1j5EeWc. Unsure why we have only now begun
seeing failures related to it.
2024-03-11 08:20:06 -05:00
David CARLIER
facb89ae12
[openmp] __kmp_x86_cpuid fix for i386/PIC builds. (#84626) 2024-03-11 13:15:43 +00:00
Jason Eckhardt
6f7e940c2d
[TableGen] More efficiency improvements for encode/decode emission. (#84647)
DecoderEmitter and CodeEmitterGen perform repeated linear walks over the
entire instruction list. This patch eliminates two more such walks.

The eliminated traversals visit every instruction merely to determine
whether the target has variable length encodings. For a target with
variable length encodings, the original any_of will terminate quickly.
But all targets other than M68k use fixed length encodings and thus
any_of must visit the entire instruction list.
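
The cost difference described above can be seen in a small sketch. `Inst`, `ScanResult`, and `scan` are hypothetical stand-ins, not the TableGen data structures; the example only counts how many elements the `std::any_of` predicate inspects.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an instruction record; not the TableGen class.
struct Inst {
  bool hasVariableLengthEncoding;
};

struct ScanResult {
  bool anyVariableLength; // what the any_of query answers
  std::size_t visited;    // how many instructions the predicate inspected
};

// Runs the any_of query and records how much of the list it walked.
ScanResult scan(const std::vector<Inst> &insts) {
  std::size_t visited = 0;
  bool found = std::any_of(insts.begin(), insts.end(), [&](const Inst &inst) {
    ++visited;
    return inst.hasVariableLengthEncoding;
  });
  assert(visited <= insts.size() && "each element is inspected at most once");
  return {found, visited};
}
```

With a list of only fixed-length instructions, `visited` equals the list size on every call, so repeating the query pays for a full walk each time.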
2024-03-11 08:13:33 -05:00
Orlando Cazalet-Hyams
2f1873d11e
Revert "[RemoveDIs] Add additional debug-mode verifier checks" (#84757)
Reverts llvm/llvm-project#84308

Failing bots, e.g.
https://lab.llvm.org/buildbot/#/builders/16/builds/62432
2024-03-11 13:09:55 +00:00
Nikolas Klauser
2a38551457
[libc++] Remove <tuple> from <variant> (#83183)
This moves a utility from `<tuple>` into an implementation detail header
and refactors the selection of the variant index type to use.
2024-03-11 14:04:51 +01:00
Simon Pilgrim
2b8f1daf78 [X86] Add missing immediate qualifier to the SSE42 (V)PCMPEST/PCMPIST string instruction names 2024-03-11 13:02:48 +00:00
Krzysztof Parzyszek
546f32df26
[flang][CodeGen] Fix use-after-free in BoxedProcedurePass (#84376)
Avoid inspecting an operation that has been replaced.

This was detected by address sanitizer.
2024-03-11 08:00:18 -05:00