G_INSERT and G_EXTRACT are not sufficient to use to represent both
INSERT/EXTRACT on a subregister and INSERT/EXTRACT on a vector.
We would like to be able to INSERT/EXTRACT on vectors in cases that
INSERT/EXTRACT on vector subregisters are not sufficient, so we add
these opcodes.
I tried to do a patch where we treated G_EXTRACT as both
G_EXTRACT_SUBVECTOR and G_EXTRACT_SUBREG, but ran into an infinite loop
at this
[point](8b5b294ec2/llvm/lib/Target/RISCV/RISCVISelLowering.cpp (L9932))
in the SDAG equivalent code.
This pull request fixes#72116 where a new flag is introduced for
compatibility with GCC 14, the functionality of -Wreturn-type is
modified to split some of its behaviors into -Wreturn-mismatch
Fixes#72116
Walter Erquinigo added optional instruction annotations for x86
instructions in 2022 for the `thread trace dump instruction` command,
and code to DisassemblerLLVMC to add annotations for instructions that
change flow control, v. https://reviews.llvm.org/D128477
This was added as an option to `disassemble`, and the trace dump command
enables it by default, but several other instruction dumpers were
changed to display them by default as well. These are only implemented
for Intel instructions, so our disassembly on other targets ends up
looking like
```
(lldb) x/5i 0x1000086e4
0x1000086e4: 0xa9be6ffc unknown stp x28, x27, [sp, #-0x20]!
0x1000086e8: 0xa9017bfd unknown stp x29, x30, [sp, #0x10]
0x1000086ec: 0x910043fd unknown add x29, sp, #0x10
0x1000086f0: 0xd11843ff unknown sub sp, sp, #0x610
0x1000086f4: 0x910c63e8 unknown add x8, sp, #0x318
```
instead of `disassemble`'s output style of
```
lldb`main:
lldb[0x1000086e4] <+0>: stp x28, x27, [sp, #-0x20]!
lldb[0x1000086e8] <+4>: stp x29, x30, [sp, #0x10]
lldb[0x1000086ec] <+8>: add x29, sp, #0x10
lldb[0x1000086f0] <+12>: sub sp, sp, #0x610
lldb[0x1000086f4] <+16>: add x8, sp, #0x318
```
Adding symbolic annotations for assembly instructions is something I'm
interested in too, because we may have users investigating a crash or
apparent-incorrect behavior who must debug optimized assembly and they
may not be familiar with the ISA they're using, so short of flipping
through a many-thousand-page PDF to understand each instruction, they're
lost. They don't write assembly or work at that level, but to understand
a bug, they have to understand what the instructions are actually doing.
But the annotations that exist today don't move us forward much on that
front - I'd argue that the flow control instructions on Intel are not
hard to understand from their names, but that might just be my personal
bias. Much trickier instructions exist in any event.
Displaying this information by default for all targets when we only have
one class of instructions on one target is not a good default.
Also, in 2011 when Greg implemented the `memory read -f i` (aka `x/i`)
command
```
commit 5009f9d5010a7e34ae15f962dac8505ea11a8716
Author: Greg Clayton <gclayton@apple.com>
Date: Thu Oct 27 17:55:14 2011 +0000
[...]
eFormatInstruction will print out disassembly with bytes and it will use the
current target's architecture. The format character for this is "i" (which
used to be being used for the integer format, but the integer format also has
"d", so we gave the "i" format to disassembly), the long format is
"instruction".
```
he had DumpDataExtractor's DumpInstructions print the bytes of the
instruction -- that's the first field we see above for the `x/5i` after
the address -- and this is only useful for people who are debugging the
disassembler itself, I would argue. I don't want this displayed by
default either.
tl;dr this patch removes both fields from `memory read -f -i` and I
think this is the right call today. While I'm really interested in
instruction annotation, I don't think `x/i` is the right place to have
it enabled by default unless it's really compelling on at least some of
our major targets.
If the `m_editor_status` is `EditorStatus::Editing`, PrintAsync clears
the currently edited line. In some situations, the edited line is not
saved. After the stream flushes, PrintAsync tries to display the unsaved
line, causing the loss of the edited line.
The issue arose while I was debugging REPRLRun in
[Fuzzilli](https://github.com/googleprojectzero/fuzzilli). I started
LLDB and attempted to set a breakpoint in libreprl-posix.c. I entered
`breakpoint set -f lib` and used the "tab" key for command completion.
After completion, the edited line was flushed, leaving a blank line.
Recently building libc++ requires building libunwind too. This updates
the LLDB instructions.
I noticed this recently and it was separately filed as
https://github.com/llvm/llvm-project/issues/84053
This includes capturing symbols for global variables, functions,
classes, and templated defintions. As pre-determing what symbols are
generated from C++ declarations can be non-trivial, InstallAPI only
parses select declarations for symbol generation when parsing c++.
For example, installapi only looks at explicit template instantiations
or full template specializations, instead of general function or class
templates, for symbol emittion.
For the singless and signed integers overloads exist, so that the width
does not need to be specified as an argument. This adds the same for
integers without checking for signedness.
This models a one or multi-dimensional C/C++ array.
The type implements the `ShapedTypeInterface` and prints similar to
memref/tensor:
```
%arg0: !emitc.array<1xf32>,
%arg1: !emitc.array<10x20x30xi32>,
%arg2: !emitc.array<30x!emitc.ptr<i32>>,
%arg3: !emitc.array<30x!emitc.opaque<"int">>
```
It can be translated to a C array type when used as function parameter
or as `emitc.variable` type.
On some architectures (currently gfx90a, gfx94*, and gfx10**), we can
implement an LDS barrier using compiler intrinsics instead of inline
assembly, improving optimization possibilities and decreasing the
fragility of the underlying code.
Other AMDGPU chipsets continue to require inline assembly to implement
this barrier, as, by the default, the LLVM backend will insert waits on
global memory (s_waintcnt vmcnt(0)) before barriers in order to ensure
memory watchpoints set by debuggers work correctly.
Use of amdgpu.lds_barrier, on these architectures, imposes a tradeoff
between debugability and performance. The documentation, as well as the
generated inline assembly, have been updated to explicitly call
attention to this fact.
For chipsets that did not require the inline assembly hack, we move to
the s.waitcnt and s.barrier intrinsics, which have been added to the
ROCDL dialect. The magic constants used as an argument to the waitcnt
intrinsic can be derived from
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp
This patch fixes the unconditional forward-declarations of ABI-functions
in exception_ptr.h, and makes it dependent on the availability macro, as
it should've been from the beginning.
The declarations being unconditional break the build with libcxxrt
before 045c52ce8 [1], now they are opt-out.
[1]: 045c52ce82
The `parse.pass.cpp` tests doen't need to call
`test_format_context_create` to create a `basic_format_context`, so they
shouldn't include `test_format_context.h`.
The `to_address` mechanism works around the iterator debugging
mechanisms of MSVC STL. Related to
[LWG3989](https://cplusplus.github.io/LWG/issue3989).
Discovered when implementing `formatter<tuple>` in MSVC STL. With the
inclusion removed, `std/utilities/format/format.tuple/parse.pass.cpp`
when using enhanced MSVC STL (and `/utf-8` option for MSVC).
The prior naming scheme is incredibly hard to make sense out of. I
suspect the usage was actually backwards from intent - though that
didn't matter for any in tree schedule model.
Previously, we always used the wave64 encodings for EH registers
regardless of whether we were compiling for wave32, which seems wrong.
We don't seem to use the EH registers, so this commit is mostly just
about papering over code that converts from non-EH dwarf registers to
LLVM registers while claiming they are EH dwarf registers. That kind of
code should be okay on any non-darwin target (since darwin is the only
target that uses a different encoding for EH registers).
Re-land 634b0243b8f7acc85af4f16b70e91d86ded4dc83.
T1 allow for an optional registers list,
the register list must be {d0-d15}.
T2 define a mandatory register list,
the register list must be {d0-d31}.
The requirements for T1/T2 are as follows:
T1 T2
Require: v8-M.Main, v8.1-M.Main,
secure state secure state
16 D Regs valid valid
32 D Regs UNDEFINED valid
No D Regs NOP NOP
Summary:
The libc build has a few utilties that need to be built before we can do
everything in the full build. The one requirement currently is the
`libc-hdrgen` binary. If we are doing a full build runtimes mode we
first add `libc` to the projects list and then only use the `projects`
portion to buld the `libc` portion. We also use utilities for the GPU
build, namely the loader utilities. Previously we would build these
tools on-demand inside of the cross-build, which tool some hacky
workarounds for the dependency finding and target triple. This patch
instead just builds them similarly to libc-hdrgen and then passses them
in. We now either pass it manually it it was built, or just look it up
like we do with the other `clang` tools.
Depends on https://github.com/llvm/llvm-project/pull/84664
Summary:
Currently we have a conditional that turns the full build on by default
if it is a default target. This used to work fine when the GPU was the
only target that was ever present. However, we've recently changed to
allow building multiple of these at the same time. That means we should
have the ability to build overlay mode in the CPU mode and full build in
the GPU mode. This patch makes some simple adjustments to pass the
arguments per-triple. This slightly extends the existing `-DRUNTIMES_`
argument support to also transform any extra CMake inputs rather than
just the passed CMake variables.
DecoderEmitter and CodeEmitterGen perform repeated linear walks over the
entire instruction list. This patch eliminates two more such walks.
The eliminated traversals visit every instruction merely to determine
whether the target has variable length encodings. For a target with
variable length encodings, the original any_of will terminate quickly.
But all targets other than M68k use fixed length encodings and thus
any_of must visit the entire instruction list.