Re-land D145441 with data layout upgrade code fixed to not break OpenMP.
This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27.
Differential Revision: https://reviews.llvm.org/D149776
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.
The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.
The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. ptr
addrspace(8). These pointers are 128-bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, [addr_max]] value from applying to them.
Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.
This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.
Depends on D143437
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D145441
This reverts commit c117c2c8ba4afd45a006043ec6dd858652b2ffcc.
itaniumDemangle calls std::strlen with the results of
std::string_view::data() which may not be NUL-terminated. This causes
lld/test/wasm/why-extract.s to fail when "expensive checks" are enabled
via -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON. See D149675 for further
discussion. Back this out until the individual demanglers are converted
to use std::string_view.
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549
There's potential for a lot more cleanups around these APIs. This is
just a start.
Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using ``llvm::demangle``. Add a
release note.
Reviewed By: MaskRay, #lld-macho
Differential Revision: https://reviews.llvm.org/D149104
Since we don't always need the vendor extension to be in riscv_vector.td,
so it's better to make it be in separated header.
Depends on D148223 and D148680
Differential Revision: https://reviews.llvm.org/D148308
Similar for RISCV::parseTuneCPU and RISCV::checkTuneCPUKind.
This makes the CPUKind enum no longer part of the API. It wasn't
providing much value. It was only used to pass between the two
functions.
By removing it, we can remove a dependency on a tablegen generated
file from the RISCVTargetParser.h file. Then we can remove a
dependency from several CMakeLists.txt.
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.
There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flushing zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.
Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.
Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.
I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.
If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
This could also be of use for strictfp handling, where code may be
changing the denormal mode.
Alternative name could be "unknown".
Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
Clang accepts preserve_all for AArch64 while it is missing form the backed.
Fixes#58145
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D135652
This patch changes the shufflevector's semantics to yield poison if the mask is undefined.
This allows the extraction of shufflevectors while also opening the door for more
optimization opportunities due to the fact that poison is more undefined than undef.
Differential Revision: https://reviews.llvm.org/D148637
These are from the N extension (User-Level Interrupts) which did
not make it into 1.12 of the Privileged Specification.
D117653 also tried to remove some of these, but it was never reviewed.
Reviewed By: jrtc27
Differential Revision: https://reviews.llvm.org/D149278
This enables indexing in `!foreach` and permutation with `list[permlist]`.
Enhancements in syntax:
- `list<int>` is applicable as a slice element.
- `list[int,]` is evaluated as not `ElemType` but `list<ElemType>`
with a single element.
Part of D145872
FIXME: I didn't apply new semantics to BitSlice.
This commit simplifies the text of DW_OP_LLVM_entry_value by making it
terser, replacing a verbose example with a more concrete one, providing
an explicit conclusion on the meaning of N=1, and by transforming the
description of which passes generate this op into a list (which enables
future expansion of this list).
Differential Revision: https://reviews.llvm.org/D149177
There was a bunch of references to bugzilla and other old
instructions in there. I have updated it to match the current
reality.
Reviewed By: tstellar
Differential Revision: https://reviews.llvm.org/D149131
This patch modifies the commands under the Bootstrap builds section so
that they include the -DLLVM_ENABLE_PROJECTS flag with the value "clang"
so that the Clang build will actually get picked up. Without this patch
they error out as the CLANG_BOOTSTRAP_PASSTHROUGH variable is processed
in /clang/CMakeLists.txt which isn't picked up without the
LLVM_ENABLE_PROJECTS variable being set appropriately.
This patch also changes any remaining dangling <path to source>
references to <path to source>/llvm to better match the rest of the
file.
Reviewed By: thieta
Differential Revision: https://reviews.llvm.org/D148451
* Move step 8 to later, after worker credentials have
been added to the buildmaster.
* Added command for starting the worker, in addition
to creating the worker. The latter only sets up the
directories.
* Noted that in step 6, it is expected that you get a
refused connection.
* Stated that the connection should be tried once,
and the worker then stopped.
We could mention that repeated connections with invalid
credentials will result in an IP ban, but it's probably
detail people don't need here.
If it did happen, then you would not know until you tried
the later steps. At which point you are already in contact
with Galina, who is the person who would help you with that
issue in any case.
Reviewed By: gkistanova
Differential Revision: https://reviews.llvm.org/D148913
Update the Zvk support from 0.3.x to 0.5.1, tracking the extension as
documented in
<https://github.com/riscv/riscv-crypto/releases/download/v20230407/riscv-crypto-spec-vector.pdf>.
- Zvkb is split into Zvbb and Zvbc
- Zvbc (vector carryless multiply) requires 64 bit elements (Zve64x)
- Use the extension descriptions from the specification for Zvbb/Zvbc
- Zvkt is introduced (no instructions, but adds an attribute and macro)
- Zvkn and Zvks both imply Zvkt
- Zvkng and Zvksg are introduced, adding Zvkg (GMAC) to Zvkn and Zvks
- In Zvbb, add vrev.v, vclz.v, vctz.v, vcpop.v, vwsll.{vv,vx,vi}
Differential Revision: https://reviews.llvm.org/D148483
Repurpose the DW_OP_LLVM_aspace_implicit_pointer encoding (0xe9) as the
encoding for a new operation DW_OP_LLVM_user which prefixes all other
new operations.
Differential Revision: https://reviews.llvm.org/D147265
2a5661c8415876be3fbd56ce90c2031e89ba0ef3 added a new external link with
the link text "0.2 draft specification". Surprisingly, as multiple links
have this same text but different targets this causes a warning, which
causes a failure on the llvm-sphinx-docs builder (which treats warnings
as errors). As suggested in
<https://github.com/sphinx-doc/sphinx/issues/3921>, this commit moves to
using anonymous references for the links in the experimental extensions
section.
As of
1f03818281
in the riscv-isa-manual, Zfa is at version 0.2. Reviewing the commit
history for
zfa.tex
<https://github.com/riscv/riscv-isa-manual/commits/master/src/zfa.tex>
there are no relevant changes since 0.1. As such, we can simply
increment the version number.
This change also removes the claim in RISCVUsage that we implement a
"subset of" Zfa, as I believe this is no longer true. That sentence
previously incorrectly claimed we didn't implement fli.{h,s,d} (I
[corrected this a couple of weeks
ago](https://reviews.llvm.org/rG3d969191b277)) but I think should have
removed the "subset of" wording too.
As was noted during the review, we never added Zfa to the release notes.
This is corrected in this patch.
Differential Revision: https://reviews.llvm.org/D148634
UniformityAnalysis offers all of the same features and much more, there is no reason left to use the legacy DAs.
See RFC: https://discourse.llvm.org/t/rfc-deprecate-divergenceanalysis-legacydivergenceanalysis/69538
- Remove LegacyDivergenceAnalysis.h/.cpp
- Remove DivergenceAnalysis.h/.cpp + Unit tests
- Remove SyncDependenceAnalysis - it was not a real registered analysis and was only used by DAs
- Remove/adjust references to the passes in the docs where applicable
- Remove TTI hook associated with those passes.
- Move tests to UniformityAnalysis folder.
- Remove RUN lines for the DA, leave only the UA ones.
- Some tests had to be adjusted/removed depending on how they used the legacy DAs.
Reviewed By: foad, sameerds
Differential Revision: https://reviews.llvm.org/D148116
Based on the direction of IR Verifier changes in D146845, this documentation
needs to be updated.
Differential Revision: https://reviews.llvm.org/D148138
Remove C APIs for interacting with PassRegistry and pass
initialization. These are legacy PM concepts, and are no longer
relevant for the new pass manager.
Calls to these initialization functions can simply be dropped.
Differential Revision: https://reviews.llvm.org/D145043
I believe !dereferencable violation is immediate undefined behavior,
but this was not explicitly spelled out in LangRef. We already
assume that !dereferenceable is implicitly !noundef and cannot
return poison in isGuaranteedNotToBeUndefOrPoison().
The reason why we made dereferenceable implicitly noundef is that
the purpose of this metadata is to allow speculation, and that
would not be legal on a potential poison pointer.
Differential Revision: https://reviews.llvm.org/D148202
As discussed in [0], add a `weight` field to temporal profiling traces found in profiles. This allows users to use the `--weighted-input=` flag in the `llvm-profdata merge` command to weight traces from different scenarios differently.
Note that this is a breaking change, but since [1] landed very recently and there is no way to "use" this trace data, there should be no users of this feature. We believe it is acceptable to land this change without bumping the profile format version.
[0] https://reviews.llvm.org/D147812#4259507
[1] https://reviews.llvm.org/D147287
Reviewed By: snehasish
Differential Revision: https://reviews.llvm.org/D148150
This change uses the information from target.xml sent by
the GDB stub to produce C types that we can use to print
register fields.
lldb-server *does not* produce this information yet. This will
only work with GDB stubs that do. gdbserver or qemu
are 2 I know of. Testing is added that uses a mocked lldb-server.
```
(lldb) register read cpsr x0 fpcr fpsr x1
cpsr = 0x60001000
= (N = 0, Z = 1, C = 1, V = 0, TCO = 0, DIT = 0, UAO = 0, PAN = 0, SS = 0, IL = 0, SSBS = 1, BTYPE = 0, D = 0, A = 0, I = 0, F = 0, nRW = 0, EL = 0, SP = 0)
```
Only "register read" will display fields, and only when
we are not printing a register block.
For example, cpsr is a 32 bit register. Using the target's scratch type
system we construct a type:
```
struct __attribute__((__packed__)) cpsr {
uint32_t N : 1;
uint32_t Z : 1;
...
uint32_t EL : 2;
uint32_t SP : 1;
};
```
If this register had unallocated bits in it, those would
have been filled in by RegisterFlags as anonymous fields.
A new option "SetChildPrintingDecider" is added so we
can disable printing those.
Important things about this type:
* It is packed so that sizeof(struct cpsr) == sizeof(the real register).
(this will hold for all flags types we create)
* Each field has the same storage type, which is the same as the type
of the raw register value. This prevents fields being spilt over
into more storage units, as is allowed by most ABIs.
* Each bitfield size matches that of its register field.
* The most significant field is first.
The last point is required because the most significant bit (MSB)
being on the left/top of a print out matches what you'd expect to
see in an architecture manual. In addition, having lldb print a
different field order on big/little endian hosts is not acceptable.
As a consequence, if the target is little endian we have to
reverse the order of the fields in the value. The value of each field
remains the same. For example 0b01 doesn't become 0b10, it just shifts
up or down.
This is needed because clang's type system assumes that for a struct
like the one above, the least significant bit (LSB) will be first
for a little endian target. We need the MSB to be first.
Finally, if lldb's host is a different endian to the target we have
to byte swap the host endian value to match the endian of the target's
typesystem.
| Host Endian | Target Endian | Field Order Swap | Byte Order Swap |
|-------------|---------------|------------------|-----------------|
| Little | Little | Yes | No |
| Big | Little | Yes | Yes |
| Little | Big | No | Yes |
| Big | Big | No | No |
Testing was done as follows:
* Little -> Little
* LE AArch64 native debug.
* Big -> Little
* s390x lldb running under QEMU, connected to LE AArch64 target.
* Little -> Big
* LE AArch64 lldb connected to QEMU's GDB stub, which is running
an s390x program.
* Big -> Big
* s390x lldb running under QEMU, connected to another QEMU's GDB
stub, which is running an s390x program.
As we are not allowed to link core code to plugins directly,
I have added a new plugin RegisterTypeBuilder. There is one implementation
of this, RegisterTypeBuilderClang, which uses TypeSystemClang to build
the CompilerType from the register fields.
Reviewed By: jasonmolenda
Differential Revision: https://reviews.llvm.org/D145580