This patch-set aims to simplify the existing RVV segment load/store
intrinsics to use a type that represents a tuple of vectors instead.
To achieve this, we first need to relax the current limitation so that an
aggregate type can be a target of load/store/alloca when it contains
homogeneous scalable vector types. Then we need to adjust the prolog of
an LLVM function during lowering in clang. Finally, we redefine the RVV
segment load/store intrinsics to use the tuple types.
The pull request under the RVV intrinsic specification is
riscv-non-isa/rvv-intrinsic-doc#198
---
This is the 1st patch of the patch-set. This patch originated from
D98169.
This patch allows aggregate type (StructType) that contains homogeneous
scalable vector types to be a target of load/store/alloca. The RFC of
this patch was posted in LLVM Discourse.
https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527
The main changes in this patch are:
Extend `StructLayout::StructSize` from `uint64_t` to `TypeSize` to
accommodate an expression of scalable size.
Allow `StructType::isSized` to also return true for homogeneous
scalable vector types.
Let `Type::isScalableTy` return true when `Type` is `StructType`
and contains scalable vectors.
The LLVM Language Reference Manual is extended with a description of
this relaxation.
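To illustrate why a plain `uint64_t` cannot carry the size of such a struct, here is a minimal standalone sketch. `SizeModel` and `structSize` are hypothetical stand-ins, not the LLVM `TypeSize` or `StructLayout` classes, and the sketch assumes a scalable size is a known minimum multiplied by an unknown runtime factor (vscale).

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a size that may be multiplied by a runtime factor.
struct SizeModel {
  std::uint64_t MinBytes; // known minimum size in bytes
  bool Scalable;          // true if the real size is MinBytes * vscale
};

// Sums element sizes (padding is ignored in this sketch). Returns nullopt when
// scalable elements are mixed with fixed-size ones or differ from each other,
// since the total is then not a single minimum times vscale.
std::optional<SizeModel> structSize(const std::vector<SizeModel> &Elems) {
  SizeModel Total{0, false};
  for (const SizeModel &E : Elems) {
    const SizeModel &First = Elems.front();
    bool Mismatch = E.Scalable != First.Scalable ||
                    (E.Scalable && E.MinBytes != First.MinBytes);
    if (Mismatch)
      return std::nullopt;
    Total.MinBytes += E.MinBytes;
    Total.Scalable = E.Scalable;
  }
  return Total;
}
```

In this model, a struct of two scalable vectors with a 16-byte minimum has a size of 32 bytes times vscale, which only a size type with a scalable flag can express.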
Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Co-Authored-by: eop Chen <eop.chen@sifive.com>
Reviewed By: craig.topper, nikic
Differential Revision: https://reviews.llvm.org/D146872
This patch adds a new method setNoSanitizeMetadata() for Instruction, and uses it in SanitizerMetadata and SanitizerCoverage.
Reviewed By: nickdesaulniers, MaskRay
Differential Revision: https://reviews.llvm.org/D150632
This commit removes constness from DILocation::getMergedLocation and
fixes all its users accordingly.
Having constness on the parameters forced the return type to be const
as well, which in turn forced the use of `const_cast` when the location
needed to be used in metadata nodes.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D149942
A follow up patch will make the CoroSplit pass introduce such operations in the
IR level when it is safe to do so.
Depends on D149748
Differential Revision: https://reviews.llvm.org/D149778
This change enables loading pseudo-probe based profile on MIR. Unlike the IR profile loader, callsites are excluded from MIR profile loading since they are not assigned a FS discriminator. Using zero as the discriminator is not accurate and would undo the distribution work done by the IR loader based on pseudo probe distribution factor. We rely on block probes only for FS profile loading.
Some refactoring is done to the IR profile loader so that `getProbeWeight` can be shared by both loaders.
Reviewed By: wenlei
Differential Revision: https://reviews.llvm.org/D148584
Annotation metadata supports adding singular annotation strings to an annotation block. This patch adds the ability to insert a tuple of strings into the metadata array.
The idea here is that each tuple of strings represents pieces of information that are all related. This makes related metadata easier to parse, since it is contained in one tuple.
For example, in remarks, any pass that implements annotation remarks can have different types of remarks and pass additional information for each.
The original behaviour of annotation remarks is preserved here and we can mix tuple annotations and single annotations for the same instruction.
Reviewed By: paquette
Differential Revision: https://reviews.llvm.org/D148328
Add "Hot" AllocationType (in addition to existing cold, notcold).
Use lifetime access density as metric to identify hot allocations.
Treat hot as notcold for MemProfContextDisambiguation for now
before the disambiguation for "hot" is done.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D149932
Adds an LTO option to indicate whether we are linking with an
allocator that supports hot/cold operator new interfaces. If not,
at the start of the LTO backends any existing memprof hot/cold
attributes are removed from the IR, and we also remove memprof metadata
so that post-LTO inlining doesn't add any new attributes.
This is done via setting a new flag in the module summary index. It is
important to communicate via the index to the LTO backends so that
distributed ThinLTO handles this correctly, as they are invoked by
separate clang processes and the combined index is how we communicate
information from the LTO link. Specifically, for distributed ThinLTO the
LTO related processes look like:
```
# Thin link:
$ lld --thinlto-index-only obj1.o ... objN.o -llib ...
# ThinLTO backends:
$ clang -x ir obj1.o -fthinlto-index=obj1.o.thinlto.bc -c -O2
...
$ clang -x ir objN.o -fthinlto-index=objN.o.thinlto.bc -c -O2
```
It is during the thin link (lld --thinlto-index-only) that we have
visibility into linker dependences and want to be able to pass the new
option via -Wl,-supports-hot-cold-new. This will be recorded in the
summary indexes created for the distributed backend processes
(*.thinlto.bc) and queried from there, so that we don't need to know
during those individual clang backends what allocation library was
linked. Since in-process ThinLTO and regular LTO also use a combined
index, for consistency we query the flag out of the index in all LTO
backends.
Additionally, when the LTO option is disabled, exit early from the
MemProfContextDisambiguation handling performed during LTO, as this is
unnecessary.
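The backend-side behaviour described above can be sketched as follows. This is a simplified standalone model, not the LLVM implementation: `IndexModel`, `AllocCallModel` and `stripUnsupportedMemprof` are hypothetical names introduced only for illustration.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical model of the flag recorded in the combined summary index.
struct IndexModel {
  bool SupportsHotColdNew = false;
};

// Hypothetical model of an allocation call in a backend's IR.
struct AllocCallModel {
  std::optional<std::string> MemprofAttr; // e.g. "cold" or "notcold"
  bool HasMemprofMetadata = false;
};

// Run at the start of a backend: when the linked allocator lacks hot/cold
// operator new, drop existing hints and the metadata that could later
// produce new ones during post-LTO inlining.
void stripUnsupportedMemprof(const IndexModel &Index,
                             std::vector<AllocCallModel> &Calls) {
  if (Index.SupportsHotColdNew)
    return;
  for (AllocCallModel &Call : Calls) {
    Call.MemprofAttr.reset();
    Call.HasMemprofMetadata = false;
  }
}
```

Because the decision is read from the index rather than from a command-line flag of the backend process, the same code path serves distributed ThinLTO, in-process ThinLTO and regular LTO.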
Depends on D149117 and D149192.
Differential Revision: https://reviews.llvm.org/D149215
Do not convert dbg.declares to dbg.assigns for variables backed by scalable
vector allocas as this isn't yet supported.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D149959
Re-land D145441 with data layout upgrade code fixed to not break OpenMP.
This reverts commit 3f2fbe92d0f40bcb46db7636db9ec3f7e7899b27.
Differential Revision: https://reviews.llvm.org/D149776
Per discussion at
https://discourse.llvm.org/t/representing-buffer-descriptors-in-the-amdgpu-target-call-for-suggestions/68798,
we define two new address spaces for AMDGCN targets.
The first is address space 7, a non-integral address space (which was
already in the data layout) that has 160-bit pointers (which are
256-bit aligned) and uses a 32-bit offset. These pointers combine a
128-bit buffer descriptor and a 32-bit offset, and will be usable with
normal LLVM operations (load, store, GEP). However, they will be
rewritten out of existence before code generation.
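As a rough mental model of an address space 7 pointer, the following standalone sketch packs a descriptor and an offset into one value. `BufferFatPtrModel` and `advance` are hypothetical names, and the assumption that offset arithmetic wraps modulo 2^32 is mine, inferred from the 32-bit offset width.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of an address space 7 pointer: descriptor plus offset.
struct BufferFatPtrModel {
  std::array<std::uint32_t, 4> Descriptor; // 128-bit buffer descriptor
  std::uint32_t Offset;                    // 32-bit offset into the buffer
};

constexpr unsigned PointerBits = 128 + 32; // 160-bit pointer value

// GEP-like arithmetic touches only the offset; the descriptor is unchanged.
// Assumption: the offset wraps modulo 2^32.
BufferFatPtrModel advance(BufferFatPtrModel P, std::int32_t DeltaBytes) {
  P.Offset += static_cast<std::uint32_t>(DeltaBytes);
  return P;
}
```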
The second of these is address space 8, the address space for "buffer
resources". These will be used to represent the resource arguments to
buffer instructions, and new buffer intrinsics will be defined that
take them instead of <4 x i32> as resource arguments. These ptr
addrspace(8) pointers are 128 bits long (with the same
alignment). They must not be used as the arguments to getelementptr or
otherwise used in address computations, since they can have
arbitrarily complex inherent addressing semantics that can't be
represented in LLVM. Even though, like their address space 7 cousins,
these pointers have deterministic ptrtoint/inttoptr semantics, they
are defined to be non-integral in order to prevent optimizations that
rely on pointers being a [0, addr_max] value from applying to them.
Future work includes:
- Defining new buffer intrinsics that take ptr addrspace(8) resources.
- A late rewrite to turn address space 7 operations into buffer
intrinsics and offset computations.
This commit also updates the "fallback address space" for buffer
intrinsics to the buffer resource, and updates the alias analysis
table.
Depends on D143437
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D145441
This reverts commit c117c2c8ba4afd45a006043ec6dd858652b2ffcc.
itaniumDemangle calls std::strlen with the results of
std::string_view::data() which may not be NUL-terminated. This causes
lld/test/wasm/why-extract.s to fail when "expensive checks" are enabled
via -DLLVM_ENABLE_EXPENSIVE_CHECKS=ON. See D149675 for further
discussion. Back this out until the individual demanglers are converted
to use std::string_view.
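The hazard can be reproduced outside of LLVM with a small standalone sketch. `unsafeLength` and `safeLength` are hypothetical helpers, not functions from the demangler.

```cpp
#include <cstddef>
#include <cstring>
#include <string_view>

// Only meaningful if the view's buffer happens to be NUL-terminated, and even
// then it measures up to the terminator rather than the view's own end.
std::size_t unsafeLength(std::string_view Name) {
  return std::strlen(Name.data());
}

// Uses the length carried by the view itself.
std::size_t safeLength(std::string_view Name) { return Name.size(); }
```

For a view covering only the first 7 characters of the literal `"_Z3foov.extra"`, `safeLength` returns 7 while `unsafeLength` reads on to the literal's terminator and returns 13. With a buffer that has no terminator at all, the `strlen` call reads out of bounds.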
Removes the 'notcoldandcold' allocation type summary
(de)serialization support added in D135714, after realizing that this
will never be generated in practice.
There are 2 uses of the allocation type keywords in the summary. One is
for the individual profiled memprof context summaries, and each context
can only be assigned a single type of hotness. The second is in the
clone version information produced by the MemProfContextDisambiguation
whole program step, and we only create a clone for a specific allocation
type.
Differential Revision: https://reviews.llvm.org/D149669
As suggested by @erichkeane in
https://reviews.llvm.org/D141451#inline-1429549
There's potential for a lot more cleanups around these APIs. This is
just a start.
Callers need to be more careful about sub-expressions producing strings
that don't outlast the expression using ``llvm::demangle``. Add a
release note.
Reviewed By: MaskRay, #lld-macho
Differential Revision: https://reviews.llvm.org/D149104
This is stricter than the default "ieee", and should probably be the
default. This patch leaves the default alone. I can change this in a
future patch.
There are non-reversible transforms I would like to perform which are
legal under IEEE denormal handling, but illegal with flush-to-zero
behavior. Namely, conversions between llvm.is.fpclass and fcmp with
zeroes.
Under "ieee" handling, it is legal to translate between
llvm.is.fpclass(x, fcZero) and fcmp x, 0.
Under "preserve-sign" handling, it is legal to translate between
llvm.is.fpclass(x, fcSubnormal|fcZero) and fcmp x, 0.
I would like to compile and distribute some math library functions in
a mode where it's callable from code with and without denormals
enabled, which requires not changing the compares with denormals or
zeroes.
If an IEEE function transforms an llvm.is.fpclass call into an fcmp 0,
it is no longer possible to call the function from code with denormals
enabled, or write an optimization to move the function into a denormal
flushing mode. For the original function, if x was a denormal, the
class would evaluate to false. If the function compiled with denormal
handling was converted to or called from a preserve-sign function, the
fcmp now evaluates to true.
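The mismatch can be demonstrated with a small standalone sketch. The helper names are hypothetical, and `flushDenormal` only simulates a preserve-sign input flush in software; it is not how hardware or LLVM implements denormal modes.

```cpp
#include <cmath>

// Simulates preserve-sign flushing of a denormal input to a signed zero.
double flushDenormal(double X) {
  return std::fpclassify(X) == FP_SUBNORMAL ? std::copysign(0.0, X) : X;
}

// Analogous to llvm.is.fpclass(x, fcZero): a class test on the value itself.
bool isZeroClass(double X) { return std::fpclassify(X) == FP_ZERO; }

// Analogous to fcmp oeq x, 0 under IEEE denormal handling.
bool cmpEqZeroIEEE(double X) { return X == 0.0; }

// Analogous to fcmp oeq x, 0 when denormal inputs are flushed first.
bool cmpEqZeroFlushed(double X) { return flushDenormal(X) == 0.0; }
```

For a denormal input, `isZeroClass` and `cmpEqZeroIEEE` agree (both false), but `cmpEqZeroFlushed` is true, which is why the class test cannot be rewritten into a compare when the denormal mode is not known.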
This could also be of use for strictfp handling, where code may be
changing the denormal mode.
An alternative name could be "unknown".
Replaces the old AMDGPU custom inlining logic with more conservative
logic which tries to permit inlining for callees with dynamic handling
and avoids inlining other mismatched modes.
Without this patch, in `getAssignmentInfo` the result of `getTypeSizeInBits` is
cast to `uint64_t`, which a) is an operation that will eventually be
unsupported by the API according to the comments, and b) causes an assertion
failure if the type is a scalable vector. Don't cast the `TypeSize` to
`uint64_t` and check `isScalable` before getting the fixed size.
This can result in incorrect variable locations, see llvm.org/PR62346 (but is
better than crashing).
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D149137
This makes the logic for referenced globals reusable for import criteria
that don't use thresholds - in fact, we currently don't consider any
thresholds when importing.
Differential Revision: https://reviews.llvm.org/D149298
Following the change in shufflevector semantics,
poison will be used to represent undefined elements in shufflevector masks.
Differential Revision: https://reviews.llvm.org/D149256
With this patch an undefined mask in a shufflevector will be printed as poison.
This change is done to support the new shufflevector semantics
for undefined mask elements.
Differential Revision: https://reviews.llvm.org/D149210
This improves the readability of debugging intrinsics. Instead of:
call void @llvm.dbg.value(metadata !2, ...)
!2 = !{}
We will see:
call void @llvm.dbg.value(metadata !{}, ...)
!2 = !{}
Note that we still get a numbered metadata entry for the node even if it's not
used elsewhere. This is to avoid adding more context to the print functions.
This is already legal IR - LLVM can parse and understand it - so there is no
need to update the parser.
The next patches in this stack will make such empty metadata operands more
common and semantically important.
Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value
Reviewed By: StephenTozer
Differential Revision: https://reviews.llvm.org/D140900
- Define `IIT_Info` in `Intrinsics.td`
- Implement `EmitIITInfo` in `IntrinsicEmitter.cpp`
- Use generated `IIT_Info` in `Function.cpp`
Depends on D145873 and D146179
Differential Revision: https://reviews.llvm.org/D146914
The out-param vector from findDbgValues and findDbgUsers should not include
duplicates, which is possible if the debug intrinsic uses the value multiple
times. This filter is already in place for multiple uses in a `DIArgList`;
extend it to cover dbg.assigns too because a Value may be used in both the
address and value components.
Additionally, refactor the duplicated functionality between findDbgValues and
findDbgUsers into a new function findDbgIntrinsics.
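The de-duplication can be sketched in isolation as follows. `UserModel` and `findUsers` are hypothetical; values and users are modelled as plain integer ids rather than LLVM `Value` and intrinsic objects.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical model of a debug intrinsic: its id and the ids of the values
// it uses. A dbg.assign may list the same value as both address and value.
struct UserModel {
  int Id;
  std::vector<int> Operands;
};

// Collects each user of Value once, in first-seen order, even if the user
// refers to Value through several operands.
std::vector<int> findUsers(const std::vector<UserModel> &Users, int Value) {
  std::vector<int> Result;
  std::unordered_set<int> Seen;
  for (const UserModel &U : Users)
    for (int Op : U.Operands)
      if (Op == Value && Seen.insert(U.Id).second)
        Result.push_back(U.Id);
  return Result;
}
```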
Reviewed By: jmorse, StephenTozer
Differential Revision: https://reviews.llvm.org/D148788
This exposed another miscompile in GVN, which was fixed by
20e9b31f88149a1d5ef78c0be50051e345098e41.
-----
After D141386, violation of nonnull, range and align metadata
results in poison rather than immediate undefined behavior,
which means that these are now safe to retain when speculating.
We only need to remove UB-implying metadata like noundef.
This is done by adding a dropUBImplyingAttrsAndMetadata() helper,
which lists the metadata which is known safe to retain on speculation.
Differential Revision: https://reviews.llvm.org/D146629
`shortenAssignment` inserts dbg.assigns with fragments describing the dead part
of a shortened store after each dbg.assign linked to the store.
Without this patch it doesn't take into account that the dead part of a
shortened store may be outside the bounds of a variable of a linked
dbg.assign. It also doesn't correctly account for a non-zero offset in the
address modifying `DIExpression` of the dbg.assign (which is possible for
fragments now even though whole variables currently cannot have a non-zero
offset in their alloca).
Fix this by moving the dead slice into variable-space and performing an
intersect of that adjusted slice with the existing fragment.
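A minimal sketch of that computation, under my own simplifying assumptions: all ranges are half-open bit ranges, and `FragMemOffset` is the memory offset (from the store's base) at which the fragment's first bit lives. `BitRange` and `deadFragment` are hypothetical names, not the ones used in the patch.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>

// Half-open range of bits [Start, End).
struct BitRange {
  std::int64_t Start;
  std::int64_t End;
};

// Translates the dead slice from memory-space into variable-space and clips it
// to the existing fragment. Returns nullopt if no part of the fragment is dead.
std::optional<BitRange> deadFragment(BitRange DeadInMemory,
                                     std::int64_t FragMemOffset,
                                     BitRange Fragment) {
  std::int64_t Shift = Fragment.Start - FragMemOffset;
  BitRange DeadInVar{DeadInMemory.Start + Shift, DeadInMemory.End + Shift};
  std::int64_t Start = std::max(DeadInVar.Start, Fragment.Start);
  std::int64_t End = std::min(DeadInVar.End, Fragment.End);
  if (Start >= End)
    return std::nullopt;
  return BitRange{Start, End};
}
```

Signed arithmetic is used so that a dead slice lying before the fragment yields an empty intersection instead of wrapping around.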
This fixes a verifier error reported when building fuchsia with assignment
tracking enabled:
https://ci.chromium.org/ui/p/fuchsia/builders/ci/clang_toolchain.ci.core.x64-release/b8784000953022145169/overview
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D148536
Debug intrinsics sometimes end up with empty metadata location operands. The
debug intrinsic interfaces return nullptr when retrieving the location operand in
this case.
Skip empty-metadata dbg.declares to avoid dereferencing the nullptr. This
doesn't affect the final debug info in any way.
Reviewed By: jryans
Differential Revision: https://reviews.llvm.org/D148204
This patch replaces uses of the PointerUnion.is function with llvm::isa,
the PointerUnion.get function with llvm::cast, and PointerUnion.dyn_cast with
llvm::dyn_cast_if_present. This follows the FIXME in
the definition of the class PointerUnion.
This patch does not remove the member functions themselves, as they are
still used in other subprojects.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D148449
This exposed a miscompile in GVN, which was fixed by D148129.
-----
After D141386, violation of nonnull, range and align metadata
results in poison rather than immediate undefined behavior,
which means that these are now safe to retain when speculating.
We only need to remove UB-implying metadata like noundef.
This is done by adding a dropUBImplyingAttrsAndMetadata() helper,
which lists the metadata which is known safe to retain on speculation.
Differential Revision: https://reviews.llvm.org/D146629
Remove C APIs for interacting with PassRegistry and pass
initialization. These are legacy PM concepts, and are no longer
relevant for the new pass manager.
Calls to these initialization functions can simply be dropped.
Differential Revision: https://reviews.llvm.org/D145043
Currently, FunctionAttrs treats landingpads as non-throwing, and
will infer nounwind for functions with landingpads (assuming they
can't unwind in some other way, e.g. via resume). There are two
problems with this:
* Non-cleanup landingpads with catch/filter clauses do not
necessarily catch all exceptions. Unless there are catch ptr null
or filter [0 x ptr] zeroinitializer clauses, we should assume
that we may unwind past this landingpad. This seems like an
outright bug.
* Cleanup landingpads are skipped during phase one unwinding, so
we effectively need to support unwinding past them. Marking these
nounwind is technically correct, but not compatible with how
unwinding works in reality.
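The two cases above can be summarised in a small standalone predicate. `LandingPadModel` and `mayUnwindPast` are hypothetical names for illustration and do not mirror the exact helper added by the patch.

```cpp
// Hypothetical summary of a landingpad's clauses.
struct LandingPadModel {
  bool IsCleanup;   // has the cleanup flag
  bool HasCatchAll; // has catch ptr null or filter [0 x ptr] zeroinitializer
};

// Returns true if an exception may continue unwinding past this landingpad,
// in which case nounwind must not be inferred for the enclosing function.
bool mayUnwindPast(const LandingPadModel &LP) {
  // Cleanup landingpads are skipped during phase one unwinding.
  if (LP.IsCleanup)
    return true;
  // Without a catch-all clause, only a subset of exceptions is caught.
  return !LP.HasCatchAll;
}
```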
Fixes https://github.com/llvm/llvm-project/issues/61945.
Differential Revision: https://reviews.llvm.org/D147694
Repeatedly calling getName adds some overhead, which can be easily
avoided by querying the name just once per function. The improvements
are rather small (~0.5% back-end time in a compile-time optimized
setting), but also very easy to achieve.
Note that getting the name should be entirely avoidable in the common
case, but would require more substantial changes.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D148145
Even when time tracing is disabled, getPassName is currently still
called. This adds an avoidable virtual function call for each pass.
Fetching the pass name only when required slightly improves
compile-time (particularly when LLVM is built without LTO).
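The idea can be sketched with a standalone model. `PassModel` and `runPass` are hypothetical and stand in for the pass manager machinery; the query counter exists only to make the saved call observable.

```cpp
#include <string>
#include <vector>

// Hypothetical pass that counts how often its name is queried.
struct PassModel {
  int NameQueries = 0;
  std::string getPassName() {
    ++NameQueries;
    return "example-pass";
  }
};

// Records a trace entry only when tracing is enabled, so the name lookup is
// skipped entirely in the common untraced case.
void runPass(PassModel &P, bool TimeTraceEnabled,
             std::vector<std::string> &Trace) {
  if (TimeTraceEnabled)
    Trace.push_back(P.getPassName());
  // ... the pass body would run here ...
}
```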
Reviewed By: nikic, MaskRay
Differential Revision: https://reviews.llvm.org/D148022