[llvm-debuginfo-analyzer] Fix crash due to an unchecked error in the LVReaderHandler::handleArchive method.
- Added a README describing how to generate the binary files used for the test.
- A follow-up patch will add an extra ASSERT_NE.
Committed on behalf of @aurelien35
This adds support for the loongarch64 architecture to the offload host
plugin.
Similar to #115773
To fix some test issues, I've had to add the LoongArch64 target to:
- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)
Reviewed By: jhuber6
Pull Request: https://github.com/llvm/llvm-project/pull/120173
This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to
check that it is safe to fold a store into a node that will expand to a
library call that takes output pointers. This requires checking for two
(independent) properties:
1. The store is not within a CALLSEQ_START..CALLSEQ_END pair
* If it is, the expansion would lead to nested call sequences (which is
invalid)
2. The node does not appear as a predecessor to the store
* If it does, attempting to merge the store into the call would result
in a cycle in the DAG
These two properties are checked as part of the same traversal in
`canFoldStoreIntoLibCallOutputPointers()`.
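A minimal sketch of the traversal's shape, with simplified CALLSEQ handling (the real pairing logic must match CALLSEQ_START/CALLSEQ_END nodes); this is illustrative, not the actual implementation:
```
// Walk the store's predecessor DAG once, checking both properties.
static bool canFoldStoreIntoLibCallOutputPointers(StoreSDNode *Store,
                                                  SDNode *Node) {
  SmallPtrSet<const SDNode *, 16> Visited;
  SmallVector<const SDNode *, 8> Worklist;
  // Skip the direct use of Node by the store (that's the use we want
  // to fold); walk everything else.
  for (const SDUse &Op : Store->ops())
    if (Op.getNode() != Node)
      Worklist.push_back(Op.getNode());
  while (!Worklist.empty()) {
    const SDNode *N = Worklist.pop_back_val();
    if (!Visited.insert(N).second)
      continue;
    // Property 2: the node being expanded must not otherwise be a
    // predecessor of the store, or folding would create a DAG cycle.
    if (N == Node)
      return false;
    // Property 1 (simplified): an enclosing call sequence above the
    // store would lead to nested call sequences after expansion.
    if (N->getOpcode() == ISD::CALLSEQ_START)
      return false;
    for (const SDUse &Op : N->ops())
      Worklist.push_back(Op.getNode());
  }
  return true;
}
```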
Do not remove S_CBRANCH_EXECZ if one of the following blocks contains an
unconditional branch to a block other than the one immediately following
it. This can cause unwanted behavior like infinite loops.
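A minimal sketch of the condition being checked, with illustrative helper structure (not the actual patch):
```
// Keep S_CBRANCH_EXECZ if a block it guards ends in an unconditional
// branch that does not simply fall through to the next block in
// layout; removing the EXECZ branch could then form an infinite loop.
static bool endsInNonFallthroughBranch(const MachineBasicBlock &MBB) {
  if (MBB.empty() || !MBB.back().isUnconditionalBranch())
    return false;
  return MBB.back().getOperand(0).getMBB() != MBB.getNextNode();
}
```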
Clean up `populateVectorToLLVMConversionPatterns` so that it populates
only conversion patterns. All rewrite patterns that do not lower to LLVM
should be populated into a separate greedy pattern rewrite.
The current combination of rewrite patterns and conversion patterns
triggered an edge case when merging the 1:1 and 1:N dialect conversions.
Depends on #119973.
The mask materialization patterns during `VectorToLLVM` are rewrite
patterns. They should run as part of the greedy pattern rewrite and not
the dialect conversion. (Rewrite patterns and conversion patterns are
not generally compatible.)
The current combination of rewrite patterns and conversion patterns
triggered an edge case when merging the 1:1 and 1:N dialect conversions.
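A hedged sketch of the resulting pass structure, assuming a pass with `module` and `ctx` in scope; the population entry points shown are the existing MLIR helpers, but exact signatures may differ:
```
// Run mask materialization as a greedy rewrite first...
RewritePatternSet rewritePatterns(ctx);
populateVectorMaskMaterializationPatterns(rewritePatterns,
                                          /*force32BitVectorIndices=*/false);
(void)applyPatternsAndFoldGreedily(module, std::move(rewritePatterns));

// ...then run the dialect conversion with conversion patterns only.
LLVMTypeConverter typeConverter(ctx);
RewritePatternSet conversionPatterns(ctx);
populateVectorToLLVMConversionPatterns(typeConverter, conversionPatterns);
LLVMConversionTarget target(*ctx);
if (failed(applyPartialConversion(module, target,
                                  std::move(conversionPatterns))))
  signalPassFailure();
```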
When two 16-bit values are combined into a v2x16 vector, and those
values are truncated from 32-bit values, a PRMT instruction can
save registers by selecting bytes directly from the original 32-bit
values. We do this during a post-legalize DAG combine, as these
opportunities are typically only exposed after the BUILD_VECTOR's
operands have been legalized.
Additionally, if the 32-bit values are right-shifted, we can fold in the
shift by selecting higher bytes with PRMT. Only logical right-shifts by
16 are supported (for now) since those are the only situations seen in
practice. Right shifts by 16 often come up during the legalization of
EXTRACT_VECTOR_ELT.
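For intuition, a host-side model of the byte selection PRMT performs; the selector constants are my worked examples, not taken from the patch:
```
#include <cstdint>
// PRMT's default mode (simplified): bytes 0-3 of the pool come from
// `a`, bytes 4-7 from `b`; each selector nibble, low to high, picks
// one byte of the result. The sign-replication bit is ignored here.
static uint32_t prmt(uint32_t a, uint32_t b, uint32_t sel) {
  uint64_t pool = ((uint64_t)b << 32) | a;
  uint32_t r = 0;
  for (int i = 0; i < 4; ++i) {
    unsigned idx = (sel >> (4 * i)) & 0x7;
    r |= (uint32_t)((pool >> (8 * idx)) & 0xff) << (8 * i);
  }
  return r;
}
// Packing the low halves of a and b:    prmt(a, b, 0x5410)
// Folding (a >> 16, b >> 16) as well:   prmt(a, b, 0x7632),
// i.e. select the high bytes instead of shifting first.
```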
This idea was brought up in a PR comment by @Artem-B.
If x is NaN, then fmul (x, 1) may produce a different NaN value.
Our float semantics explicitly permit folding fmul (x, 1) to x, but we
can't do this when we're replacing a select input, as selects are
supposed to preserve the exact bitwise value.
Fixes
https://github.com/llvm/llvm-project/pull/115152#issuecomment-2545773114.
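A small standalone demonstration of the hazard (illustrative; the exact result bits depend on the host's handling of signaling NaNs):
```
#include <cstdint>
#include <cstdio>
#include <cstring>
int main() {
  uint32_t snanBits = 0x7fa00000; // a signaling-NaN bit pattern
  float x;
  std::memcpy(&x, &snanBits, sizeof x);
  volatile float one = 1.0f;      // block compile-time folding
  float y = x * one;              // typically quiets the NaN
  uint32_t yBits;
  std::memcpy(&yBits, &y, sizeof yBits);
  std::printf("%08x -> %08x\n", snanBits, yBits); // bits may differ
}
```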
Summary:
Previously, we'd add all SPs distinct from the cloned one into a set.
Then when cloning a local scope we'd check if it's from one of those
'distinct' SPs by checking if it's in the set. We don't need to do that.
We can just check against the cloned SP directly and drop the set.
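In sketch form (names illustrative, not the actual code):
```
// Deciding whether a local scope belongs to the subprogram being
// cloned no longer needs the auxiliary set of 'distinct' SPs.
static bool isLocalToClonedSP(const DILocalScope *Scope,
                              const DISubprogram *ClonedSP) {
  // Before (sketch): !DistinctSPs.contains(Scope->getSubprogram())
  return Scope->getSubprogram() == ClonedSP;
}
```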
Test Plan:
ninja check-llvm-unit check-llvm
This makes sure no optimizations are applied that assume the
bigger alignment or size, which could be incorrect if we link
together with non-instrumented code.
Lowering to load-acquire/store-release for RISCV Zalasr.
This currently uses the psABI lowerings for WMO load-acquire/store-release
(which are identical to A.7). These are incompatible with the A.6
lowerings currently used by LLVM. This should be OK for now, since Zalasr
is behind the experimental-extensions flag, but it needs to be fixed
before Zalasr moves out from behind that flag.
For TSO, this uses the standard Ztso mappings, except that seq_cst
loads/stores are lowered to load-acquire/store-release; I had Andrea
review that.
```
% echo 90 | llvm-mc -triple=x86_64 --disassemble --hex
.text
nop
```
The initial `.text` kludge is due to `initSection`, which is actually only
needed by AIX XCOFF for its `getCurrentSectionOnly()` use in
MCAsmStreamer::emitInstruction (https://reviews.llvm.org/D95518). Adjust
MCAsmStreamer::emitInstruction to not trigger failures on
```
echo 7c4303a6 | llvm-mc --cdis --hex --triple=powerpc-aix-ibm-xcoff
```
Pull Request: https://github.com/llvm/llvm-project/pull/120185
Depends on #113811
Support `R_AARCH64_AUTH_ADR_GOT_PAGE`, `R_AARCH64_AUTH_GOT_LO12_NC` and
`R_AARCH64_AUTH_GOT_ADD_LO12_NC` GOT-generating relocations. For preemptible
symbols, dynamic relocation `R_AARCH64_AUTH_GLOB_DAT` is emitted. Otherwise,
we unconditionally emit an `R_AARCH64_AUTH_RELATIVE` dynamic relocation, since
pointers in the signed GOT need to be signed at dynamic link time.
This fixes a bug where report links generated from files such as
StylePrimitiveNumericTypes+Conversions.h in WebKit result in an error.
Co-authored-by: Brianna Fan <bfan2@apple.com>
This should actually fix the problem as I validated that github.sha returns an
actual value by running a workflow in a test repo. I'm not sure why the
existing value doesn't work, but it returns nothing.
Introduces a new layer interface, LinkGraphLayer, that can be used to
add LinkGraphs to an ExecutionSession.
This patch moves most of ObjectLinkingLayer's functionality into a new
LinkGraphLinkingLayer, which should (in the future) be usable without
linking libObject. ObjectLinkingLayer now inherits from
LinkGraphLinkingLayer and just handles conversion of object files to
LinkGraphs, which are then handed down to LinkGraphLinkingLayer to be
linked.
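A hedged usage sketch of the resulting split: `createLinkGraphFromObject` is JITLink's existing entry point, while the layer and variable names here are illustrative and exact signatures may differ:
```
// ObjectLinkingLayer's remaining job, in outline: build a LinkGraph
// from the object file, then hand it to the LinkGraph-based layer.
auto G = jitlink::createLinkGraphFromObject(ObjBuffer->getMemBufferRef());
if (!G)
  return G.takeError();
return LGLayer.add(JD, std::move(*G));
```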
AMDGPU: Delete spills of undef values
It would be a bit more logical to preserve the undef and do the normal
expansion, but this is less work. This avoids verifier errors in a
future patch, which starts deleting liveness from registers after
allocation failures, resulting in spills of undef values.
https://reviews.llvm.org/D122607
Move where undef sgpr spills are deleted
This reverts commit f7443905af1e06eaacda1e437fff8d54dc89c487.
This is to avoid an assertion if an undef operand appears in a
stackmap. This is important to avoid hitting verifier errors
when register allocation starts adding undefs in error scenarios.
Rather than trying to treat undef operands as special, leave them
alone and avoid producing an invalid spill. It would be a bit more
precise to produce a spill of an undef register here, but that's not
exposed through the storeRegToStackSlot API.
https://reviews.llvm.org/D122605
This was an alternative to https://reviews.llvm.org/D122582
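A sketch of the shared idea across these patches, with illustrative names (not the actual code):
```
// Skip spill insertion entirely when the operand being spilled is
// undef: there is nothing meaningful to store.
if (MO.isUndef())
  return;
TII->storeRegToStackSlot(MBB, InsertPt, MO.getReg(), MO.isKill(),
                         FrameIndex, RC, TRI, Register());
```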
Identical Code Folding (ICF) folds functions that are identical into one
function and updates symbol addresses to the new address. This reduces
the size of a binary, but can lead to problems, for example when
function pointers are compared. Such comparisons can occur either
explicitly in the code or in IR generated by optimization passes like
Indirect Call Promotion (ICP). After ICF, what used to be two different
addresses become the same address, which can lead to a different code
path being taken.
This is where safe ICF comes in. The linker (LLD) implements it using
the address-significance section generated by Clang: if a symbol is
listed there, or an object file lacks the section entirely, its symbols
are not folded.
BOLT does not have the information regarding which objects lack this
section, so it can't reuse this mechanism.
This implementation scans the code sections and conservatively marks
function symbols as unsafe: a symbol is treated as unsafe if it is used
in a non-control-flow instruction. It also scans the data relocation
sections and does the same for relocations that reference a function
symbol; this handles the case where a function pointer is stored in a
local or global variable, etc. If a relocation's address points within
a vtable, those symbols are skipped.
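In outline, with hypothetical helper names (a sketch of the scan, not BOLT's actual code):
```
// Pass 1: any function symbol referenced by a non-control-flow
// instruction has its address taken, so it is unsafe to fold.
for (const BinaryFunction &BF : binaryFunctions())
  for (const MCInst &Inst : instructions(BF))
    if (const MCSymbol *Target = getReferencedFunctionSymbol(Inst))
      if (!isControlFlowInstruction(Inst))
        markUnsafeICF(Target);

// Pass 2: data relocations that reference a function symbol (e.g. a
// function pointer stored in a global), except those inside vtables.
for (const Relocation &Rel : dataRelocations())
  if (const MCSymbol *Target = getFunctionSymbolAt(Rel))
    if (!isInVTable(Rel))
      markUnsafeICF(Target);
```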
We already support vector types, and since matrix element types have to
be scalar types, there should be no problem with just enabling this.
This now also allows matrix types to be stored in STL containers.
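For example (illustrative; matrix types require Clang's `-fenable-matrix`):
```
#include <vector>
// A Clang matrix type used as the element type of an STL container.
typedef float m4x4 __attribute__((matrix_type(4, 4)));
int main() {
  std::vector<m4x4> mats(8); // now accepted
  (void)mats;
}
```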
`--disassemble`/`--cdis` parses input bytes as decimal, `0b` binary, `0o`
octal, or `0x` hexadecimal. While the hexadecimal form is most commonly
used, requiring a 0x prefix for each byte (`0x48 0x29 0xc3`) is
cumbersome.
Tools like xxd -p and rz-asm use a plain hex dump form without the 0x
prefix or space separator. This patch adds --hex to disassemble such hex
bytes with optional whitespace.
```
% rz-asm -a x86 -b 64 -d 4829c34829c4
sub rbx, rax
sub rsp, rax
% llvm-mc -triple=x86_64 --cdis --hex --output-asm-variant=1 <<< 4829c34829c4
.text
sub rbx, rax
sub rsp, rax
```
Pull Request: https://github.com/llvm/llvm-project/pull/119992
CodeGen allocates memory for a new descriptor on descriptor loads.
CUDA Fortran local descriptors are allocated in managed memory by the
runtime, so the newly allocated storage for a CUDA descriptor must also
be allocated through the runtime.
VPInstruction has its own definition of mayWriteToMemory, which seems
to be used only by VPlanSLP. However, VPInstructions are already handled
in VPRecipeBase::mayWriteToMemory, and everywhere else uses that
definition. I think these should be the same for all intents and
purposes. The VPRecipeBase definition is more conservative, but returns
true for stores/calls/invokes/SLPStores.
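A simplified sketch of the consolidated query (recipe cases other than VPInstruction are elided; the opcode set is taken from the description above):
```
// Route the memory-write query through the recipe-level logic rather
// than VPInstruction's own definition.
bool mayWriteToMemory(const VPRecipeBase &R) {
  if (auto *VPI = dyn_cast<VPInstruction>(&R))
    return VPI->getOpcode() == Instruction::Store ||
           VPI->getOpcode() == Instruction::Call ||
           VPI->getOpcode() == Instruction::Invoke ||
           VPI->getOpcode() == VPInstruction::SLPStore;
  return true; // conservative default (simplified)
}
```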