521792 Commits

Author SHA1 Message Date
SpencerAbson
908e30658d
[AArch64] Implement intrinsics for FP8 SME FMLAL/FMLALL (multi) (#119546)
This patch implements the following intrinsics:

Multi-vector 8-bit floating-point multiply-add long (multiple vectors).

``` c
// Only if __ARM_FEATURE_SME_F8F16 != 0
void svmla_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm,
                                fpm_t fpm) __arm_streaming __arm_inout("za");

void svmla_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm,
                                fpm_t fpm) __arm_streaming __arm_inout("za");
// Only if __ARM_FEATURE_SME_F8F32 != 0
void svmla_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn, svmfloat8x2_t zm,
                                fpm_t fpm) __arm_streaming __arm_inout("za");

void svmla_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn, svmfloat8x4_t zm,
                                fpm_t fpm) __arm_streaming __arm_inout("za");
```

In accordance with https://github.com/ARM-software/acle/pull/323
2024-12-17 11:47:20 +00:00
AdUhTkJm
449af81f92
[Clang] Fix crash for incompatible types in inline assembly (#119098)
Fixed issue #118892.
2024-12-17 18:44:25 +07:00
Lang Hames
24c2744a18 [ORC] Fix LazyReexports resource key management.
Multiple reentry points may be associated with a single key.
2024-12-17 22:38:15 +11:00
paperchalice
b07e7b76c5
[cmake] Drop AddFileDependencies and CMakeParseArguments (#120002)
These modules are deprecated and have trivial implementations in modern
cmake.
2024-12-17 19:24:32 +08:00
aurelien35
37e48e4a73
Fix crash due to un-checked error in LVReaderHandler::handleArchive method (#118951)
[llvm-debuginfo-analyzer] Fix crash due to un-checked error in LVReaderHandler::handleArchive
method.

- Added README describing how to generate the binary files used for the test.
- A follow up patch to add extra ASSERT_NE

Committed on behalf of @aurelien35
2024-12-17 11:19:48 +00:00
wanglei
bdf727065b
[Offload] Add support for loongarch64 to host plugin
This adds support for the loongarch64 architecture to the offload host
plugin.

Similar to #115773

To fix some test issues, I've had to add the LoongArch64 target to:

- CompilerInvocation::ParseLangArgs
- linkDevice in ClangLinuxWrapper.cpp
- OMPContext::OMPContext (to set the device_kind_cpu trait)

Reviewed By: jhuber6

Pull Request: https://github.com/llvm/llvm-project/pull/120173
2024-12-17 19:06:10 +08:00
Benjamin Maxwell
a7dafea384
[SDAG] Allow folding stack slots into sincos/frexp in more cases (#118117)
This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to
check that it is safe to fold a store into a node that will expand to a
library call that takes output pointers. This requires checking for two
(independent) properties:

1. The store is not within a CALLSEQ_START..CALLSEQ_END pair
* If it is, the expansion would lead to nested call sequences (which is
invalid)
2. The node does not appear as a predecessor to the store
* If it does, attempting to merge the store into the call would result
in a cycle in the DAG

These two properties are checked as part of the same traversal in
`canFoldStoreIntoLibCallOutputPointers()`
2024-12-17 10:54:17 +00:00
Mirko Brkušanin
f7988a338d
[AMDGPU][SIPreEmitPeephole] Fix mustRetainExeczBranch (#120121)
Do not remove S_CBRANCH_EXECZ if one of the following blocks contains an
unconditional branch to a block other than the one immediately following
it. This can cause unwanted behavior like infinite loops.
2024-12-17 11:47:38 +01:00
Matt Arsenault
10b12e6e07
LiveVariables: Use Register (#120204) 2024-12-17 17:45:24 +07:00
Matthias Springer
0693b9e9cc
[mlir][Vector] Clean up populateVectorToLLVMConversionPatterns (#119975)
Clean up `populateVectorToLLVMConversionPatterns` so that it populates
only conversion patterns. All rewrite patterns that do not lower to LLVM
should be populated into a separate greedy pattern rewrite.

The current combination of rewrite patterns and conversion patterns
triggered an edge case when merging the 1:1 and 1:N dialect conversions.

Depends on #119973.
2024-12-17 11:37:17 +01:00
Nikolas Klauser
59890c1334
[libc++] Granularize <new> includes (#119964) 2024-12-17 11:29:16 +01:00
Matthias Springer
8cd8b5079b
[mlir][Vector] Move mask materialization patterns to greedy rewrite (#119973)
The mask materialization patterns during `VectorToLLVM` are rewrite
patterns. They should run as part of the greedy pattern rewrite and not
the dialect conversion. (Rewrite patterns and conversion patterns are
not generally compatible.)

The current combination of rewrite patterns and conversion patterns
triggered an edge case when merging the 1:1 and 1:N dialect conversions.
2024-12-17 11:26:31 +01:00
Fraser Cormack
a1f5fe8c85
[NVPTX] Optimize v2x16 BUILD_VECTORs to PRMT (#116675)
When two 16-bit values are combined into a v2x16 vector, and those
values are truncated from 32-bit values, a PRMT instruction can
save registers by selecting bytes directly from the original 32-bit
values. We do this during a post-legalize DAG combine, as these
opportunities are typically only exposed after the BUILD_VECTOR's
operands have been legalized.

Additionally, if the 32-bit values are right-shifted, we can fold in the
shift by selecting higher bytes with PRMT. Only logical right-shifts by
16 are supported (for now) since those are the only situations seen in
practice. Right shifts by 16 often come up during the legalization of
EXTRACT_VECTOR_ELT.
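
To make the byte selection concrete, here is a minimal sketch that emulates a PRMT-style byte permute in plain C++. It is an illustration only, not code from the patch: `bytePermute` and `packTruncated` are hypothetical helper names, the emulation covers only plain byte selection (the low three bits of each selector nibble), and the selector values used with it are derived from this emulation rather than taken from the backend.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical emulation of a PRMT-style byte permute (plain selection only).
// Bytes 0..3 come from A, bytes 4..7 come from B. Each selector nibble picks
// the source byte for one result byte, lowest nibble first.
inline std::uint32_t bytePermute(std::uint32_t A, std::uint32_t B,
                                 std::uint32_t Selector) {
  std::uint64_t Source = (static_cast<std::uint64_t>(B) << 32) | A;
  std::uint32_t Result = 0;
  for (int I = 0; I < 4; ++I) {
    std::uint32_t Index = (Selector >> (4 * I)) & 0x7;
    std::uint32_t Byte =
        static_cast<std::uint32_t>((Source >> (8 * Index)) & 0xFF);
    Result |= Byte << (8 * I);
  }
  return Result;
}

// Reference form: truncate both 32-bit inputs to 16 bits and pack them
// into one 32-bit value, Lo in the low half and Hi in the high half.
inline std::uint32_t packTruncated(std::uint32_t Lo, std::uint32_t Hi) {
  return (Lo & 0xFFFF) | ((Hi & 0xFFFF) << 16);
}
```

In this model, `bytePermute(A, B, 0x5410)` matches `packTruncated(A, B)`, and `bytePermute(A, B, 0x5432)` matches `packTruncated(A >> 16, B)`, which is how a right shift by 16 can be absorbed into the selector.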

This idea was brought up in a PR comment by @Artem-B.
2024-12-17 10:22:19 +00:00
Nikita Popov
7c135e17fb
[InstSimplify] Treat float binop with identity as refining (#120098)
If x is NaN, then fmul (x, 1) may produce a different NaN value.

Our float semantics explicitly permit folding fmul (x, 1) to x, but we
can't do this when we're replacing a select input, as selects are
supposed to preserve the exact bitwise value.
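
As a small illustration of why bitwise identity matters here, the sketch below compares two NaN values by their bit patterns. It is not part of the patch: the helper names are hypothetical, and it assumes a 32-bit IEEE-754 `float`. It performs no arithmetic, so it only shows that distinct NaN encodings exist, not which one a particular `fmul` would produce.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Reinterpret a float's storage as a 32-bit integer (hypothetical helper).
inline std::uint32_t bitsOf(float F) {
  std::uint32_t Bits;
  std::memcpy(&Bits, &F, sizeof(Bits));
  return Bits;
}

// Build a float from a raw 32-bit pattern (hypothetical helper).
inline float floatFromBits(std::uint32_t Bits) {
  float F;
  std::memcpy(&F, &Bits, sizeof(F));
  return F;
}

// Two floats are interchangeable for a select only if they are
// bit-for-bit identical, which is stricter than "both are NaN".
inline bool bitwiseEqual(float A, float B) { return bitsOf(A) == bitsOf(B); }
```

For example, the patterns `0x7fc00000` and `0x7fc00001` are both quiet NaNs, yet `bitwiseEqual` distinguishes them.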

Fixes
https://github.com/llvm/llvm-project/pull/115152#issuecomment-2545773114.
2024-12-17 10:58:52 +01:00
Andrei Safronov
75b2d78673
[GitHub] Add Xtensa backend labeler. (#120133)
Add patterns to label Xtensa backend related changes automatically.
2024-12-17 12:51:52 +03:00
Mariya Podchishchaeva
e5a6f1c779
[NFC][webkit.UncountedLambdaCapturesChecker] Remove unnecessary check (#120069)
CXXMD is checked for null, but it can't be null inside of a visitor's
method. Found by a static analyzer tool.
2024-12-17 10:45:10 +01:00
SpencerAbson
9c89b40f18
[AArch64] Implement intrinsics for FMLAL/FMLALL (single) (#119568)
Multi-vector 8-bit floating-point multiply-add long (single)
```c
// Only if __ARM_FEATURE_SME_F8F16 != 0
void svmla[_single]_za16[_mf8]_vg2x1_fpm(uint32_t slice, svmfloat8_t zn,
                                         svmfloat8_t zm, fpm_t fpm)
                                         __arm_streaming __arm_inout("za");

void svmla[_single]_za16[_mf8]_vg2x2_fpm(uint32_t slice, svmfloat8x2_t zn,
                                         svmfloat8_t zm, fpm_t fpm)
                                         __arm_streaming __arm_inout("za");

void svmla[_single]_za16[_mf8]_vg2x4_fpm(uint32_t slice, svmfloat8x4_t zn,
                                         svmfloat8_t zm, fpm_t fpm)
                                         __arm_streaming __arm_inout("za");
// Only if __ARM_FEATURE_SME_F8F32 != 0
void svmla[_single]_za32[_mf8]_vg4x1_fpm(uint32_t slice, svmfloat8_t zn,
                                         svmfloat8_t zm, fpm_t fpm)
                                         __arm_streaming __arm_inout("za");

void svmla[_single]_za32[_mf8]_vg4x2_fpm(uint32_t slice, svmfloat8x2_t zn,
                                         svmfloat8_t zm, fpm_t fpm)
                                         __arm_streaming __arm_inout("za");

void svmla[_single]_za32[_mf8]_vg4x4_fpm(uint32_t slice, svmfloat8x4_t zn,
                                         svmfloat8_t zm, fpm_t fpm)
                                         __arm_streaming __arm_inout("za");
```
In accordance with https://github.com/ARM-software/acle/pull/323.
 
Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>
2024-12-17 09:31:54 +00:00
David Green
2a7ed2c1aa [SROA] Protect against calling the alloca ptr
In case we are calling the alloca ptr directly, check that the Use is a normal
operand to the call. Fortran is a funny language.
2024-12-17 09:21:15 +00:00
Matt Arsenault
3508d8f6dd
RegAllocFast: Avoid using temporary DiagnosticInfo (#120184)
This reverts commit 1297933f35b4948b4d281259627a72094c407a75.
2024-12-17 16:19:26 +07:00
Simon Pilgrim
df2356b475 [X86] getShuffleCost - ensure we treat constant folded shuffles as free 2024-12-17 09:01:54 +00:00
Artem Pianykh
fbdbb13d5b
[NFC][Utils] Eliminate DISubprogram set from BuildDebugInfoMDMap (#118625)
Summary:
Previously, we'd add all SPs distinct from the cloned one into a set.
Then when cloning a local scope we'd check if it's from one of those
'distinct' SPs by checking if it's in the set. We don't need to do that.
We can just check against the cloned SP directly and drop the set.

Test Plan:
ninja check-llvm-unit check-llvm
2024-12-17 08:57:59 +00:00
Florian Mayer
514580b438
[MTE] Apply alignment / size in AsmPrinter rather than IR (#111918)
This makes sure no optimizations are applied that assume the
bigger alignment or size, which could be incorrect if we link
together with non-instrumented code.
2024-12-17 00:47:02 -08:00
Florian Hahn
58cfa39861
[VPlan] Remove legacy VPlan() constructors (NFC).
The constructors were retained to reduce the diff during transition.

Remove them now.
2024-12-17 08:22:22 +00:00
Brendan Sweeney
bfe8a21bad
[RISCV][ISEL] Lowering to load-acquire/store-release for RISCV Zalasr (#82914)
Lowering to load-acquire/store-release for RISCV Zalasr.

Currently uses the psABI lowerings for WMO load-acquire/store-release
(which are identical to A.7). These are incompatible with the A.6
lowerings currently used by LLVM. This should be OK for now since Zalasr
is behind the enable experimental extensions flag, but needs to be fixed
before it is removed from that.

For TSO, it uses the standard Ztso mappings except for lowering seq_cst
loads/stores to load-acquire/store-release; Andrea reviewed that.
2024-12-17 00:19:45 -08:00
Lang Hames
300deebf41 [ORC] Make LazyReexportsManager implement ResourceManager.
This ensures that the reexports mappings are cleared when the resource tracker
associated with each mapping is removed.
2024-12-17 18:45:16 +11:00
Craig Topper
43ede46898 [RISCV][GISel] Add legalization for more fp128 libcalls. 2024-12-16 23:39:09 -08:00
Fangrui Song
495bd4c255
[llvm-mc] Don't print initial .text for disassembler
```
% echo 90 | llvm-mc -triple=x86_64 --disassemble --hex
	.text
        nop
```

The initial `.text` kludge is due to `initSection`, which is actually only
needed by AIX XCOFF for its `getCurrentSectionOnly()` use in
MCAsmStreamer::emitInstruction (https://reviews.llvm.org/D95518). Adjust
MCAsmStreamer::emitInstruction to not trigger failures on

```
echo 7c4303a6 | llvm-mc --cdis --hex --triple=powerpc-aix-ibm-xcoff
```

Pull Request: https://github.com/llvm/llvm-project/pull/120185
2024-12-16 23:38:48 -08:00
Ryosuke Niwa
6db1b2035b
Revert "[Static analysis] Encodes a filename before inserting it into a URL." (#120195)
Reverts llvm/llvm-project#120123
Broke some tests.
2024-12-16 23:36:26 -08:00
Fangrui Song
a56ca3a4e4 [test] Don't test initial ".text" in llvm-mc --disassemble output
This kludge will go away after #120185.
2024-12-16 23:24:34 -08:00
Daniil Kovalev
417d2d7ce6
[PAC][lld][AArch64][ELF] Support signed GOT (#113815)
Depends on #113811

Support `R_AARCH64_AUTH_ADR_GOT_PAGE`, `R_AARCH64_AUTH_GOT_LO12_NC` and
`R_AARCH64_AUTH_GOT_ADD_LO12_NC` GOT-generating relocations. For preemptible
symbols, dynamic relocation `R_AARCH64_AUTH_GLOB_DAT` is emitted. Otherwise,
we unconditionally emit `R_AARCH64_AUTH_RELATIVE` dynamic relocation since
pointers in the signed GOT need to be signed at dynamic link time.
2024-12-17 10:23:01 +03:00
Ryosuke Niwa
f515d7aa72
[Static analysis] Encodes a filename before inserting it into a URL. (#120123)
This fixes a bug where report links generated from files such as
StylePrimitiveNumericTypes+Conversions.h in WebKit result in an error.
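
For illustration, a percent-encoding step of this kind can be sketched as follows. This is not the patch's implementation (the function name `percentEncode` is hypothetical, and the choice to keep only RFC 3986 unreserved characters is an assumption); it only shows how a `+` in a filename would be escaped before being placed in a URL.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: percent-encode every byte that is not an RFC 3986
// "unreserved" character (letters, digits, '-', '.', '_', '~').
inline std::string percentEncode(std::string_view Text) {
  static const char Hex[] = "0123456789ABCDEF";
  std::string Out;
  for (unsigned char C : Text) {
    bool Unreserved = (C >= 'A' && C <= 'Z') || (C >= 'a' && C <= 'z') ||
                      (C >= '0' && C <= '9') || C == '-' || C == '.' ||
                      C == '_' || C == '~';
    if (Unreserved) {
      Out.push_back(static_cast<char>(C));
    } else {
      Out.push_back('%');
      Out.push_back(Hex[C >> 4]);
      Out.push_back(Hex[C & 0xF]);
    }
  }
  return Out;
}
```

With this sketch, `StylePrimitiveNumericTypes+Conversions.h` becomes `StylePrimitiveNumericTypes%2BConversions.h`.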

Co-authored-by: Brianna Fan <bfan2@apple.com>
2024-12-16 23:15:32 -08:00
Aiden Grossman
4a7673ddf2 [Github] Fix premerge concurrency cancellation
This should actually fix the problem as I validated that github.sha returns an
actual value by running a workflow in a test repo. I'm not sure why the
existing value doesn't work, but it returns nothing.
2024-12-17 06:49:35 +00:00
LLVM GN Syncbot
a5d00ae9d1 [gn build] Port b3d2548d5b04 2024-12-17 06:23:57 +00:00
Lang Hames
b3d2548d5b
[ORC] Introduce LinkGraphLayer interface and LinkGraphLinkingLayer. (#120182)
Introduces a new layer interface, LinkGraphLayer, that can be used to
add LinkGraphs to an ExecutionSession.

This patch moves most of ObjectLinkingLayer's functionality into a new
LinkGraphLinkingLayer which should (in the future) be able to be used
without linking libObject. ObjectLinkingLayer now inherits from
LinkGraphLinkingLayer and just handles conversion of object files to
LinkGraphs, which are then handed down to LinkGraphLinkingLayer to be
linked.
2024-12-17 17:18:58 +11:00
Fangrui Song
e83afbe793 [ELF] Remove unneeded sec->file check 2024-12-16 22:17:18 -08:00
Matt Arsenault
8387cbd0f9
AMDGPU: Delete spills of undef values (#119684)
AMDGPU: Delete spills of undef values

It would be a bit more logical to preserve the undef and do the normal
expansion, but this is less work. This avoids verifier errors in a
future patch which starts deleting liveness from registers after
allocation failures which results in spills of undef values.

https://reviews.llvm.org/D122607

Move where undef sgpr spills are deleted
2024-12-17 13:08:38 +07:00
Matt Arsenault
5e727e8bed
[Statepoint] Treat undef operands less specially (#119682)
This reverts commit f7443905af1e06eaacda1e437fff8d54dc89c487.

This is to avoid an assertion if an undef operand appears in a
stackmap. This is important to avoid hitting verifier errors
when register allocation starts adding undefs in error scenarios.

Rather than trying to treat undef operands as special, leave them
alone and avoid producing an invalid spill. It would be a bit more
precise to produce a spill of an undef register here, but that's not
exposed through the storeRegToStackSlot API.

https://reviews.llvm.org/D122605

This was an alternative to https://reviews.llvm.org/D122582
2024-12-17 12:59:46 +07:00
Alexander Yermolovich
3c357a49d6
[BOLT] Add support for safe-icf (#116275)
Identical Code Folding (ICF) folds functions that are identical into one
function, and updates symbol addresses to the new address. This reduces
the size of a binary, but can lead to problems, for example when
function pointers are compared. Such comparisons can appear either
explicitly in the code or in IR generated by optimization passes like
Indirect Call Promotion (ICP). After ICF, what used to be two different
addresses becomes the same address, which can lead to a different code
path being taken.

This is where safe ICF comes in. The linker (LLD) does it using the
address-significant section generated by clang. If a symbol is in it, or
an object doesn't have this section, symbols are not folded.

BOLT does not have the information regarding which objects do not have
this section, so can't re-use this mechanism.

This implementation scans the code section and conservatively marks
function symbols as unsafe. It treats symbols as unsafe if they are
used in a non-control-flow instruction. It also scans through the data
relocation sections and does the same for relocations that reference a
function symbol. The latter handles the case when a function pointer is
stored in a local or global variable, etc. If a relocation address
points within a vtable, these symbols are skipped.
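
The folding decision described above can be sketched with a toy model. This is only an illustration under stated assumptions, not BOLT's implementation: `ToyFunction` and `foldSafely` are hypothetical names, a string stands in for a function body, and the `AddressTaken` flag stands in for the result of the code and relocation scans. In this model an unsafe function neither folds into another function nor serves as a fold target.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a function in a binary (hypothetical type).
struct ToyFunction {
  std::string Name;
  std::string Body;  // stand-in for the function's instruction bytes
  bool AddressTaken; // true if the scans marked the symbol as unsafe
};

// Map every function name to the name of the function that survives
// folding. Unsafe functions always map to themselves.
inline std::map<std::string, std::string>
foldSafely(const std::vector<ToyFunction> &Funcs) {
  std::map<std::string, std::string> Representative; // body -> first safe name
  std::map<std::string, std::string> Result;
  for (const ToyFunction &F : Funcs) {
    if (F.AddressTaken) {
      Result[F.Name] = F.Name; // keep a unique address
      continue;
    }
    auto It = Representative.try_emplace(F.Body, F.Name).first;
    Result[F.Name] = It->second;
  }
  return Result;
}
```

Given two safe functions with identical bodies and a third identical function whose address is taken, only the first two are folded together; the third keeps its own address, so pointer comparisons against it behave as before.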
2024-12-16 21:49:53 -08:00
Sirraide
eb5c21108f
[Clang] [Sema] Support matrix types in pseudo-destructor expressions (#117483)
We already support vector types, and since matrix element types have to
be scalar types, there should be no problem w/ just enabling this.

This now also allows matrix types to be stored in STL containers.
2024-12-17 06:49:31 +01:00
Matt Arsenault
e2cabd715b RegAllocGreedy: Fix comment typo 2024-12-17 12:46:04 +07:00
Timm Baeder
056cd12284
[clang][bytecode] Don't check returned pointers for liveness (#120107)
We're supposed to let them through and then later diagnose reading from
them, but returning dead pointers is fine.
2024-12-17 06:20:14 +01:00
Fangrui Song
c6ff809ae9
[llvm-mc] Add --hex to disassemble hex bytes
`--disassemble`/`--cdis` parses input bytes as decimal, 0bbin, 0ooct, or
0xhex. While the hexadecimal digit form is most commonly used, requiring
a 0x prefix for each byte (`0x48 0x29 0xc3`) is cumbersome.

Tools like xxd -p and rz-asm use a plain hex dump form without the 0x
prefix or space separator. This patch adds --hex to disassemble such hex
bytes with optional whitespace.

```
% rz-asm -a x86 -b 64 -d 4829c34829c4
sub rbx, rax
sub rsp, rax

% llvm-mc -triple=x86_64 --cdis --hex --output-asm-variant=1 <<< 4829c34829c4
        .text
        sub     rbx, rax
        sub     rsp, rax
```
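
As a rough sketch of what parsing such a plain hex dump involves (not the code added to llvm-mc; `parseHexDump` and `hexDigitValue` are hypothetical names), the following C++ function turns a string of hex digits with optional whitespace into bytes. Skipping whitespace anywhere and rejecting an odd number of digits are assumptions of this sketch, not documented behavior of `--hex`.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <optional>
#include <string_view>
#include <vector>

// Return the value of a hex digit, or -1 if C is not a hex digit.
inline int hexDigitValue(char C) {
  if (C >= '0' && C <= '9')
    return C - '0';
  if (C >= 'a' && C <= 'f')
    return C - 'a' + 10;
  if (C >= 'A' && C <= 'F')
    return C - 'A' + 10;
  return -1;
}

// Parse a plain hex dump such as "4829c3 4829c4" into bytes.
// Returns std::nullopt on a non-hex character or a dangling nibble.
inline std::optional<std::vector<std::uint8_t>>
parseHexDump(std::string_view Text) {
  std::vector<std::uint8_t> Bytes;
  int High = -1; // pending high nibble, or -1 if none
  for (char C : Text) {
    if (std::isspace(static_cast<unsigned char>(C)))
      continue;
    int Value = hexDigitValue(C);
    if (Value < 0)
      return std::nullopt;
    if (High < 0) {
      High = Value;
    } else {
      Bytes.push_back(static_cast<std::uint8_t>(High * 16 + Value));
      High = -1;
    }
  }
  if (High >= 0)
    return std::nullopt;
  return Bytes;
}
```

Note that this sketch rejects the `0x48` form, since `x` is not a hex digit; it models only the prefix-free input style.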

Pull Request: https://github.com/llvm/llvm-project/pull/119992
2024-12-16 21:05:08 -08:00
Yifei Xu
e2a94a97bd
Update BUILD.bazel
Fix bazel build after https://github.com/llvm/llvm-project/pull/120116
2024-12-16 22:39:10 -06:00
Ivan R. Ivanov
2806705c4b
[MLIR][NVVM] Enable import of nvvm.barrier0 (#119965)
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
2024-12-17 13:07:28 +09:00
Koakuma
ad64946549
[SPARC][IAS] Add support for call dest, imm form (#119078)
This follows the GCC behavior of allowing a trailing immediate that is
ignored by the assembler.
2024-12-17 10:42:26 +07:00
Valentin Clement (バレンタイン クレメン)
5e1f87e849
[flang][cuda] Correctly allocate memory for descriptor load (#120164)
CodeGen will allocate memory for a new descriptor on descriptor loads.
CUDA Fortran local descriptors are allocated in managed memory by the
runtime. The newly allocated storage for a CUDA descriptor must also be
allocated through the runtime.
2024-12-16 19:12:05 -08:00
Luke Lau
fba3e069b4
[VPlan] Remove overlapping VPInstruction::mayWriteToMemory. NFCI (#120039)
VPInstruction has a definition of mayWriteToMemory, which seems to only
be used by VPlanSLP. However VPInstructions are already handled in
VPRecipeBase::mayWriteToMemory, and everywhere else seems to use this
definition. I think these should be the same for all intents and
purposes. The VPRecipeBase definition is more conservative but returns
true for stores/calls/invokes/SLPStores.
2024-12-17 11:02:55 +08:00
Teresa Johnson
bf700c39d1
[MemProf] Remove dead code (NFC) (#120156)
Remove unused collection of context size information that was likely
leftover from debugging / testing.
2024-12-16 17:15:25 -08:00
Aiden Grossman
f0878995c2 [Github] Fix concurrency groups for premerge
According to https://docs.github.com/en/rest/using-the-rest-api/github-event-types?apiVersion=2022-11-28,
when we look at the push event payload, github.event.push.head is a string
containing the SHA. This is currently causing new commits on main to cancel
the premerge pipeline of older commits.
2024-12-17 01:10:46 +00:00
Jie Fu
a1766699c6 [clang] Fix -Wunused-variable in CGBuiltin.cpp (NFC)
/llvm-project/clang/lib/CodeGen/CGBuiltin.cpp:19441:17:
 error: unused variable 'Ty' [-Werror,-Wunused-variable]
    llvm::Type *Ty = Op->getType();
                ^
1 error generated.
2024-12-17 09:07:44 +08:00