532980 Commits

Author SHA1 Message Date
Jerry-Ge
fcfbef5582
[mlir][tosa] Remove extra declarations of MulOperandsAndResultElementType in TosaOps.td (#134300)
Minor code cleanup

Signed-off-by: Jerry Ge <jerry.ge@arm.com>
2025-04-03 18:15:30 -07:00
Slava Zakharin
65b85bf8bc
[flang] Fixed driver link LIT test for PPC targets. (#134320)
After #131041, the F128 libraries are not linked for PPC targets even
when the driver is built with FLANG_RUNTIME_F128_MATH_LIB.
2025-04-03 16:58:11 -07:00
Michael Jones
c0079ba3dd
[libc] Make utimes_test more stable (#134321)
The test for utimes added in #134167 might fail if the file for one test
hasn't been cleaned up by the OS before the second test starts. This
patch makes the tests use different files.
2025-04-03 16:53:55 -07:00
Jorge Gorbe Moya
ee1ee1144a
Fix unused variable warning in non-debug build after 7d3dfc862d283319d01997c0672c50b4a082aa4e (NFC) 2025-04-03 16:50:19 -07:00
Alex MacLean
ba0a52a04b
[InferAS] Support getAssumedAddrSpace for Arguments for NVPTX (#133991) 2025-04-03 16:47:36 -07:00
Aditya Tejpaul
d33ae41c62
[libc] Implemented utimes (Issue #133953) (#134167)
This pull request implements the `utimes` function in libc ([Issue
#133953](https://github.com/llvm/llvm-project/issues/133953)).

- [x] Add the implementation of `utimes` in `/src/sys/time`.
- [x] Add tests for `utimes` in `/test/src/sys/time`. 
- [x] Add `utimes` to
[entrypoints.txt](https://github.com/llvm/llvm-project/blob/main/libc/config/linux/x86_64/entrypoints.txt)
for at least x86_64 and whatever you're building on
- [x] Add `utimes` to
[include/sys/time.yaml](https://github.com/llvm/llvm-project/blob/main/libc/include/sys/time.yaml)
2025-04-03 16:19:12 -07:00
Ian Anderson
bd197ca003
[clang][modules] Determine if the SDK supports builtin modules independent of the target (#134005)
Whether the SDK supports builtin modules is a property of the SDK
itself, and really has nothing to do with the target. This was already
worked around for Mac Catalyst, but there are some other more esoteric
non-obvious target-to-sdk mappings that aren't handled. Have the SDK
parse its OS out of CanonicalName and use that instead of the target to
determine if builtin modules are supported.
2025-04-03 16:09:57 -07:00
modiking
9f2feeb189
[mlir][gpu][nvptx] Remove null terminator when outputting PTX (#133019)
PTX source files are expected to only contain ASCII text
(https://docs.nvidia.com/cuda/parallel-thread-execution/#source-format) and no null terminators.

`ptxas` has so far not enforced this but is moving towards doing so.
This revealed a problem where the null terminator is getting printed out
in the output file in the MLIR path when outputting PTX directly. Only add the null on the assembly output path for JIT instead of in the output of `moduleToObject`.
2025-04-03 15:50:54 -07:00
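To illustrate the constraint described in the PTX commit above (modiking, 9f2feeb189), here is a minimal sketch of checking or enforcing that text destined for a `.ptx` file carries no trailing null bytes. `stripTrailingNul` is a hypothetical helper, not an MLIR API, and the patch itself takes a different route: it avoids appending the terminator on the file-output path rather than stripping it afterwards.

```cpp
#include <string_view>

// Hypothetical helper (not part of MLIR): returns a view of `text` with any
// trailing '\0' bytes removed, so the bytes written to a .ptx file stay
// plain ASCII text with no terminator.
constexpr std::string_view stripTrailingNul(std::string_view text) {
  while (!text.empty() && text.back() == '\0')
    text.remove_suffix(1);
  return text;
}
```

A JIT path that needs a C string would still append the terminator itself, which keeps that requirement local to the one consumer that has it.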
Jason Molenda
f1c6612202
[lldb][debugserver] Save and restore the SVE/SME register state (#134184)
debugserver isn't saving and restoring the SVE/SME register state around
inferior function calls.

Making arbitrary function calls while in Streaming SVE mode is generally
a poor idea because a NEON instruction can be hit and crash the
expression execution, which is how I missed this, but they should be
handled correctly if the user knows it is safe to do.

Re-landing this change after fixing an incorrect behavior on systems
without SME support.

rdar://146886210
2025-04-03 15:48:54 -07:00
Louis Dionne
2cd8edd1ff
[libc++] Add missing release note for LLVM 20 about zip_view (#134144)
We should have had a release note in LLVM 20 about implementing P2165R4
since that is technically an ABI and API break for zip_view. We don't
expect anyone to actually hit the ABI issue, but we've come across some
(fairly small) breakage due to the API change, so this should at least
be mentioned in the release notes.
2025-04-03 18:34:49 -04:00
Andre Kuhlenschmidt
b11eece1bb
[flang][intrinsics] Implement the time intrinsic (#133823)
This PR implements the nonstandard intrinsic time.

In addition to running the unit tests, I also double checked that the
example code works by manually compiling and running it.
2025-04-03 15:33:40 -07:00
Sumit Agarwal
996cf5dc67
[HLSL] Implement dot2add intrinsic (#131237)
Resolves #99221 
Key points: For the SPIRV backend, it decomposes into a `dot` followed by an
`add`.

- [x] Implement dot2add clang builtin,
- [x] Link dot2add clang builtin with hlsl_intrinsics.h
- [x] Add sema checks for dot2add to CheckHLSLBuiltinFunctionCall in
SemaHLSL.cpp
- [x] Add codegen for dot2add to EmitHLSLBuiltinExpr in CGBuiltin.cpp
- [x] Add codegen tests to clang/test/CodeGenHLSL/builtins/dot2add.hlsl
- [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/dot2add-errors.hlsl
- [x] Create the int_dx_dot2add intrinsic in IntrinsicsDirectX.td
- [x] Create the DXILOpMapping of int_dx_dot2add to 162 in DXIL.td
- [x] Create the dot2add.ll and dot2add_errors.ll tests in
llvm/test/CodeGen/DirectX/
2025-04-03 16:23:09 -06:00
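The decomposition mentioned in the dot2add commit above (Sumit Agarwal, 996cf5dc67), a `dot` followed by an `add`, can be sketched in plain C++ as follows. This is only an arithmetic illustration under stated assumptions: the type alias `Vec2` and both function bodies are made up for this sketch, and `float` is used throughout, whereas the HLSL intrinsic, to my understanding, takes half-precision 2-vectors and a `float` accumulator.

```cpp
#include <array>

// Illustrative stand-in for a 2-component vector (assumption: float lanes).
using Vec2 = std::array<float, 2>;

// Two-component dot product.
constexpr float dot(const Vec2 &a, const Vec2 &b) {
  return a[0] * b[0] + a[1] * b[1];
}

// dot2add expressed as the two-step form the commit describes:
// first the dot product, then the accumulate.
constexpr float dot2add(const Vec2 &a, const Vec2 &b, float c) {
  return dot(a, b) + c;
}
```

For example, with a = (1, 2), b = (3, 4) and c = 5 the dot product is 11 and the result is 16.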
Jorge Gorbe Moya
109566a3d0
[bazel] Fold "${Target}Analysis" targets into their respective CodeGen targets. (#134312)
After 3801bf6164f570a145e3ebd20cf9114782ae0329, SPIRVAnalysis needs to
include SPIRV.h provided by SPIRVCodegen, but the CodeGen target already
depends on Analysis, so that would cause a circular dependency.

Analysis is a subdirectory of CodeGen so it makes sense as a part of the
main CodeGen target too.
2025-04-03 15:21:26 -07:00
Jan Svoboda
506630d6db
[clang][deps] Avoid unchecked error assertion (#134284) 2025-04-03 14:57:06 -07:00
Andre Kuhlenschmidt
85fdab33b0
[flang][intrinsic] add nonstandard intrinsic unlink (#134162)
This PR adds the intrinsic `unlink` to flang. 

## Test plan
- Added two codegen unit tests and ensured flang-check continues to
pass.
- Manually compiled and ran the example from the documentation.
2025-04-03 14:33:53 -07:00
Valentin Clement (バレンタイン クレメン)
fb6f60ddc5
[flang][cuda][NFC] Use NVVM VoteBallotOp (#134307)
`llvm.nvvm.vote.ballot.sync` has its own operation so use it in
lowering.
2025-04-03 14:19:31 -07:00
Valentin Clement (バレンタイン クレメン)
de40f6101d
[flang][cuda][NFC] Use NVVM op for match all (#134303) 2025-04-03 14:19:21 -07:00
Florian Hahn
0f696c2e86
[LV] Add test where epilogue is vectorized and backedge removed.
Adds extra test coverage for
https://github.com/llvm/llvm-project/pull/106748.
2025-04-03 22:14:15 +01:00
Andy Kaylor
a06ae976dc
[CIR] Upstream support for promoted types with unary plus/minus (#133829)
The initial upstreaming of unary operations left promoted types
unhandled for the unary plus and minus operators. This change implements
support for promoted types and performs a bit of related code cleanup.
2025-04-03 14:04:32 -07:00
Andy Kaylor
13aac46332
[clang][NFC] Refactor CodeGen's hasBooleanRepresentation (#134159)
The ClangIR upstreaming project needs the same logic for
hasBooleanRepresentation() that is currently implemented in the standard
clang codegen. In order to share this code, this change moves the
implementation of this function into the AST Type class.

No functional change is intended by this change. The ClangIR use of this
function will be added separately in a later change.
2025-04-03 14:03:25 -07:00
Alexander Yermolovich
4f902d2425
[llvm-dwarfdump] Make --verify for .debug_names multithreaded. (#127281)
This PR makes verification of .debug_names acceleration table
multithreaded. In local testing it improves verification of clang
.debug_names from four minutes to under a minute.
This PR relies on a current mechanism of extracting DIEs into a vector. 
Future improvements can include creating API to extract one DIE at a
time, or grouping Entries into buckets by CUs and extracting before
parallel step.

Single Thread
4:12.37 real, 246.88 user, 3.54 sys, 0 amem, 10232004 mmem
Multi Thread
0:49.40 real, 612.84 user, 515.73 sys, 0 amem, 11226292 mmem
2025-04-03 14:02:27 -07:00
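The timing figures in the llvm-dwarfdump commit above (Alexander Yermolovich, 4f902d2425) can be turned into ratios with a little arithmetic. The sketch below assumes the `user` and `sys` columns are in seconds and that `real` is in minutes:seconds; all names are made up for this illustration.

```cpp
// Convert a "minutes:seconds" wall-clock reading to seconds.
constexpr double toSeconds(int minutes, double seconds) {
  return minutes * 60.0 + seconds;
}

// Wall-clock ("real") times quoted in the commit message.
constexpr double singleReal = toSeconds(4, 12.37);      // 252.37 s
constexpr double multiReal = toSeconds(0, 49.40);       // 49.40 s
constexpr double wallSpeedup = singleReal / multiReal;  // roughly 5.1x faster

// Total CPU time (user + sys), assuming both columns are seconds.
constexpr double singleCpu = 246.88 + 3.54;             // 250.42 s
constexpr double multiCpu = 612.84 + 515.73;            // 1128.57 s
constexpr double cpuRatio = multiCpu / singleCpu;       // roughly 4.5x more CPU
```

Under those assumptions the multithreaded run finishes about five times sooner while spending about four and a half times as much total CPU time, much of it in `sys`.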
Henry Jiang
7d3dfc862d
[JITLink][XCOFF] Setup initial build support for XCOFF (#127266)
This patch starts the initial implementation of JITLink for XCOFF (Object format for AIX).
2025-04-03 17:01:18 -04:00
Jonas Devlieghere
5f99e0d4b9
[lldb] Use the "reverse video" effect when colors are disabled. (#134203)
When you run lldb without colors (`-X`), the status line looks weird
because it doesn't have a background. You end up with what appears to be
floating text at the bottom of your terminal.

This patch changes the statusline to use the reverse video effect, even
when colors are off. The effect doesn't introduce any new colors and
just inverts the foreground and background color.

I considered an alternative approach which changes the behavior of the
`-X` option, so that turning off colors doesn't prevent emitting
non-color related control characters such as bold, underline, and
reverse video. I decided to go with this more targeted fix as (1) nobody
is asking for this more general change and (2) it introduces significant
complexity to plumb this through using a setting and driver flag so that
it can be disabled when running the tests.

Fixes #134112.
2025-04-03 13:51:17 -07:00
Florian Hahn
cdff7f0b6e
[LV] Retrieve middle VPBB via scalar ph to fix epilogue resumephis (NFC)
If ScalarPH has predecessors, we may need to update its reduction resume
values. If there is a middle block, it must be the first predecessor.
Note that the first predecessor may not be the middle block, if the
middle block doesn't branch to the scalar preheader. In that case,
fixReductionScalarResumeWhenVectorizingEpilog will be a no-op.

In preparation for https://github.com/llvm/llvm-project/pull/106748.
2025-04-03 21:46:48 +01:00
Mircea Trofin
61768b3528
[ctxprof] Don't import roots elsewhere (#134012)
Block a context root from being imported by its callers. 

Suppose that happened. Its caller - usually a message pump - inlines its copy of the root. Then it (the root) and whatever it calls will be the non-contextually optimized callee versions.
2025-04-03 13:21:39 -07:00
Hristo Hristov
b93376f899
[libc++][type_traits] reference_{constructs|converts}_from_temporary with -Winvalid-specialization tests (#133946)
Addresses comment:
https://github.com/llvm/llvm-project/pull/128649/files#r2022341035

---------

Co-authored-by: Hristo Hristov <zingam@outlook.com>
2025-04-03 23:18:04 +03:00
Alexey Bataev
daab7d0807
[SLP]Initial support for (masked)loads + compress and (masked)interleaved
Added initial support for (masked)loads + compress and
(masked)interleaved loads.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132099
2025-04-03 13:17:40 -07:00
Florian Hahn
012e574d4d
[LV] Add FindLastIV test with truncated IV and epilogue vectorization.
This adds missing test coverage for
https://github.com/llvm/llvm-project/pull/132691.
2025-04-03 21:01:58 +01:00
Alexey Bataev
7c4013d591
Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved"
This reverts commit 0bec0f5c059af5f920fe22ecda469b666b5971b0 to fix
a crash reported in https://lab.llvm.org/buildbot/#/builders/143/builds/6668.
2025-04-03 12:58:49 -07:00
zcfh
229ca7dbcb
[memprof] Report an error when buildid and profile do not match (#132504)
## Problem
When the build ids of the profile and binary do not match, the error
reported by llvm-profdata is `no entries in callstack map after
symbolization`, but the root cause of this problem is the **build id
mismatch**.
## Trigger scenario
For example, when performing `memprof` optimization on `clang`,
`rawprofile` is collected through `ninja clang`. Besides clang itself,
the build runs other programs, and these also generate rawprofile
files. When `no entries in callstack map after symbolization` appears
during `llvm-profdata merge`, users may mistakenly conclude that **the
instrumentation failed or that something else went wrong**, and will
**not immediately realize that the binary and profile do not match**.

## Changed
Currently, when the build id does not match, an assertion fires, and
only in debug builds. Change it to directly return an error when the
build id does not match.
2025-04-03 12:48:27 -07:00
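The change described in the memprof commit above (zcfh, 229ca7dbcb) replaces a debug-only assertion with an error that is always returned. A minimal sketch of that pattern, assuming build ids can be compared as strings, might look like the following; `checkBuildId` and its message are hypothetical and are not the llvm-profdata code or its actual diagnostic.

```cpp
#include <optional>
#include <string_view>

// Hypothetical check: returns an error message when the ids differ and
// std::nullopt when they match. Unlike an assert, this also runs in
// release builds, so the caller can always report the real cause.
constexpr std::optional<std::string_view>
checkBuildId(std::string_view profileId, std::string_view binaryId) {
  if (profileId != binaryId)
    return std::string_view("build id mismatch between profile and binary");
  return std::nullopt;
}
```

A caller would test the returned optional and surface the message instead of continuing to symbolization, where the failure would otherwise show up as an unrelated-looking error.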
Valentin Clement (バレンタイン クレメン)
7288f1bc32
[flang][cuda] Use nvvm operation for match any (#134283)
The string used for the intrinsic, "llvm.nvvm.match.any.sync.i32p", was
incorrect: there was an extra `p` at the end.

Use the NVVM operation instead so we don't duplicate it.
2025-04-03 12:08:30 -07:00
Rahul Joshi
b393ca6026
[NFC][LLVM][RISCV] Cleanup pass initialization for RISCV (#134279)
- Move calls to pass initialization functions to RISCV target
initialization and remove them from pass constructors.
2025-04-03 11:28:45 -07:00
Jorge Gorbe Moya
158684a80f
[bazel] Add missing dep after 586c5e3083428e7473e880dafd5939e8707bc1c9 2025-04-03 11:25:44 -07:00
Slava Zakharin
b8b752db2b
[flang][NFC] Create required Source dir for flang-doc. (#134000) 2025-04-03 10:43:49 -07:00
Slava Zakharin
3f6ae3f0a8
[flang] Added driver options for arrays repacking. (#134002)
Added options:
  * -f[no-]repack-arrays
  * -f[no-]stack-repack-arrays
  * -frepack-arrays-contiguity=whole/innermost
2025-04-03 10:43:28 -07:00
Valentin Clement (バレンタイン クレメン)
3e59ff27e5
[flang][cuda] Fix pred type for vote functions (#134166) 2025-04-03 10:33:09 -07:00
Matheus Izvekov
cfee056b4e
[clang] NFC: introduce UnsignedOrNone as a replacement for std::optional<unsigned> (#134142)
This introduces a new class 'UnsignedOrNone', which models a lite
version of `std::optional<unsigned>`, but has the same size as
'unsigned'.

This replaces most uses of `std::optional<unsigned>`, and similar
schemes utilizing 'int' and '-1' as sentinel.

Besides the smaller size advantage, this is simpler to serialize, as its
internal representation is a single unsigned int as well.
2025-04-03 14:27:18 -03:00
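The idea behind the `UnsignedOrNone` commit above (Matheus Izvekov, cfee056b4e), an optional unsigned that is no larger than `unsigned`, can be sketched as below. This is not the clang class: the name `CompactOptionalUnsigned` and the "stored value plus one, zero means none" encoding are assumptions chosen for illustration, and the encoding cannot represent the maximum `unsigned` value.

```cpp
// Illustrative sketch only; the encoding is an assumption, not clang's.
// Representation: 0 means "none", otherwise the stored value is rep_ - 1.
class CompactOptionalUnsigned {
public:
  // Default-constructed objects hold no value.
  constexpr CompactOptionalUnsigned() = default;

  // Precondition: value must not be the maximum unsigned value, because
  // value + 1 would wrap around to the "none" encoding.
  constexpr CompactOptionalUnsigned(unsigned value) : rep_(value + 1) {}

  // True when a value is present.
  constexpr explicit operator bool() const { return rep_ != 0; }

  // Precondition: a value is present.
  constexpr unsigned operator*() const { return rep_ - 1; }

private:
  unsigned rep_ = 0;
};
```

Because the whole state is a single `unsigned`, serializing it is just writing that one integer, which matches the motivation given in the commit message.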
Amr Hesham
262b9b5153
[CIR][Upstream] Local initialization for ArrayType (#132974)
This change adds local initialization for ArrayType

Issue #130197
2025-04-03 19:25:25 +02:00
Sterling-Augustine
7514225052
Use a more proper idiom for "the output file doesn't matter". NFC. (#134280)
As in the description. Follow up to PR #134179.
2025-04-03 10:24:10 -07:00
zhijian lin
1a540c3b8b
[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#133155)
ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated in
favor of ISD::UADDO_CARRY and ISD::USUBO_CARRY. This patch lowers UADDO,
UADDO_CARRY, USUBO and USUBO_CARRY.
2025-04-03 13:22:49 -04:00
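For readers unfamiliar with the carry-based nodes named in the PowerPC commit above (zhijian lin, 1a540c3b8b), the following sketch models the arithmetic of an add with explicit carry-in and carry-out, and chains two 32-bit adds into a 64-bit add. It is a plain C++ model with made-up names (`AddResult`, `uaddoCarry`, `add64`), not SelectionDAG code, and it says nothing about how the PowerPC backend actually lowers these nodes.

```cpp
#include <cstdint>

// Result of a 32-bit add: the truncated sum plus an explicit carry-out flag.
struct AddResult {
  std::uint32_t sum;
  bool carryOut;
};

// Models "add with carry-in, produce carry-out" by widening to 64 bits.
constexpr AddResult uaddoCarry(std::uint32_t a, std::uint32_t b, bool carryIn) {
  const std::uint64_t wide =
      static_cast<std::uint64_t>(a) + b + (carryIn ? 1u : 0u);
  return {static_cast<std::uint32_t>(wide), (wide >> 32) != 0};
}

// A 64-bit add built from two chained 32-bit adds: the low half starts with
// no carry-in, and its carry-out feeds the high half.
constexpr std::uint64_t add64(std::uint64_t x, std::uint64_t y) {
  const AddResult lo = uaddoCarry(static_cast<std::uint32_t>(x),
                                  static_cast<std::uint32_t>(y), false);
  const AddResult hi = uaddoCarry(static_cast<std::uint32_t>(x >> 32),
                                  static_cast<std::uint32_t>(y >> 32),
                                  lo.carryOut);
  return (static_cast<std::uint64_t>(hi.sum) << 32) | lo.sum;
}
```

The point of the explicit flag is that the carry is an ordinary value passed from one add to the next, rather than implicit state tied to instruction ordering.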
Alexey Bataev
0bec0f5c05
[SLP]Initial support for (masked)loads + compress and (masked)interleaved
Added initial support for (masked)loads + compress and
(masked)interleaved loads.

Reviewers: RKSimon, hiraditya

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/132099
2025-04-03 13:21:22 -04:00
Austin Schuh
2abcdd8cf0
[CUDA] Add support for CUDA surfaces (#132883)
This adds support for all the surface read and write calls to clang. It
extends the pattern used for textures to surfaces too.

I tested this by generating all the various permutations of the calls
and argument types in a python script, compiling them with both clang
and nvcc, and comparing the generated ptx for equivalence. They all
agree, ignoring register allocation, and some places where Clang picks
different memory write instructions. An example kernel is:

```
__global__ void testKernel(cudaSurfaceObject_t surfObj, int x, float2* result) {
    *result = surf1Dread<float2>(surfObj, x, cudaBoundaryModeZero);
}
```

---------

Signed-off-by: Austin Schuh <austin.linux@gmail.com>
2025-04-03 10:08:02 -07:00
Luke Lau
9a5b0f302b
Reapply "[InstCombine] Match scalable splats in m_ImmConstant (#132522)" (#134262)
This reapplies #132522.

Previously casts of scalable m_ImmConstant splats weren't being folded
by ConstantFoldCastOperand, triggering the "Constant-fold of ImmConstant
should not fail" assertion.

There are no changes to the code in this PR, instead we just needed
#133207 to land first.

A test has been added for the assertion in
llvm/test/Transforms/InstSimplify/vec-icmp-of-cast.ll
@icmp_ult_sext_scalable_splat_is_true.

---------

#118806 fixed an infinite loop in FoldShiftByConstant that could occur
when the shift amount was a ConstantExpr.

However this meant that FoldShiftByConstant no longer kicked in for
scalable vectors because scalable splats are represented by
ConstantExprs.

This fixes it by allowing scalable splats of non-ConstantExprs in
m_ImmConstant, which also fixes a few other test cases where scalable
splats were being missed.

But I'm also hoping that UseConstantIntForScalableSplat will eventually
remove the need for this.

I noticed this when trying to reverse a combine on RISC-V in #132245,
and saw that the resulting vector and scalar forms were different.
2025-04-03 18:03:16 +01:00
Matt Arsenault
a54736afd5
CloneFunction: Do not delete blocks with address taken (#134209)
If a block with a single predecessor also had its address taken,
it was getting deleted in this post-inline cleanup step. This would
result in the blockaddress in the resulting function getting deleted
and replaced with inttoptr 1.

This fixes one bug required to permit inlining of functions with blockaddress 
uses.

At the moment this is not testable (at least without an annoyingly complex
unit test),  and is a pre-bug fix for future patches. Functions with
blockaddress uses are rejected in isInlineViable, so we don't get this far
with the current InlineFunction uses (some of the existing cases seem to
reproduce this part of the rejection logic, like PartialInliner). This
will be tested in a pending llvm-reduce change.

Prerequisite for #38908
2025-04-03 23:52:25 +07:00
Christian Sigg
6ddf7cf780
[mlir][bazel] Allow gentbl_cc_library(tbl_outs) to be a dict. (#134271)
This makes the BUILD file shorter and more readable.
I will follow up with converting the other instances.
2025-04-03 18:47:56 +02:00
MaheshRavishankar
a1bc979aa8
[mlir][Bufferization] Do not have read semantics for destination of tensor.parallel_insert_slice. (#134169)
`tensor.insert_slice` needs to have read semantics on its destination
operand. Since it has a return value, its semantics are

- Copy dest to result
- Copy source to subview of destination.

`tensor.parallel_insert_slice` though has no result. So it does not need
to have read semantics. The op description
[here](a3ac318e5f/mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td (L1524))
also says that it is expected to lower to a `memref.subview`, which does
not have read semantics on the destination (it's just a view).

This patch drops the read semantics for destination of
`tensor.parallel_insert_slice` but also makes the `shared_outs` operands
of `scf.forall` have read semantics. Earlier it would rely indirectly on
read semantics of destination operand of `tensor.parallel_insert_slice`
to propagate the read semantics for `shared_outs`. Now that is specified
more directly.

Fixes #133964

---------

Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>
2025-04-03 09:47:36 -07:00
John Harrison
bc6cd825ec
[lldb-dap] Creating a common configuration structure for launch and attach requests. (#133960)
This moves all the common settings of the launch and attach operations
into the `lldb_dap::protocol::Configuration`. These common settings
can appear in both `launch` and `attach` requests, which allows us to
isolate the DAP configuration operations into a single common location.

This is split out from #133624.
2025-04-03 09:45:00 -07:00
Finn Plummer
73e8d67a20
Revert "[HLSL][RootSignature] Define and integrate HLSLRootSignatureAttr" (#134273)
Reverts llvm/llvm-project#134124

The build is failing again due to a linking error:
[here](https://github.com/llvm/llvm-project/pull/134124#issuecomment-2776370486).
Again the error was not present locally or any of the pre-merge builds
and must have been transitively linked in these build environments...
2025-04-03 09:40:50 -07:00
Simon Pilgrim
2190808f5d
[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV/VPERMV3 nodes if the upper elements are not demanded (REAPPLIED) (#134263)
With AVX512VL targets, use 128/256-bit VPERMV/VPERMV3 nodes when we only need the lower elements.

Reapplied version of #133923 with fix for typo in the VPERMV3 mask adjustment
2025-04-03 17:39:38 +01:00
Connector Switch
b738b82699
[libc] Combine the function prototype int (*compar)(const void *, const void *) (#134238)
Closes #134118.
2025-04-04 00:36:23 +08:00