520563 Commits

Author SHA1 Message Date
zhijian lin
6b5c67bd16
[PowerPC][Backend] using signed extend value instead of zero extend value for isIntS34Immediate() (#118703)
This patch fixes the issue
https://github.com/llvm/llvm-project/issues/118695
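
A minimal sketch of why the choice of extension matters for a signed 34-bit range check. The helper names are hypothetical and this is not the backend's code; it only models the arithmetic on a 64-bit pattern.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return value - (1 << bits) if value & sign_bit else value


def zero_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as an unsigned integer."""
    return value & ((1 << bits) - 1)


def is_int_s34(value: int) -> bool:
    """True if `value` fits in a signed 34-bit immediate."""
    return -(1 << 33) <= value < (1 << 33)


raw = 0xFFFFFFFFFFFFFFFF  # 64-bit pattern that encodes -1
print(is_int_s34(sign_extend(raw, 64)))  # True: -1 fits in 34 signed bits
print(is_int_s34(zero_extend(raw, 64)))  # False: 2**64 - 1 does not
```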
2024-12-05 09:08:18 -05:00
Jay Foad
f9f7c42ca6
[AMDGPU] Refine AMDGPULateCodeGenPrepare class. NFC. (#118792)
Use references instead of pointers for most state and initialize it all
in the constructor, and similarly for the LiveRegOptimizer class.
2024-12-05 14:05:51 +00:00
Nikita Popov
bb03a18470
[SCCP] Regenerate test checks (NFC)
The checks generated by old UTC version fail on this test due to
missing signature matching, so regenerate it with a newer one.
2024-12-05 14:55:29 +01:00
Matthias Gehre
1f932825f9
[MLIR][EmitC] arith-to-emitc: Fix lowering of fptoui (#118504)
`arith.fptoui %arg0 : f32 to i16` was lowered to
```
%0 = emitc.cast %arg0 : f32 to ui32
emitc.cast %0 : ui32 to i16
```
and is now lowered to
```
%0 = emitc.cast %arg0 : f32 to ui16
emitc.cast %0 : ui16 to i16
```
2024-12-05 14:50:35 +01:00
Tyler Nowicki
8e66344448
[Support] Use macro var args to allow templates within DEBUG_WITH_TYPE (#117614)
Use variadic args with DEBUG_WITH_TYPE("name", ...) macros to resolve a
compilation failure that occurs when using a comma within the last macro
argument. Commas come up when instantiating templates such as
SmallMapVector that require multiple template args.
2024-12-05 08:48:42 -05:00
Nikita Popov
462cb3cd6c
[InstCombine] Infer nusw + nneg -> nuw for getelementptr (#111144)
If the gep is nusw (usually via inbounds) and the offset is
non-negative, we can infer nuw.

Proof: https://alive2.llvm.org/ce/z/ihztLy
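
The inference can also be checked by brute force on a toy model. The sketch below assumes a simplified 8-bit address space with a single byte offset (no index scaling), so it illustrates the argument rather than reproducing the Alive2 proof; all names are hypothetical.

```python
BITS = 8
MOD = 1 << BITS


def as_signed(raw: int) -> int:
    """Interpret an 8-bit pattern as a two's-complement value."""
    return raw - MOD if raw >= MOD // 2 else raw


def add_is_nusw(base: int, offset: int) -> bool:
    """Unsigned base plus signed offset stays inside the address space."""
    return 0 <= base + as_signed(offset) < MOD


def add_is_nuw(base: int, offset: int) -> bool:
    """Unsigned base plus unsigned offset does not wrap."""
    return base + offset < MOD


def inference_holds() -> bool:
    """Check nusw + non-negative offset => nuw for every 8-bit pair."""
    for base in range(MOD):
        for offset in range(MOD):
            if as_signed(offset) >= 0 and add_is_nusw(base, offset):
                if not add_is_nuw(base, offset):
                    return False
    return True
```

Without the non-negativity condition the implication fails in this model: `add_is_nusw(10, 0xFF)` holds because the offset is -1, while `add_is_nuw(10, 0xFF)` does not.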
2024-12-05 14:36:40 +01:00
Kai Nacke
f85be32613
[SystemZ] SIMM32 is a signed constant (#118634)
A follow-up to PR #117181: SIMM32 must use getSignedTargetConstant(),
too.
2024-12-05 08:35:27 -05:00
jeanPerier
ff78cd5f3d
[flang] fix private pointers and default initialized variables (#118494)
Both OpenMP privatization and DO CONCURRENT LOCAL lowering were incorrect
for pointers and derived types with default initialization.

For pointers, the descriptor was not established with the rank/type
code/element size, leading to undefined behavior if any inquiry was made
to it prior to a pointer assignment (and if/when using the runtime for
pointer assignments, the descriptor must have been established).

For derived types with default initialization, the copies were not
default initialized.
2024-12-05 14:09:48 +01:00
Florian Hahn
ffb1c21bd4
[Matrix] Fix crash in liftTranspose when instructions are folded.
Builder.Create(F)Add may constant fold the inputs and return a constant
instead of an instruction. Account for that instead of crashing.
2024-12-05 12:57:54 +00:00
Krzysztof Parzyszek
da6099c9ad
[flang][test] Recognize !$acc and !$omp spelled with capital letters (#118666)
If there are any continuation lines in the source, they will be printed
by the unparser with capital letters (at least in case of OpenMP). To
avoid having them stripped out, recognize their spellings using capital
letters as well.

---------

Co-authored-by: Michael Kruse <github@meinersbur.de>
2024-12-05 06:44:38 -06:00
Zahira Ammarguellat
44433147d6
[NFC] Fix uninitialized scalar field in constructor. (#118324)
Non-static class field is not initialized in constructor.
2024-12-05 07:42:02 -05:00
Pengcheng Wang
db9057edca
[Sched] Skip MemOp with unknown size when clustering (#118443)
In #83875, we changed the type of `Width` to `LocationSize`. To get
the cluster bytes, we use `LocationSize::getValue()` to calculate
the value.

But when `Width` is an unknown size `LocationSize`, an assertion
"Getting value from an unknown LocationSize!" will be triggered.

This patch simply skips MemOp with unknown size to fix this issue
and keep the logic the same as before.

This issue was found when implementing software pipeliner for
RISC-V in #117546. The pipeliner may clone some memory operations
with `BeforeOrAfterPointer` size.
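
A toy model of the skip, assuming a hypothetical `MemOp` record in which `None` stands in for an unknown `LocationSize`. It is not the scheduler's code; it only shows the shape of the guard that avoids asking an unknown size for its value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemOp:
    name: str
    width: Optional[int]  # None stands in for an unknown LocationSize


def clusterable(ops: List[MemOp]) -> List[MemOp]:
    """Keep only memory operations whose size is known."""
    return [op for op in ops if op.width is not None]


def cluster_bytes(ops: List[MemOp]) -> int:
    """Sum the widths of the operations that survive the filter."""
    return sum(op.width for op in clusterable(ops))


ops = [MemOp("ld1", 8), MemOp("ld2", None), MemOp("ld3", 4)]
print(cluster_bytes(ops))  # 12: the unknown-size op is skipped
```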
2024-12-05 20:14:58 +08:00
Timm Baeder
b6217f67a4
[clang][bytecode] Fix bitcasting from null pointers (#116999)
2024-12-05 13:13:59 +01:00
Jacek Caban
71bbafba31
[LLD][COFF] Add basic ARM64X dynamic relocations support (#118035)
This modifies the machine field in the hybrid view to be AMD64, aligning
it with expectations from ARM64EC modules. While this provides initial
support, additional relocations will be necessary for full
functionality. Many of these cases depend on implementing separate
namespace support first.

Move clearing of the .reloc section from addBaserels to assignAddresses
to ensure it is always cleared, regardless of the relocatable
configuration. This change also clarifies the reasoning for adding the
dynamic relocations chunk in that location.
2024-12-05 13:07:41 +01:00
Yingwei Zheng
59720dc703
[InstCombine] Fold icmp spred (X *nsw Z), (Y *nsw Z) -> icmp pred Z, 0 if scmp(X, Y) is known (#118726)
```
icmp spred (X *nsw Z), (Y *nsw Z) -> icmp swap(spred) Z, 0 if X s< Y
icmp spred (X *nsw Z), (Y *nsw Z) -> icmp spred       Z, 0 if X s> Y
```
Alive2: https://alive2.llvm.org/ce/z/F2D0GE
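
As a sanity check alongside the Alive2 proof, the two rules can be exercised exhaustively over a small range. The sketch uses Python's unbounded integers, which models the `nsw` assumption that neither multiplication overflows; the function names are hypothetical.

```python
import operator
from itertools import product

PREDS = {"slt": operator.lt, "sle": operator.le,
         "sgt": operator.gt, "sge": operator.ge}
SWAP = {"slt": "sgt", "sle": "sge", "sgt": "slt", "sge": "sle"}


def fold(pred: str, x: int, y: int, z: int):
    """Apply the fold; return None when X == Y, where it does not apply."""
    if x < y:
        return PREDS[SWAP[pred]](z, 0)
    if x > y:
        return PREDS[pred](z, 0)
    return None


def fold_matches_original(limit: int = 6) -> bool:
    """Compare the folded result with icmp pred (X * Z), (Y * Z)."""
    values = range(-limit, limit + 1)
    for pred, x, y, z in product(PREDS, values, values, values):
        folded = fold(pred, x, y, z)
        if folded is not None and folded != PREDS[pred](x * z, y * z):
            return False
    return True
```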
2024-12-05 19:59:31 +08:00
Nikita Popov
3740fac0d4
Revert "[clang-format] Add cmake target clang-format-style-options for updating ClangFormatStyleOptions.rst (#111513)"
Breaks the build when docs are not enabled.

This reverts commit f7560ee97b7441eb3f5b2d0744aad857fafa5855.
This reverts commit 6bec1806c9cc90f6e72fc04698f4221c86c5f95e.
2024-12-05 12:45:44 +01:00
Simon Pilgrim
dd7a3d4d79
[X86] Extend #118680 - support f16/bf16 fabs/fneg load-store patterns
2024-12-05 10:31:56 +00:00
Simon Pilgrim
ed9915ffdf
[X86] fp16-libcalls.ll - regenerate test checks with vpternlog comments
2024-12-05 10:31:56 +00:00
Simon Pilgrim
722a568432
[X86] Add test coverage for f16/bf16 fabs/fneg load-store tests
Future extension to #118680
2024-12-05 10:31:55 +00:00
Michael Kruse
0cda970ecc
[Flang][NFC] Split common headers to reduce dependencies. (#110244)
Fortran.h and target.h define symbols, some of which are used by both the Fortran runtime (Flang-RT) and the Fortran compiler (Flang), while others are used by Flang only. With the upcoming refactoring of the Fortran runtime into its own subproject (#110217), move the declarations that are used by both into new headers to minimize the amount of code that will need to be shared by Flang-RT and Flang.

Details:

 * `Fortran.h`: Flang-RT only uses some enum definitions out of this file, but not `AsFortran`, which is defined in `Fortran.cpp`. Moving the enums into `Fortran-consts.h` allows keeping `Fortran.cpp` within Flang.

 * `target.h`: Contains some floating-point definitions that are used by the non-GTest unittests in `fp-testing.h`. Flang-RT also uses some non-GTest unittests. Moving those definitions avoids the dependence on the entire FortranEvaluate library.
2024-12-05 11:29:32 +01:00
Timm Bäder
17dfdd3a86
[clang][bytecode][tests] Specify triple in bitfields tests
This still breaks on 32bit hosts otherwise.
See https://github.com/llvm/llvm-project/pull/116843
2024-12-05 11:10:58 +01:00
Sam Elliott
65ced158e9
[RISCV] Remove R_RISCV_RVC_LUI Relocation (#118714)
This was removed from the ABI in riscv-non-isa/riscv-elf-psabi-doc#398.
It is not emitted by LLVM, and seems to have been an internal
implementation detail in binutils.

This is a follow-up to 26ec5da744b8 which removed previous binutils
internal relocations when they were removed from the ABI.

The LLD implementation was not tested when it was added in
https://reviews.llvm.org/D39322
2024-12-05 10:10:27 +00:00
Florian Hahn
0772a0bd29
Revert "[memprof] Update YAML traits for writer purposes (#118720)"
This reverts commit 7b8cf147addf7d3fb4630475c40153226f5fdbd0.

Breaks building on macOS
    https://lab.llvm.org/buildbot/#/builders/190/builds/10737
    https://lab.llvm.org/buildbot/#/builders/23/builds/5491
    https://green.lab.llvm.org/job/llvm.org/job/clang-stage1-cmake-RA-incremental/6076/
2024-12-05 09:54:07 +00:00
WANG Rui
487a070beb
[LoongArch][NFC] Pre-commit tests for sign-extension removal with div32 enabled
2024-12-05 17:40:28 +08:00
Pavel Labath
15de77db91
[lldb] (Prepare to) speed up dwarf indexing (#118657)
Indexing a single DWARF unit is a fairly small task, which means the
overhead of enqueueing a task for each unit is not negligible (mainly
because it introduces a lot of synchronization points for queue management,
memory allocation etc.). This is particularly true if the binary was
built with type units, as these are usually very small.

This essentially brings us back to the state before
https://reviews.llvm.org/D78337, but the new implementation is built on
the llvm ThreadPool, and I've added a small improvement -- we now
construct one "index set" per thread instead of one per unit, which
should lower the memory usage (fewer small allocations) and make the
subsequent merge step faster.
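
The "one index set per thread" idea can be sketched as follows. This is an illustration under stated assumptions, not lldb's implementation: units are modeled as hypothetical `(unit_id, names)` pairs, and Python's `ThreadPoolExecutor` stands in for the llvm ThreadPool.

```python
from concurrent.futures import ThreadPoolExecutor


def index_units(units, num_workers=4):
    """Build a name -> unit-id index with one partial index per worker."""
    # Give each worker a batch of units instead of one task per unit.
    chunks = [units[i::num_workers] for i in range(num_workers)]

    def index_chunk(chunk):
        index = {}  # a single index set shared by every unit in the batch
        for unit_id, names in chunk:
            for name in names:
                index.setdefault(name, []).append(unit_id)
        return index

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partials = list(pool.map(index_chunk, chunks))

    # Merge step: only num_workers partial indexes to combine.
    merged = {}
    for partial in partials:
        for name, ids in partial.items():
            merged.setdefault(name, []).extend(ids)
    return {name: sorted(ids) for name, ids in merged.items()}
```

With many small units, the merge loop runs over `num_workers` dictionaries rather than one per unit, which is the effect the paragraph above describes.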

On its own this patch doesn't actually change the performance
characteristics because we still have one choke point -- progress
reporting. I'm leaving that for a separate patch, but I've found that
simply removing the progress reporting gives us about a 30-60% speed
boost.
2024-12-05 10:38:29 +01:00
Simon Pilgrim
6caf9f8236
[X86] combineStore - fold scalar float store(fabs/fneg(load())) -> store(and/xor(load(),c)) (#118680)
As noted on #117557 - it's not worth performing scalar float fabs/fneg on the fpu if we're not doing any other fp ops.

This is currently limited to store + load pairs - I could try to extend this further if necessary, but we need to be careful that we don't end up in an infinite loop with the DAGCombiner foldBitcastedFPLogic combine.
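
The bit-level identity behind the fold can be sketched for 32-bit floats: fabs clears the sign bit with an AND mask and fneg flips it with an XOR mask. The helper names are hypothetical, and the sketch only demonstrates the arithmetic, not the DAG combine itself.

```python
import struct


def f32_to_bits(x: float) -> int:
    """Reinterpret a float as its IEEE-754 binary32 bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bits_to_f32(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE-754 binary32 float."""
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def fabs_via_and(x: float) -> float:
    """store(fabs(load)) modeled as an integer AND that clears the sign bit."""
    return bits_to_f32(f32_to_bits(x) & 0x7FFFFFFF)


def fneg_via_xor(x: float) -> float:
    """store(fneg(load)) modeled as an integer XOR that flips the sign bit."""
    return bits_to_f32(f32_to_bits(x) ^ 0x80000000)


print(fabs_via_and(-1.5))  # 1.5
print(fneg_via_xor(2.0))   # -2.0
```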

Fixes #117557
2024-12-05 09:36:21 +00:00
Yuanqiang Liu
2e51e150e1
[MLIR][Python] enhance python ir printing with printing flags (#117836)
Close https://github.com/llvm/llvm-project/pull/65854
2024-12-05 10:31:04 +01:00
Andrzej Warzyński
a2acb2ff8b
[mlir][linalg] Fix vectorization of tensor.extract (#118105)
The example below demonstrates a "scalar read followed by a broadcast"
pattern for `tensor.extract`:

```mlir
 #map = affine_map<(d0, d1, d2) -> (d0, d1, d2)>
 func.func @scalar_broadcast(
    %init : tensor<1x1x3xi32>,
    %src: tensor<1x3x2x4xi32>,
    %idx :index) -> tensor<1x1x3xi32> {

   %c0 = arith.constant 0 :index

   %res = linalg.generic {
     indexing_maps = [#map],
     iterator_types = ["parallel", "parallel", "parallel"]}
     outs(%init : tensor<1x1x3xi32>) {
     ^bb0(%out: i32):
       %val = tensor.extract %src[%idx, %idx, %idx, %idx] : tensor<1x3x2x4xi32>
       linalg.yield %val : i32
   } -> tensor<1x1x3xi32>

   return %res : tensor<1x1x3xi32>
}
```

The default masking path within the Linalg vectorizer, which assumes an
identity masking map, is not suitable here. Indeed:

* identity != broadcast.

This patch ensures masking is handled in the `vectorizeTensorExtract`
hook, which has the necessary context for proper handling.

Fixes #116197
2024-12-05 09:24:53 +00:00
Kadir Cetinkaya
c7ef0ac9fd
[clangd] Drop required attributes from ContainedRef protos
Per https://protobuf.dev/programming-guides/dos-donts/#add-required this
is discouraged and we already handle errors when marshalling protos.
This also ensures new message types are consistent with the rest in the
file.
2024-12-05 10:18:19 +01:00
Nathan Ridge
61fe67a401
[clangd] support outgoing calls in call hierarchy (#117673)
This reverts commit ce0f11325e0c62c5b81391589e9b93b412a85bc1.
2024-12-05 10:10:42 +01:00
Christian Kandeler
5d38a3406b
[clangd] Consolidate two functions converting index to LSP locations (#117885)
2024-12-05 10:02:59 +01:00
serge-sans-paille
3a8ada67ff
[clang][NFC] Fix miscellaneous typos in release notes
2024-12-05 09:59:57 +01:00
Benjamin Maxwell
a9eb8f0e3d
[mlir][ArmSME] Fix crash on empty vector.mask in arm-sme-vector-legalization (#118613)
Fixes #118449
2024-12-05 08:48:30 +00:00
Daniil Kovalev
41cde465ac
[PAC][Driver] Add -faarch64-jump-table-hardening flag (#113149)
The flag is placed together with pointer authentication flags since they
serve the same security purpose of protecting against attacks on control
flow. The flag is not ABI-affecting and might be enabled separately if
needed, but it's also intended to be enabled as part of pauth-enabled
environments (e.g. pauthtest).

See also codegen implementation #97666.
2024-12-05 11:34:29 +03:00
Callum Fare
fd3907ccb5
Reland #118503: [Offload] Introduce offload-tblgen and initial new API implementation (#118614)
Reland #118503. Added a fix for builds with `-DBUILD_SHARED_LIBS=ON`
(see last commit). Otherwise the changes are identical.

---


### New API

Previous discussions at the LLVM/Offload meeting have brought up the
need for a new API for exposing the functionality of the plugins. This
change introduces a very small subset of a new API, which is primarily
for testing the offload tooling and demonstrating how a new API can fit
into the existing code base without being too disruptive. Exact designs
for these entry points and future additions can be worked out over time.

The new API does however introduce the bare minimum functionality to
implement device discovery for Unified Runtime and SYCL. This means that
the `urinfo` and `sycl-ls` tools can be used on top of Offload. A
(rough) implementation of a Unified Runtime adapter (aka plugin) for
Offload is available
[here](https://github.com/callumfare/unified-runtime/tree/offload_adapter).
Our intention is to maintain this and use it to implement and test
Offload API changes with SYCL.

### Demoing the new API

```sh
# From the runtime build directory
$ ninja LibomptUnitTests
$ OFFLOAD_TRACE=1 ./offload/unittests/OffloadAPI/offload.unittests 
```


### Open questions and future work
* Only some of the available device info is exposed, and not all the
possible device queries needed for SYCL are implemented by the plugins.
A sensible next step would be to refactor and extend the existing device
info queries in the plugins. The existing info queries are all strings,
but the new API introduces the ability to return any arbitrary type.
* It may be sensible at some point for the plugins to implement the new
API directly, and the higher level code on top of it could be made
generic, but this is more of a long-term possibility.
2024-12-05 09:34:04 +01:00
Feng Zou
636beb6a28
[X86][LLD] Handle R_X86_64_CODE_6_GOTTPOFF relocation type (#117675)
For

    add %reg1, name@GOTTPOFF(%rip), %reg2
    add name@GOTTPOFF(%rip), %reg1, %reg2
    {nf} add %reg1, name@GOTTPOFF(%rip), %reg2
    {nf} add name@GOTTPOFF(%rip), %reg1, %reg2
    {nf} add name@GOTTPOFF(%rip), %reg

add

    R_X86_64_CODE_6_GOTTPOFF = 50

in #117277.

The linker can treat R_X86_64_CODE_6_GOTTPOFF as R_X86_64_GOTTPOFF or
convert the instructions above to

    add $name@tpoff, %reg1, %reg2
    add $name@tpoff, %reg1, %reg2
    {nf} add $name@tpoff, %reg1, %reg2
    {nf} add $name@tpoff, %reg1, %reg2
    {nf} add $name@tpoff, %reg

if the first byte of the instruction at the relocation offset - 6 is
0x62 (namely, encoded w/EVEX prefix) when possible.
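
The prefix test described above can be sketched as a small predicate. The function name and the byte strings are hypothetical placeholders (not real instruction encodings), and this is not LLD's code; it only shows the check on the byte at relocation offset - 6.

```python
EVEX_PREFIX = 0x62


def starts_with_evex(section: bytes, reloc_offset: int) -> bool:
    """True if the byte 6 positions before the relocation is 0x62."""
    if not 6 <= reloc_offset <= len(section):
        return False  # no room for a 6-byte instruction prefix and opcode
    return section[reloc_offset - 6] == EVEX_PREFIX


# Placeholder bytes: 6 bytes of "instruction" followed by a 4-byte field.
evex_like = bytes([0x62, 0x00, 0x00, 0x00, 0x00, 0x00, 0, 0, 0, 0])
other = bytes([0x48, 0x00, 0x00, 0x00, 0x00, 0x00, 0, 0, 0, 0])
print(starts_with_evex(evex_like, 6))  # True
print(starts_with_evex(other, 6))      # False
```

When the predicate is false, the relocation would simply be handled like R_X86_64_GOTTPOFF, as the message above states.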

Binutils patch: bminor/binutils-gdb@5bc71c2
Binutils mailthread:
https://sourceware.org/pipermail/binutils/2024-February/132351.html
ABI discussion:
https://groups.google.com/g/x86-64-abi/c/FhEZjCtDLFw/m/VHDjN4orAgAJ
Blog: https://kanrobert.github.io/rfc/All-about-APX-relocation
2024-12-05 16:26:26 +08:00
Aiden Grossman
a9a4a83b61
[clang-format] Add test to ensure formatting options docs are updated (#118154)
This patch adds a lit test to clang format to ensure that the
ClangFormatStyleOptions doc page has been updated appropriately. The
test just runs the automatic update script and diffs the outputs to
ensure they are the same.
2024-12-04 23:41:12 -08:00
Owen Pan
6bec1806c9
[clang-format] Add plurals.txt to DEPENDS of style_options_depends
2024-12-04 23:02:02 -08:00
Iuri Chaer
f7560ee97b
[clang-format] Add cmake target clang-format-style-options for updating ClangFormatStyleOptions.rst (#111513)
* Create a new `clang-format-style-options` build target which
re-generates ClangFormatStyleOptions.rst from its source header files.

As discussed in
https://github.com/llvm/llvm-project/pull/96804#discussion_r1718407404

---------

Co-authored-by: Owen Pan <owenpiano@gmail.com>
2024-12-04 22:50:01 -08:00
Thorsten Schütt
71ac1eb509
Revert "[GlobalISel] Combine [s,z]ext of undef into 0" (#118746)
Reverts llvm/llvm-project#117439
2024-12-05 07:48:20 +01:00
Renat Idrisov
0629e9e352
[MLIR] Removing dead values for branches (#117501)
Fixing RemoveDeadValues to properly remove arguments from
BranchOpInterface operations.
This is a follow-up for:
https://github.com/llvm/llvm-project/pull/117405
cc: @joker-eph @codemzs

---------

Co-authored-by: Renat Idrisov <parsifal-47@users.noreply.github.com>
2024-12-05 14:05:48 +08:00
Timm Baeder
abc27039be
[clang][bytecode] Pass __builtin_memcpy size along (#118649)
To DoBitCastPtr, so we know how many bytes we want to read.
2024-12-05 06:55:18 +01:00
Craig Topper
3e0e1c13ce
[RISCV][GISel] Support fp128 arithmetic and conversion for RV64. (#118707)
We can support these via libcalls in libgcc/compiler-rt or integer
operations for fneg/fabs/fcopysign. fp128 values will be passed in two
64-bit GPRs according to the psABI.
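
A minimal sketch of how fneg/fabs/fcopysign reduce to integer operations on an fp128 value held as two 64-bit halves. It assumes the IEEE binary128 layout, where the sign is the top bit of the high half, and models a value as a hypothetical `(lo, hi)` pair; which GPR carries which half is not modeled.

```python
SIGN_BIT = 1 << 63
MASK64 = (1 << 64) - 1


def fneg128(value):
    """Flip the sign bit in the high half."""
    lo, hi = value
    return lo, hi ^ SIGN_BIT


def fabs128(value):
    """Clear the sign bit in the high half."""
    lo, hi = value
    return lo, hi & ~SIGN_BIT & MASK64


def fcopysign128(magnitude, sign):
    """Take the magnitude bits of one value and the sign bit of another."""
    lo, hi = magnitude
    _, sign_hi = sign
    return lo, (hi & ~SIGN_BIT & MASK64) | (sign_hi & SIGN_BIT)


ONE = (0, 0x3FFF000000000000)  # binary128 encoding of 1.0
print(hex(fneg128(ONE)[1]))    # 0xbfff000000000000
```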

Supporting RV32 requires sret which is not supported by libcall handling
in LegalizerHelper.cpp yet. It doesn't call canLowerReturn.
2024-12-04 21:43:29 -08:00
Ben Shi
dba0861cd7
[AVR] Simplify encoding of load/store instructions (#118279)
Fixes https://github.com/llvm/llvm-project/issues/113774
2024-12-05 13:05:25 +08:00
Vladimir Vereschaka
a996a15b4c
[CMake] Allow parametrizing of the static libraries in Cross ARM CMake cache. NFC. (#118737)
This supports the cross-ARM remote tests for the LLDB project
(see the 'lldb-remote-linux-*' public builders for details).
2024-12-04 21:04:29 -08:00
Timm Baeder
44be794658
[clang][bytecode] Not all null pointers are 0 (#118601)
Get the Value from the ASTContext instead.
2024-12-05 06:03:50 +01:00
Kareem Ergawy
0993335134
[OpenMP][OMPIRBuilder] Add delayed privatization support for wsloop (#118463)
Extend MLIR to LLVM lowering by adding support for `omp.wsloop` for
delayed privatization. This also refactors a few bits of code to isolate
the logic needed for `firstprivate` initialization in a shared util that
can be used across constructs that need it. The same is done for
`dealloc` regions.

Parent PR: https://github.com/llvm/llvm-project/pull/118447. Only latest
commit is relevant for this PR.
2024-12-05 05:59:52 +01:00
Kazu Hirata
50f8580e2c
[memprof] Add IndexedMemProfData::addFrame (#118724)
This patch adds a helper function to replace an idiom like:

  FrameId Id = F.hash();
  MemProfData.Frames.try_emplace(Id, F);
  // Do something with Id.
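
The shape of such a helper can be sketched in Python as an analogy. The class and method names below are hypothetical stand-ins, not the memprof API: the helper hashes the frame, inserts it only if the id is new (the analogue of `try_emplace`), and hands the id back to the caller.

```python
class FrameTable:
    """Toy stand-in for a container that maps frame ids to frames."""

    def __init__(self):
        self.frames = {}

    def add_frame(self, frame):
        """Insert `frame` if its id is new and return the id."""
        frame_id = hash(frame)
        self.frames.setdefault(frame_id, frame)  # keeps an existing entry
        return frame_id


table = FrameTable()
first = table.add_frame(("foo", 10, 2))
second = table.add_frame(("foo", 10, 2))
print(first == second, len(table.frames))  # True 1
```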
2024-12-04 20:33:35 -08:00
Kareem Ergawy
7f72d71de7
[OpenMP][OMPIRBuilder] Refactor reduction initialization logic into one util (#118447)
This refactors the logic needed to emit init logic for reductions by
moving some duplicated code into a shared util. The logic for doing so is
quite involved and is needed for any construct that has reductions.
Moreover, when a construct has both private and reduction clauses, both
sets of clauses need to cooperate with each other when emitting the
logic needed for allocation and initialization. Therefore, this PR
clearly sets the boundaries for the logic needed to initialize
reductions.
2024-12-05 05:23:49 +01:00
Kazu Hirata
7b8cf147ad
[memprof] Update YAML traits for writer purposes (#118720)
For Frames, we prefer the inline notation for brevity.

For PortableMemInfoBlock, we go through all member fields and print
out those that are populated.
2024-12-04 19:23:27 -08:00