16683 Commits

Author SHA1 Message Date
Vlad Serebrennikov
eaff01f4fc [clang][NFC] Annotate CGExprCXX.cpp with preferred_type
This helps debuggers to display values in bit-fields in a more helpful way.
2024-02-11 15:03:03 +03:00
Vlad Serebrennikov
1ed37606ca [clang][NFC] Annotate CGCleanup.h with preferred_type
This helps debuggers to display values in bit-fields in a more helpful way.
2024-02-11 12:20:34 +03:00
Vlad Serebrennikov
866e073c28 [clang][NFC] Annotate CGRecordLayout.h with preferred_type
This helps debuggers to display values in bit-fields in a more helpful way.
2024-02-11 12:14:31 +03:00
Vlad Serebrennikov
35737beaef [clang][NFC] Annotate CodeGenFunction.h with preferred_type
This helps debuggers to display values in bit-fields in a more helpful way.
2024-02-11 12:11:49 +03:00
Vlad Serebrennikov
fd80304763 [clang][NFC] Annotate CGCUDARuntime.h with preferred_type
This helps debuggers to display values in bit-fields in a more helpful way.
2024-02-11 12:07:27 +03:00
Vlad Serebrennikov
ba0d35181c [clang][NFC] Annotate CGCall.h with preferred_type
This helps debuggers to display values in bit-fields in a more helpful way.
2024-02-11 12:04:55 +03:00
Jon Roelofs
99d743320c
[clang][fmv] Drop .ifunc from target_version's entrypoint's mangling (#81194)
Fixes: https://github.com/llvm/llvm-project/issues/81043
2024-02-09 08:13:15 -08:00
Jan Svoboda
da95d926f6
[clang][lex] Always pass suggested module to InclusionDirective() callback (#81061)
This patch provides more information to the
`PPCallbacks::InclusionDirective()` hook. We now always pass the
suggested module, regardless of whether it was actually imported or not.
The extra `bool ModuleImported` parameter then denotes whether the
header `#include` will be automatically translated into import the the
module.

The main change is in `clang/lib/Lex/PPDirectives.cpp`, where we take
care to not modify `SuggestedModule` after it's been populated by
`LookupHeaderIncludeOrImport()`. We now exclusively use the `SM`
(`ModuleToImport`) variable instead, which has been equivalent to
`SuggestedModule` until now. This allows us to use the original
non-modified `SuggestedModule` for the callback itself.

(This patch turns out to be necessary for
https://github.com/apple/llvm-project/pull/8011).
2024-02-08 10:19:18 -08:00
Cooper Partin
16d1a6486c
[DirectX] Fix HLSL bitshifts to leverage the OpenCL pipeline for bitshifting (#81030)
Fixes #55106

In HLSL bit shifts are defined to shift by shift size % type size. This
contains the following changes:

HLSL codegen bit shifts will be emitted as x << (y & (sizeof(x) - 1) and
bitshift masking leverages the OpenCL pipeline for this.

Tests were also added to validate this behavior.


Before this change the following was being emitted:
; Function Attrs: noinline nounwind optnone
define noundef i32 @"?shl32@@YAHHH@Z"(i32 noundef %V, i32 noundef %S) #0
{
entry:
  %S.addr = alloca i32, align 4
  %V.addr = alloca i32, align 4
  store i32 %S, ptr %S.addr, align 4
  store i32 %V, ptr %V.addr, align 4
  %0 = load i32, ptr %V.addr, align 4
  %1 = load i32, ptr %S.addr, align 4
  %shl = shl i32 %0, %1
  ret i32 %shl
}

After this change:
; Function Attrs: noinline nounwind optnone
define noundef i32 @"?shl32@@YAHHH@Z"(i32 noundef %V, i32 noundef %S) #0
{
entry:
  %S.addr = alloca i32, align 4
  %V.addr = alloca i32, align 4
  store i32 %S, ptr %S.addr, align 4
  store i32 %V, ptr %V.addr, align 4
  %0 = load i32, ptr %V.addr, align 4
  %1 = load i32, ptr %S.addr, align 4
  %shl.mask = and i32 %1, 31
  %shl = shl i32 %0, %shl.mask
  ret i32 %shl
}

---------

Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>
2024-02-08 11:50:21 -06:00
Shilei Tian
c4b0dfcc99
[Clang] Fix a non-effective assertion (#81083)
`PTy` here is literally `FTy->getParamType(i)`, which makes this
assertion not
work as expected.
2024-02-08 09:44:42 -05:00
Adam Magier
5f87957fef
[clang][CodeGen][UBSan] Fixing shift-exponent generation for _BitInt (#80515)
Testing the shift-exponent check with small width _BitInt values exposed
a bug in ScalarExprEmitter::GetWidthMinusOneValue when using the result
to determine valid exponent sizes. False positives were reported for
some left shifts when width(LHS)-1 > range(RHS) and false negatives were
reported for right shifts when value(RHS) > range(LHS). This patch caps
the maximum value of GetWidthMinusOneValue to fit within range(RHS) to
fix the issue with left shifts and fixes a code generation in EmitShr to
fix the issue with right shifts and renames the function to
GetMaximumShiftAmount to better reflect the new behaviour.

Fixes #80135.

Co-authored-by: Adam Magier <adam.magier@ericsson.com>
2024-02-06 13:16:55 -06:00
weiguozhi
c166a43c6e
New calling convention preserve_none (#76868)
The new experimental calling convention preserve_none is the opposite
side of existing preserve_all. It tries to preserve as few general
registers as possible. So all general registers are caller saved
registers. It can also uses more general registers to pass arguments.
This attribute doesn't impact floating-point registers. Floating-point
registers still follow the c calling convention.

Currently preserve_none is supported on X86-64 only. It changes the c
calling convention in following fields:
  
* RSP and RBP are the only preserved general registers, all other
general registers are caller saved registers.
* We can use [RDI, RSI, RDX, RCX, R8, R9, R11, R12, R13, R14, R15, RAX]
to pass arguments.

It can improve the performance of hot tailcall chain, because many
callee saved registers' save/restore instructions can be removed if the
tail functions are using preserve_none. In my experiment in protocol
buffer, the parsing functions are improved by 3% to 10%.
2024-02-05 13:28:43 -08:00
Mészáros Gergely
5942868a21
[clang][AMDGPU][CUDA] Handle __builtin_printf for device printf (#68515)
Previously `__builtin_printf` would result to emitting call to `printf`,
even though directly calling `printf` was translated.

Ref: #68478
2024-02-05 23:23:13 +05:30
Dani
6e3e8856d4
[NFC][Clang] Replace Arch with Triplet. (#80465) 2024-02-05 08:25:30 +01:00
Pierre van Houtryve
500846d2f5
[AMDGPU] Introduce Code Object V6 (#76954)
Introduce Code Object V6 in Clang, LLD, Flang and LLVM. This is the same
as V5 except a new "generic version" flag can be present in EFLAGS. This
is related to new generic targets that'll be added in a follow-up patch.
It's also likely V6 will have new changes (possibly new metadata
entries) added later.

Docs change are part of the follow-up patch #76955
2024-02-05 08:19:53 +01:00
Yingwei Zheng
a3d8b78333
[Clang][CodeGen] Mark __dynamic_cast as willreturn (#80409)
According to the C++ standard, `dynamic_cast` of pointers either returns
a pointer (7.6.1.7) or results in undefined behavior (11.9.5). This
patch marks `__dynamic_cast` as `willreturn` to remove unused calls.

Fixes #77606.
2024-02-04 11:31:50 +08:00
Brandon Wu
f5154b9c98
[clang][RISCV] Enable struct of homogeneous scalable vector as function argument (#78550)
llvm IR supports struct as function input, so RISCV tuple
type can just use struct of homogeneous scalable vector instead
of flatten them.
2024-02-03 17:57:15 +08:00
ManuelvOK
c07fcd45f1 [Coverage] Map regions from system headers (#76950)
In 2155195131a57f2f01e7cfabb85bb027518c2dc6, the
"system-headers-coverage" option has been added but not used in all
necessary places.

This is the recommit since it has been reverted in
faef68bca852d08511ea0311d8a0d221cb202e73

Potential reviewers: @gulfemsavrun @petrhosek

Co-authored-by: Manuel Kalettka <manuel.kalettka@kernkonzept.com>
2024-02-02 18:04:24 +09:00
Rahman Lavaee
acec6419e8
[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128)
Today `-split-machine-functions` and `-fbasic-block-sections={all,list}`
cannot be combined with `-basic-block-sections=labels` (the labels
option will be ignored).
The inconsistency comes from the way basic block address map -- the
underlying mechanism for basic block labels -- encodes basic block
addresses
(https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html).
Specifically, basic block offsets are computed relative to the function
begin symbol. This relies on functions being contiguous which is not the
case for MFS and basic block section binaries. This means Propeller
cannot use binary profiles collected from these binaries, which limits
the applicability of Propeller for iterative optimization.
    
To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section
binaries, we propose modifying the encoding of this section as follows.

First let us review the current encoding which emits the address of each
function and its number of basic blocks, followed by basic block entries
for each basic block.

| | |
|--|--|
| Address of the function | Function Address |
|  Number of basic blocks in this function | NumBlocks |
|  BB entry 1
|  BB entry 2
|   ...
|  BB entry #NumBlocks
    
To make this work for basic block sections, we treat each basic block
section similar to a function, except that basic block sections of the
same function must be encapsulated in the same structure so we can map
all of them to their single function.
    
We modify the encoding to first emit the number of basic block sections
(BB ranges) in the function. Then we emit the address map of each basic
block section section as before: the base address of the section, its
number of blocks, and BB entries for its basic block. The first section
in the BB address map is always the function entry section.
| | |
|--|--|
|  Number of sections for this function   | NumBBRanges |
| Section 1 begin address                     | BaseAddress[1]  |
| Number of basic blocks in section 1 | NumBlocks[1]    |
| BB entries for Section 1
|..................|
| Section #NumBBRanges begin address | BaseAddress[NumBBRanges] |
| Number of basic blocks in section #NumBBRanges |
NumBlocks[NumBBRanges] |
| BB entries for Section #NumBBRanges
    
The encoding of basic block entries remains as before with the minor
change that each basic block offset is now computed relative to the
begin symbol of its containing BB section.
    
This patch adds a new boolean codegen option `-basic-block-address-map`.
Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD
flag `--lto-basic-block-address-map` are introduced.
Analogously, we add a new TargetOption field `BBAddrMap`. This means BB
address maps are either generated for all functions in the compiling
unit, or for none (depending on `TargetOptions::BBAddrMap`).
    
This patch keeps the functionality of the old
`-fbasic-block-sections=labels` option but does not remove it. A
subsequent patch will remove the obsolete option.

We refactor the `BasicBlockSections` pass by separating the BB address
map and BB sections handing to their own functions (named
`handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers
basic blocks and places them in their assigned sections.
`handleBBAddrMap` is invoked after `handleBBSections` (if requested) and
only renumbers the blocks.
  - New tests added:
- Two tests basic-block-address-map-with-basic-block-sections.ll and
basic-block-address-map-with-mfs.ll to exercise the combination of
`-basic-block-address-map` with `-basic-block-sections=list` and
'-split-machine-functions`.
- A driver sanity test for the `-fbasic-block-address-map` option
(basic-block-address-map.c).
- An LLD test for testing the `--lto-basic-block-address-map` option.
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`.
- Renamed and modified the two existing codegen tests for basic block
address map (`basic-block-sections-labels-functions-sections.ll` and
`basic-block-sections-labels.ll`)
- Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of
`SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2
will happen in a separate PR in a few months.
2024-02-01 17:50:46 -08:00
Hana Dusíková
bfc6eaa263
[coverage] fix crash in code coverage and if constexpr with ExprWithCleanups (#80292)
Fixes https://github.com/llvm/llvm-project/issues/80285
2024-02-01 23:31:32 +01:00
Sander de Smalen
d313614b60
[AArch64] Replace LLVM IR function attributes for PSTATE.ZA. (#79166)
Since https://github.com/ARM-software/acle/pull/276 the ACLE
defines attributes to better describe the use of a given SME state.

Previously the attributes merely described the possibility of it being
'shared' or 'preserved', whereas the new attributes have more semantics
and also describe how the data flows through the program.

For ZT0 we already had to add new LLVM IR attributes:
* aarch64_new_zt0
* aarch64_in_zt0
* aarch64_out_zt0
* aarch64_inout_zt0
* aarch64_preserves_zt0

We have now done the same for ZA, such that we add:
* aarch64_new_za       (previously `aarch64_pstate_za_new`)
* aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`)
* aarch64_inout_za (more specific variation of
`aarch64_pstate_za_shared`)
* aarch64_preserves_za (previously `aarch64_pstate_za_shared,
aarch64_pstate_za_preserved`)

This explicitly removes 'pstate' from the name, because with SME2 and
the new ACLE attributes there is a difference between "sharing ZA"
(sharing
the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing
either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).
2024-02-01 13:37:37 +00:00
Kazu Hirata
b67ce7e349 [clang] Use StringRef::starts_with (NFC) 2024-01-31 23:54:09 -08:00
SunilKuravinakop
a74e9ce5dc
[OpenMP] atomic compare weak : Parser & AST support (#79475)
This is a support for " #pragma omp atomic compare weak". It has Parser
& AST support for now.

---------

Authored-by: Sunil Kuravinakop <kuravina@pe28vega.us.cray.com>
2024-01-31 06:32:06 -05:00
Tianlan Zhou
ee01a2c399
[clang] static operators should evaluate object argument (reland) (#80108)
This re-applies 30155fc0 with a fix for clangd.

### Description

clang don't evaluate the object argument of `static operator()` and
`static operator[]` currently, for example:

```cpp
#include <iostream>

struct Foo {
    static int operator()(int x, int y) {
        std::cout << "Foo::operator()" << std::endl;
        return x + y;
    }
    static int operator[](int x, int y) {
        std::cout << "Foo::operator[]" << std::endl;
        return x + y;
    }
};
Foo getFoo() {
    std::cout << "getFoo()" << std::endl;
    return {};
}
int main() {
    std::cout << getFoo()(1, 2) << std::endl;
    std::cout << getFoo()[1, 2] << std::endl;
}
```

`getFoo()` is expected to be called, but clang don't call it currently
(17.0.6). This PR fixes this issue.

Fixes #67976, reland #68485.

### Walkthrough

- **clang/lib/Sema/SemaOverload.cpp**
- **`Sema::CreateOverloadedArraySubscriptExpr` &
`Sema::BuildCallToObjectOfClassType`**
Previously clang generate `CallExpr` for static operators, ignoring the
object argument. In this PR `CXXOperatorCallExpr` is generated for
static operators instead, with the object argument as the first
argument.
  - **`TryObjectArgumentInitialization`**
`const` / `volatile` objects are allowed for static methods, so that we
can call static operators on them.
- **clang/lib/CodeGen/CGExpr.cpp**
  - **`CodeGenFunction::EmitCall`**
CodeGen changes for `CXXOperatorCallExpr` with static operators: emit
and ignore the object argument first, then emit the operator call.
- **clang/lib/AST/ExprConstant.cpp**
  - **`‎ExprEvaluatorBase::handleCallExpr‎`**
Evaluation of static operators in constexpr also need some small changes
to work, so that the arguments won't be out of position.
- **clang/lib/Sema/SemaChecking.cpp**
  - **`Sema::CheckFunctionCall`**
Code for argument checking also need to be modify, or it will fail the
test `clang/test/SemaCXX/overloaded-operator-decl.cpp`.
- **clang-tools-extra/clangd/InlayHints.cpp**
  - **`InlayHintVisitor::VisitCallExpr`**
Now that the `CXXOperatorCallExpr` for static operators also have object
argument, we should also take care of this situation in clangd.

### Tests

- **Added:**
    - **clang/test/AST/ast-dump-static-operators.cpp**
      Verify the AST generated for static operators.
    - **clang/test/SemaCXX/cxx2b-static-operator.cpp**
Static operators should be able to be called on const / volatile
objects.
- **Modified:**
    - **clang/test/CodeGenCXX/cxx2b-static-call-operator.cpp**
    - **clang/test/CodeGenCXX/cxx2b-static-subscript-operator.cpp**
      Matching the new CodeGen.

### Documentation

- **clang/docs/ReleaseNotes.rst**
  Update release notes.

---------

Co-authored-by: Shafik Yaghmour <shafik@users.noreply.github.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2024-01-31 15:27:06 +08:00
Aaron Ballman
201eb2b577 Revert "[clang] static operators should evaluate object argument (#68485)"
This reverts commit 30155fc0ef4fbdce2d79434aaae8d58b2fabb20a.

It seems to have broken some tests in clangd:
http://45.33.8.238/linux/129484/step_9.txt
2024-01-30 13:38:18 -05:00
Tianlan Zhou
30155fc0ef
[clang] static operators should evaluate object argument (#68485)
### Description

clang don't evaluate the object argument of `static operator()` and
`static operator[]` currently, for example:

```cpp
#include <iostream>

struct Foo {
    static int operator()(int x, int y) {
        std::cout << "Foo::operator()" << std::endl;
        return x + y;
    }
    static int operator[](int x, int y) {
        std::cout << "Foo::operator[]" << std::endl;
        return x + y;
    }
};
Foo getFoo() {
    std::cout << "getFoo()" << std::endl;
    return {};
}
int main() {
    std::cout << getFoo()(1, 2) << std::endl;
    std::cout << getFoo()[1, 2] << std::endl;
}
```

`getFoo()` is expected to be called, but clang don't call it currently
(17.0.2). This PR fixes this issue.

Fixes #67976.

### Walkthrough

- **clang/lib/Sema/SemaOverload.cpp**
- **`Sema::CreateOverloadedArraySubscriptExpr` &
`Sema::BuildCallToObjectOfClassType`**
Previously clang generate `CallExpr` for static operators, ignoring the
object argument. In this PR `CXXOperatorCallExpr` is generated for
static operators instead, with the object argument as the first
argument.
  - **`TryObjectArgumentInitialization`**
`const` / `volatile` objects are allowed for static methods, so that we
can call static operators on them.
- **clang/lib/CodeGen/CGExpr.cpp**
  - **`CodeGenFunction::EmitCall`**
CodeGen changes for `CXXOperatorCallExpr` with static operators: emit
and ignore the object argument first, then emit the operator call.
- **clang/lib/AST/ExprConstant.cpp**
  - **`‎ExprEvaluatorBase::handleCallExpr‎`**
Evaluation of static operators in constexpr also need some small changes
to work, so that the arguments won't be out of position.
- **clang/lib/Sema/SemaChecking.cpp**
  - **`Sema::CheckFunctionCall`**
Code for argument checking also need to be modify, or it will fail the
test `clang/test/SemaCXX/overloaded-operator-decl.cpp`.

### Tests

- **Added:**
    - **clang/test/AST/ast-dump-static-operators.cpp**
      Verify the AST generated for static operators.
    - **clang/test/SemaCXX/cxx2b-static-operator.cpp**
Static operators should be able to be called on const / volatile
objects.
- **Modified:**
    - **clang/test/CodeGenCXX/cxx2b-static-call-operator.cpp**
    - **clang/test/CodeGenCXX/cxx2b-static-subscript-operator.cpp**
      Matching the new CodeGen.

### Documentation

- **clang/docs/ReleaseNotes.rst**
  Update release notes.

---------

Co-authored-by: Shafik Yaghmour <shafik@users.noreply.github.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2024-01-30 13:09:05 -05:00
Timm Bäder
c61686e8ab [clang][NFC] Use no-param version of skipRValueSubobjectAdjustments
when possible.
2024-01-30 11:25:28 +01:00
Jonas Paulsson
34dd8ec8ae
[clang, SystemZ] Support -munaligned-symbols (#73511)
When this option is passed to clang, external (and/or weak) symbols
are not assumed to have the minimum ABI alignment normally required.
Symbols defined locally that are not weak are however still given the
minimum alignment.

This is implemented by passing a new parameter to getMinGlobalAlign()
named HasNonWeakDef that is used to return the right alignment value.

This is needed when external symbols created from a linker script may
not get the ABI minimum alignment and must therefore be treated as
unaligned by the compiler.
2024-01-27 18:29:37 +01:00
cor3ntin
ad1a65fcac
[Clang][C++26] Implement Pack Indexing (P2662R3). (#72644)
Implements https://isocpp.org/files/papers/P2662R3.pdf

The feature is exposed as an extension in older language modes.
Mangling is not yet supported and that is something we will have to do before release.
2024-01-27 10:23:38 +01:00
NAKAMURA Takumi
faef68bca8 Revert "[Coverage] Map regions from system headers (#76950)"
See #78920.

This reverts commit ce3e767ac5ea1a1d1a166e88c152e2125ec7662b.
2024-01-27 15:11:37 +09:00
Fangrui Song
36b4a9ccd9
[Driver,CodeGen] Support -mtls-dialect= (#79256)
GCC supports -mtls-dialect= for several architectures to select TLSDESC.
This patch supports the following values

* x86: "gnu". "gnu2" (TLSDESC) is not supported yet.
* RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915)

AArch64 toolchains seem to support TLSDESC from the beginning, and the
general dynamic model has poor support. Nobody seems to use the option
-mtls-dialect= at all, so we don't bother with it.
There also seems very little interest in AArch32's TLSDESC support.

TLSDESC does not change IR, but affects object file generation. Without
a backend option the option is a no-op for in-process ThinLTO.

There seems no motivation to have fine-grained control mixing trad/desc
for TLS, so we just pass -mllvm, and don't bother with a modules flag
metadata or function attribute.

Co-authored-by: Paul Kirth <paulkirth@google.com>
2024-01-26 09:25:38 -08:00
Nemanja Ivanovic
67c1c1dbb6
[PowerPC][X86] Make cpu id builtins target independent and lower for PPC (#68919)
Make __builtin_cpu_{init|supports|is} target independent and provide an
opt-in query for targets that want to support it. Each target is still
responsible for their specific lowering/code-gen. Also provide code-gen
for PowerPC.

I originally proposed this in https://reviews.llvm.org/D152914 and this
addresses the comments I received there.

---------

Co-authored-by: Nemanja Ivanovic <nemanjaivanovic@nemanjas-air.kpn>
Co-authored-by: Nemanja Ivanovic <nemanja@synopsys.com>
2024-01-26 11:24:50 -05:00
Craig Topper
c92ad411f2 Recommit "[RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551)"
Test updated to expect i8 gep.

Original message:

This adopts a similar behavior to AArch64 SVE, where bool vectors are
represented as a vector of chars with 1/8 the number of elements. This
ensures the vector always occupies a power of 2 number of bytes.

A consequence of this is that vbool64_t, vbool32_t, and vool16_t can
only be used with a vector length that guarantees at least 8 bits.
2024-01-25 10:20:29 -08:00
Craig Topper
51b25bad5e Revert "[RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551)"
This reverts commit b0511419b3fd71fa8f8c3618b7e849aabd2ccf65.

Test failure was reported.
2024-01-25 09:38:11 -08:00
Craig Topper
b0511419b3
[RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551)
This adopts a similar behavior to AArch64 SVE, where bool vectors are
represented as a vector of chars with 1/8 the number of elements. This
ensures the vector always occupies a power of 2 number of bytes.

A consequence of this is that vbool64_t, vbool32_t, and vool16_t can
only be used with a vector length that guarantees at least 8 bits.
2024-01-25 09:14:52 -08:00
Alexandre Ganea
419d6ea135 [clang] Silence warning when compiling with MSVC targetting x86
This fixes:
```
[3963/6996] Building CXX object tools\clang\lib\CodeGen\CMakeFiles\obj.clangCodeGen.dir\CGExpr.cpp.obj
C:\git\llvm-project\clang\lib\CodeGen\CGExpr.cpp(3808): warning C4018: '<=': signed/unsigned mismatch
```
2024-01-25 09:34:17 -05:00
Vojislav Tomasevic
2a77d92e2e
[clang] Incorrect IR involving the use of bcopy (#79298)
This patch addresses the issue regarding the call of bcopy function in a
conditional expression.
It is analogous to the already accepted patch which deals with the same
problem, just regarding the bzero function [0].

Here is the testcase which illustrates the issue:

```
void bcopy(const void *, void *, unsigned long);
void foo(void);

void test_bcopy() {
  char dst[20];
  char src[20];
  int _sz = 20, len = 20;
  return (_sz
          ? ((_sz >= len)
             ? bcopy(src, dst, len)
             : foo())
          : bcopy(src, dst, len));
}
```

When processing it with clang, following issue occurs:

Instruction does not dominate all uses!
%arraydecay2 = getelementptr inbounds [20 x i8], ptr %dst, i64 0, i64 0,
!dbg !38
%cond = phi ptr [ %arraydecay2, %cond.end ], [ %arraydecay5,
%cond.false3 ], !dbg !33
fatal error: error in backend: Broken module found, compilation aborted!

This happens because an incorrect phi node is created. It is created
because bcopy function call is lowered to the call of llvm.memmove
intrinsic and function memmove returns void *. Since llvm.memmove is
called in two places in the same return statement, clang creates a phi
node in the final basic block for the return value and that phi node is
incorrect. However, bcopy function should return void in the first
place, so this phi node is unnecessary. This is what this patch
addresses. An appropriate test is also added and no existing tests fail
when applying this patch.

Also, this crash only happens when LLVM is configured with
-DLLVM_ENABLE_ASSERTIONS=On option.

[0] https://reviews.llvm.org/D39746
2024-01-24 09:39:36 -08:00
Mirko Brkušanin
7fdf608cef
[AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795)
Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com>
Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>
2024-01-24 13:43:07 +01:00
Paul Kirth
9d476e1e1a
[clang][FatLTO] Avoid UnifiedLTO until it can support WPD/CFI (#79061)
Currently, the UnifiedLTO pipeline seems to have trouble with several
LTO features, like SplitLTO units, which means we cannot use important
optimizations like Whole Program Devirtualization or security hardening
instrumentation like CFI.

This patch reverts FatLTO to using distinct pipelines for Full LTO and
ThinLTO. It still avoids module cloning, since that was error prone.
2024-01-23 14:04:52 -08:00
Sander de Smalen
1652d44d8d
[Clang] Amend SME attributes with support for ZT0. (#77941)
This patch builds on top of #76971 and implements support for:
* __arm_new("zt0")
* __arm_in("zt0")
* __arm_out("zt0")
* __arm_inout("zt0")
* __arm_preserves("zt0")
2024-01-23 12:35:16 +01:00
ManuelvOK
ce3e767ac5
[Coverage] Map regions from system headers (#76950)
In 2155195131a57f2f01e7cfabb85bb027518c2dc6, the
"system-headers-coverage" option has been added but not used in all
necessary places.

Potential reviewers: @gulfemsavrun @petrhosek

Co-authored-by: Manuel Kalettka <manuel.kalettka@kernkonzept.com>
2024-01-22 21:41:49 -08:00
Eli Friedman
a6065f0fa5
Arm64EC entry/exit thunks, consolidated. (#79067)
This combines the previously posted patches with some additional work
I've done to more closely match MSVC output.

Most of the important logic here is implemented in
AArch64Arm64ECCallLowering. The purpose of the
AArch64Arm64ECCallLowering is to take "normal" IR we'd generate for
other targets, and generate most of the Arm64EC-specific bits:
generating thunks, mangling symbols, generating aliases, and generating
the .hybmp$x table. This is all done late for a few reasons: to
consolidate the logic as much as possible, and to ensure the IR exposed
to optimization passes doesn't contain complex arm64ec-specific
constructs.

The other changes are supporting changes, to handle the new constructs
generated by that pass.

There's a global llvm.arm64ec.symbolmap representing the .hybmp$x
entries for the thunks. This gets handled directly by the AsmPrinter
because it needs symbol indexes that aren't available before that.

There are two new calling conventions used to represent calls to and
from thunks: ARM64EC_Thunk_X64 and ARM64EC_Thunk_Native. There are a few
changes to handle the associated exception-handling info,
SEH_SaveAnyRegQP and SEH_SaveAnyRegQPX.

I've intentionally left out handling for structs with small
non-power-of-two sizes, because that's easily separated out. The rest of
my current work is here. I squashed my current patches because they were
split in ways that didn't really make sense. Maybe I could split out
some bits, but it's hard to meaningfully test most of the parts
independently.

Thanks to @dpaoliello for extensive testing and suggestions.

(Originally posted as https://reviews.llvm.org/D157547 .)
2024-01-22 21:28:07 -08:00
Alan Phipps
424b9cf41a
[Coverage][clang] Ensure bitmap for ternary condition is updated before visiting children (#78814)
This is a fix for MC/DC issue https://github.com/llvm/llvm-project/issues/78453 in which a ConditionalOperator that evaluates a complex condition was incorrectly updating its global bitmap after visiting its LHS and RHS children. This
was wrong because if the LHS or RHS also evaluate a complex condition, the MCDC temporary bitmap value will get corrupted. The fix is to ensure that the bitmap is updated prior to visiting the LHS and RHS.
2024-01-22 16:33:20 -06:00
Jan Patrick Lehr
fa4780fa6c
[OpenMP][USM] Introduces -fopenmp-force-usm flag (#76571)
This flag forces the compiler to generate code for OpenMP target regions
as if the user specified the #pragma omp requires unified_shared_memory
in each source file.

The option does not have a -fno-* friend since OpenMP requires the
unified_shared_memory clause to be present in all source files. Since
this flag does no harm if the clause is present, it can be used in
conjunction. My understanding is that USM should not be turned off
selectively, hence, no -fno- version.

This adds a basic test to check the correct generation of double
indirect access to declare target globals in USM mode vs non-USM mode.
Which I think is the only difference observable in code generation.

This runtime test checks for the (non-)occurence of data movement between host
and device. It does one run without the flag and one with the flag to
also see that both versions behave as expected. In the case w/o the new
flag data movement between host and device is expected. In the case with
the flag such data movement should not be present / reported.
2024-01-22 21:59:26 +01:00
Zahira Ammarguellat
364a5b5b85
Fix a bug in implementation of Smith's algorithm used in complex div. (#78330)
This patch fixes a bug in Smith's algorithm (thanks to @andykaylor who
detected it) and makes sure that last option in command line rules.
2024-01-22 15:50:24 -05:00
Dani
1be0d9d7d8
[AArch64][Clang] Fix linker error for function multiversioning (#74358)
AArch64 part of https://github.com/llvm/llvm-project/pull/71706.

Default version is now mangled with .default.
Resolver for the TargetVersion need to be emitted from the
CodeGenModule::EmitMultiVersionFunctionDefinition.
2024-01-22 19:55:16 +01:00
Matthew Devereau
6ba62f4f25
[AArch64][SME2] Refine fcvtu/fcvts/scvtf/ucvtf (#77947)
Rename intrinsics for fcvtu to fcvtzu and fcvts to fcvtzs.

Use llvm_anyvector_ty for both multi vector returns and operands,
therefore the return and operands can be specified in the intrinsic
call, e.g.

@llvm.aarch64.sve.scvtf.x4.nxv4f32.nxv4i32
2024-01-22 15:11:49 +00:00
Hana Dusíková
865e4a1f33
[coverage] skipping code coverage for 'if constexpr' and 'if consteval' (#78033)
`if constexpr` and `if consteval` conditional statements code coverage
should behave more like a preprocesor `#if`-s than normal
ConditionalStmt. This PR should fix that.

---------

Co-authored-by: cor3ntin <corentinjabot@gmail.com>
2024-01-22 12:50:20 +01:00
David Chisnall
f36845d0c6
Enable direct methods and fast alloc calls for libobjc2. (#78030)
These will be supported in the upcoming 2.2 release and so are gated on
that version.

Direct methods call `objc_send_initialize` if they are class methods
that may not have called initialize. This is guarded by checking for the
class flag bit that is set on initialisation in the class. This bit now
forms part of the ABI, but it's been stable for 30+ years so that's fine
as a contract going forwards.
2024-01-22 08:38:41 +00:00
Andrey Ali Khan Bolshakov
5518a9d767
[c++20] P1907R1: Support for generalized non-type template arguments of scalar type. (#78041)
Previously committed as 9e08e51a20d0d2b1c5724bb17e969d036fced4cd, and
reverted because a dependency commit was reverted, then committed again
as 4b574008aef5a7235c1f894ab065fe300d26e786 and reverted again because
"dependency commit" 5a391d38ac6c561ba908334d427f26124ed9132e was
reverted. But it doesn't seem that 5a391d38ac6c was a real dependency
for this.

This commit incorporates 4b574008aef5a7235c1f894ab065fe300d26e786 and
18e093faf726d15f210ab4917142beec51848258 by Richard Smith (@zygoloid),
with some minor fixes, most notably:

- `UncommonValue` renamed to `StructuralValue`

- `VK_PRValue` instead of `VK_RValue` as default kind in lvalue and
member pointer handling branch in
`BuildExpressionFromNonTypeTemplateArgumentValue`;

- handling of `StructuralValue` in `IsTypeDeclaredInsideVisitor`;

- filling in `SugaredConverted` along with `CanonicalConverted`
parameter in `Sema::CheckTemplateArgument`;

- minor cleanup in
`TemplateInstantiator::transformNonTypeTemplateParmRef`;

- `TemplateArgument` constructors refactored;

- `ODRHash` calculation for `UncommonValue`;

- USR generation for `UncommonValue`;

- more correct MS compatibility mangling algorithm (tested on MSVC ver.
19.35; toolset ver. 143);

- IR emitting fixed on using a subobject as a template argument when the
corresponding template parameter is used in an lvalue context;

- `noundef` attribute and opaque pointers in `template-arguments` test;

- analysis for C++17 mode is turned off for templates in
`warn-bool-conversion` test; in C++17 and C++20 mode, array reference
used as a template argument of pointer type produces template argument
of UncommonValue type, and
`BuildExpressionFromNonTypeTemplateArgumentValue` makes
`OpaqueValueExpr` for it, and `DiagnoseAlwaysNonNullPointer` cannot see
through it; despite of "These cases should not warn" comment, I'm not
sure about correct behavior; I'd expect a suggestion to replace `if` by
`if constexpr`;

- `temp.arg.nontype/p1.cpp` and `dr18xx.cpp` tests fixed.
2024-01-21 21:28:57 +01:00