llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-16 08:16:07 +00:00

Author	SHA1	Message	Date
Vlad Serebrennikov	eaff01f4fc	[clang][NFC] Annotate `CGExprCXX.cpp` with `preferred_type` This helps debuggers to display values in bit-fields in a more helpful way.	2024-02-11 15:03:03 +03:00
Vlad Serebrennikov	1ed37606ca	[clang][NFC] Annotate `CGCleanup.h` with `preferred_type` This helps debuggers to display values in bit-fields in a more helpful way.	2024-02-11 12:20:34 +03:00
Vlad Serebrennikov	866e073c28	[clang][NFC] Annotate `CGRecordLayout.h` with `preferred_type` This helps debuggers to display values in bit-fields in a more helpful way.	2024-02-11 12:14:31 +03:00
Vlad Serebrennikov	35737beaef	[clang][NFC] Annotate `CodeGenFunction.h` with `preferred_type` This helps debuggers to display values in bit-fields in a more helpful way.	2024-02-11 12:11:49 +03:00
Vlad Serebrennikov	fd80304763	[clang][NFC] Annotate `CGCUDARuntime.h` with `preferred_type` This helps debuggers to display values in bit-fields in a more helpful way.	2024-02-11 12:07:27 +03:00
Vlad Serebrennikov	ba0d35181c	[clang][NFC] Annotate `CGCall.h` with `preferred_type` This helps debuggers to display values in bit-fields in a more helpful way.	2024-02-11 12:04:55 +03:00
Jon Roelofs	99d743320c	[clang][fmv] Drop .ifunc from target_version's entrypoint's mangling (#81194 ) Fixes: https://github.com/llvm/llvm-project/issues/81043	2024-02-09 08:13:15 -08:00
Jan Svoboda	da95d926f6	[clang][lex] Always pass suggested module to `InclusionDirective()` callback (#81061 ) This patch provides more information to the `PPCallbacks::InclusionDirective()` hook. We now always pass the suggested module, regardless of whether it was actually imported or not. The extra `bool ModuleImported` parameter then denotes whether the header `#include` will be automatically translated into an import of the module. The main change is in `clang/lib/Lex/PPDirectives.cpp`, where we take care to not modify `SuggestedModule` after it's been populated by `LookupHeaderIncludeOrImport()`. We now exclusively use the `SM` (`ModuleToImport`) variable instead, which has been equivalent to `SuggestedModule` until now. This allows us to use the original non-modified `SuggestedModule` for the callback itself. (This patch turns out to be necessary for https://github.com/apple/llvm-project/pull/8011).	2024-02-08 10:19:18 -08:00
Cooper Partin	16d1a6486c	[DirectX] Fix HLSL bitshifts to leverage the OpenCL pipeline for bitshifting (#81030 ) Fixes #55106 In HLSL bit shifts are defined to shift by shift size % type size. This contains the following changes: HLSL codegen bit shifts will be emitted as x << (y & (sizeof(x) - 1)) and bitshift masking leverages the OpenCL pipeline for this. Tests were also added to validate this behavior. Before this change the following was being emitted: ; Function Attrs: noinline nounwind optnone define noundef i32 @"?shl32@@YAHHH@Z"(i32 noundef %V, i32 noundef %S) #0 { entry: %S.addr = alloca i32, align 4 %V.addr = alloca i32, align 4 store i32 %S, ptr %S.addr, align 4 store i32 %V, ptr %V.addr, align 4 %0 = load i32, ptr %V.addr, align 4 %1 = load i32, ptr %S.addr, align 4 %shl = shl i32 %0, %1 ret i32 %shl } After this change: ; Function Attrs: noinline nounwind optnone define noundef i32 @"?shl32@@YAHHH@Z"(i32 noundef %V, i32 noundef %S) #0 { entry: %S.addr = alloca i32, align 4 %V.addr = alloca i32, align 4 store i32 %S, ptr %S.addr, align 4 store i32 %V, ptr %V.addr, align 4 %0 = load i32, ptr %V.addr, align 4 %1 = load i32, ptr %S.addr, align 4 %shl.mask = and i32 %1, 31 %shl = shl i32 %0, %shl.mask ret i32 %shl } --------- Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>	2024-02-08 11:50:21 -06:00
Shilei Tian	c4b0dfcc99	[Clang] Fix a non-effective assertion (#81083 ) `PTy` here is literally `FTy->getParamType(i)`, which makes this assertion not work as expected.	2024-02-08 09:44:42 -05:00
Adam Magier	5f87957fef	[clang][CodeGen][UBSan] Fixing shift-exponent generation for _BitInt (#80515 ) Testing the shift-exponent check with small width _BitInt values exposed a bug in ScalarExprEmitter::GetWidthMinusOneValue when using the result to determine valid exponent sizes. False positives were reported for some left shifts when width(LHS)-1 > range(RHS) and false negatives were reported for right shifts when value(RHS) > range(LHS). This patch caps the maximum value of GetWidthMinusOneValue to fit within range(RHS) to fix the issue with left shifts and fixes a code generation in EmitShr to fix the issue with right shifts and renames the function to GetMaximumShiftAmount to better reflect the new behaviour. Fixes #80135. Co-authored-by: Adam Magier <adam.magier@ericsson.com>	2024-02-06 13:16:55 -06:00
weiguozhi	c166a43c6e	New calling convention preserve_none (#76868 ) The new experimental calling convention preserve_none is the opposite of the existing preserve_all. It tries to preserve as few general registers as possible, so all general registers are caller-saved registers. It can also use more general registers to pass arguments. This attribute doesn't impact floating-point registers. Floating-point registers still follow the C calling convention. Currently preserve_none is supported on X86-64 only. It changes the C calling convention in the following ways: * RSP and RBP are the only preserved general registers, all other general registers are caller-saved registers. * We can use [RDI, RSI, RDX, RCX, R8, R9, R11, R12, R13, R14, R15, RAX] to pass arguments. It can improve the performance of hot tail-call chains, because many callee-saved registers' save/restore instructions can be removed if the tail functions are using preserve_none. In my experiment in protocol buffer, the parsing functions are improved by 3% to 10%.	2024-02-05 13:28:43 -08:00
Mészáros Gergely	5942868a21	[clang][AMDGPU][CUDA] Handle __builtin_printf for device printf (#68515 ) Previously `__builtin_printf` would result in emitting a call to `printf`, even though directly calling `printf` was translated. Ref: #68478	2024-02-05 23:23:13 +05:30
Dani	6e3e8856d4	[NFC][Clang] Replace Arch with Triplet. (#80465 )	2024-02-05 08:25:30 +01:00
Pierre van Houtryve	500846d2f5	[AMDGPU] Introduce Code Object V6 (#76954 ) Introduce Code Object V6 in Clang, LLD, Flang and LLVM. This is the same as V5 except a new "generic version" flag can be present in EFLAGS. This is related to new generic targets that'll be added in a follow-up patch. It's also likely V6 will have new changes (possibly new metadata entries) added later. Docs changes are part of the follow-up patch #76955	2024-02-05 08:19:53 +01:00
Yingwei Zheng	a3d8b78333	[Clang][CodeGen] Mark `__dynamic_cast` as `willreturn` (#80409 ) According to the C++ standard, `dynamic_cast` of pointers either returns a pointer (7.6.1.7) or results in undefined behavior (11.9.5). This patch marks `__dynamic_cast` as `willreturn` to remove unused calls. Fixes #77606.	2024-02-04 11:31:50 +08:00
Brandon Wu	f5154b9c98	[clang][RISCV] Enable struct of homogeneous scalable vector as function argument (#78550 ) LLVM IR supports structs as function arguments, so RISC-V tuple types can just use a struct of homogeneous scalable vectors instead of flattening them.	2024-02-03 17:57:15 +08:00
ManuelvOK	c07fcd45f1	[Coverage] Map regions from system headers (#76950 ) In 2155195131a57f2f01e7cfabb85bb027518c2dc6, the "system-headers-coverage" option has been added but not used in all necessary places. This is the recommit since it has been reverted in faef68bca852d08511ea0311d8a0d221cb202e73 Potential reviewers: @gulfemsavrun @petrhosek Co-authored-by: Manuel Kalettka <manuel.kalettka@kernkonzept.com>	2024-02-02 18:04:24 +09:00
Rahman Lavaee	acec6419e8	[SHT_LLVM_BB_ADDR_MAP] Allow basic-block-sections and labels be used together by decoupling the handling of the two features. (#74128 ) Today `-split-machine-functions` and `-fbasic-block-sections={all,list}` cannot be combined with `-basic-block-sections=labels` (the labels option will be ignored). The inconsistency comes from the way basic block address map -- the underlying mechanism for basic block labels -- encodes basic block addresses (https://lists.llvm.org/pipermail/llvm-dev/2020-July/143512.html). Specifically, basic block offsets are computed relative to the function begin symbol. This relies on functions being contiguous which is not the case for MFS and basic block section binaries. This means Propeller cannot use binary profiles collected from these binaries, which limits the applicability of Propeller for iterative optimization. To make the `SHT_LLVM_BB_ADDR_MAP` feature work with basic block section binaries, we propose modifying the encoding of this section as follows. First let us review the current encoding which emits the address of each function and its number of basic blocks, followed by basic block entries for each basic block. \| \| \| \|--\|--\| \| Address of the function \| Function Address \| \| Number of basic blocks in this function \| NumBlocks \| \| BB entry 1 \| BB entry 2 \| ... \| BB entry #NumBlocks To make this work for basic block sections, we treat each basic block section similar to a function, except that basic block sections of the same function must be encapsulated in the same structure so we can map all of them to their single function. We modify the encoding to first emit the number of basic block sections (BB ranges) in the function. Then we emit the address map of each basic block section section as before: the base address of the section, its number of blocks, and BB entries for its basic block. The first section in the BB address map is always the function entry section. 
\| \| \| \|--\|--\| \| Number of sections for this function \| NumBBRanges \| \| Section 1 begin address \| BaseAddress[1] \| \| Number of basic blocks in section 1 \| NumBlocks[1] \| \| BB entries for Section 1 \|..................\| \| Section #NumBBRanges begin address \| BaseAddress[NumBBRanges] \| \| Number of basic blocks in section #NumBBRanges \| NumBlocks[NumBBRanges] \| \| BB entries for Section #NumBBRanges The encoding of basic block entries remains as before with the minor change that each basic block offset is now computed relative to the begin symbol of its containing BB section. This patch adds a new boolean codegen option `-basic-block-address-map`. Correspondingly, the front-end flag `-fbasic-block-address-map` and LLD flag `--lto-basic-block-address-map` are introduced. Analogously, we add a new TargetOption field `BBAddrMap`. This means BB address maps are either generated for all functions in the compiling unit, or for none (depending on `TargetOptions::BBAddrMap`). This patch keeps the functionality of the old `-fbasic-block-sections=labels` option but does not remove it. A subsequent patch will remove the obsolete option. We refactor the `BasicBlockSections` pass by separating the BB address map and BB sections handing to their own functions (named `handleBBAddrMap` and `handleBBSections`). `handleBBSections` renumbers basic blocks and places them in their assigned sections. `handleBBAddrMap` is invoked after `handleBBSections` (if requested) and only renumbers the blocks. - New tests added: - Two tests basic-block-address-map-with-basic-block-sections.ll and basic-block-address-map-with-mfs.ll to exercise the combination of `-basic-block-address-map` with `-basic-block-sections=list` and '-split-machine-functions`. - A driver sanity test for the `-fbasic-block-address-map` option (basic-block-address-map.c). - An LLD test for testing the `--lto-basic-block-address-map` option. 
This reuses the LLVM IR from `lld/test/ELF/lto/basic-block-sections.ll`. - Renamed and modified the two existing codegen tests for basic block address map (`basic-block-sections-labels-functions-sections.ll` and `basic-block-sections-labels.ll`) - Removed `SHT_LLVM_BB_ADDR_MAP_V0` tests. Full deprecation of `SHT_LLVM_BB_ADDR_MAP_V0` and `SHT_LLVM_BB_ADDR_MAP` version less than 2 will happen in a separate PR in a few months.	2024-02-01 17:50:46 -08:00
Hana Dusíková	bfc6eaa263	[coverage] fix crash in code coverage and `if constexpr` with `ExprWithCleanups` (#80292 ) Fixes https://github.com/llvm/llvm-project/issues/80285	2024-02-01 23:31:32 +01:00
Sander de Smalen	d313614b60	[AArch64] Replace LLVM IR function attributes for PSTATE.ZA. (#79166 ) Since https://github.com/ARM-software/acle/pull/276 the ACLE defines attributes to better describe the use of a given SME state. Previously the attributes merely described the possibility of it being 'shared' or 'preserved', whereas the new attributes have more semantics and also describe how the data flows through the program. For ZT0 we already had to add new LLVM IR attributes: * aarch64_new_zt0 * aarch64_in_zt0 * aarch64_out_zt0 * aarch64_inout_zt0 * aarch64_preserves_zt0 We have now done the same for ZA, such that we add: * aarch64_new_za (previously `aarch64_pstate_za_new`) * aarch64_in_za (more specific variation of `aarch64_pstate_za_shared`) * aarch64_out_za (more specific variation of `aarch64_pstate_za_shared`) * aarch64_inout_za (more specific variation of `aarch64_pstate_za_shared`) * aarch64_preserves_za (previously `aarch64_pstate_za_shared, aarch64_pstate_za_preserved`) This explicitly removes 'pstate' from the name, because with SME2 and the new ACLE attributes there is a difference between "sharing ZA" (sharing the ZA matrix register with the caller) and "sharing PSTATE.ZA" (sharing either the ZA or ZT0 register, both part of PSTATE.ZA with the caller).	2024-02-01 13:37:37 +00:00
Kazu Hirata	b67ce7e349	[clang] Use StringRef::starts_with (NFC)	2024-01-31 23:54:09 -08:00
SunilKuravinakop	a74e9ce5dc	[OpenMP] atomic compare weak : Parser & AST support (#79475 ) This adds support for "#pragma omp atomic compare weak". It has Parser & AST support for now. --------- Authored-by: Sunil Kuravinakop <kuravina@pe28vega.us.cray.com>	2024-01-31 06:32:06 -05:00
Tianlan Zhou	ee01a2c399	[clang] static operators should evaluate object argument (reland) (#80108 ) This re-applies 30155fc0 with a fix for clangd. ### Description clang don't evaluate the object argument of `static operator()` and `static operator[]` currently, for example: ```cpp #include <iostream> struct Foo { static int operator()(int x, int y) { std::cout << "Foo::operator()" << std::endl; return x + y; } static int operator[](int x, int y) { std::cout << "Foo::operator[]" << std::endl; return x + y; } }; Foo getFoo() { std::cout << "getFoo()" << std::endl; return {}; } int main() { std::cout << getFoo()(1, 2) << std::endl; std::cout << getFoo()[1, 2] << std::endl; } ``` `getFoo()` is expected to be called, but clang don't call it currently (17.0.6). This PR fixes this issue. Fixes #67976, reland #68485. ### Walkthrough - clang/lib/Sema/SemaOverload.cpp - `Sema::CreateOverloadedArraySubscriptExpr` & `Sema::BuildCallToObjectOfClassType` Previously clang generate `CallExpr` for static operators, ignoring the object argument. In this PR `CXXOperatorCallExpr` is generated for static operators instead, with the object argument as the first argument. - `TryObjectArgumentInitialization` `const` / `volatile` objects are allowed for static methods, so that we can call static operators on them. - clang/lib/CodeGen/CGExpr.cpp - `CodeGenFunction::EmitCall` CodeGen changes for `CXXOperatorCallExpr` with static operators: emit and ignore the object argument first, then emit the operator call. - clang/lib/AST/ExprConstant.cpp - `‎ExprEvaluatorBase::handleCallExpr‎` Evaluation of static operators in constexpr also need some small changes to work, so that the arguments won't be out of position. - clang/lib/Sema/SemaChecking.cpp - `Sema::CheckFunctionCall` Code for argument checking also need to be modify, or it will fail the test `clang/test/SemaCXX/overloaded-operator-decl.cpp`. 
- clang-tools-extra/clangd/InlayHints.cpp - `InlayHintVisitor::VisitCallExpr` Now that the `CXXOperatorCallExpr` for static operators also have object argument, we should also take care of this situation in clangd. ### Tests - Added: - clang/test/AST/ast-dump-static-operators.cpp Verify the AST generated for static operators. - clang/test/SemaCXX/cxx2b-static-operator.cpp Static operators should be able to be called on const / volatile objects. - Modified: - clang/test/CodeGenCXX/cxx2b-static-call-operator.cpp - clang/test/CodeGenCXX/cxx2b-static-subscript-operator.cpp Matching the new CodeGen. ### Documentation - clang/docs/ReleaseNotes.rst Update release notes. --------- Co-authored-by: Shafik Yaghmour <shafik@users.noreply.github.com> Co-authored-by: cor3ntin <corentinjabot@gmail.com> Co-authored-by: Aaron Ballman <aaron@aaronballman.com>	2024-01-31 15:27:06 +08:00
Aaron Ballman	201eb2b577	Revert "[clang] static operators should evaluate object argument (#68485 )" This reverts commit 30155fc0ef4fbdce2d79434aaae8d58b2fabb20a. It seems to have broken some tests in clangd: http://45.33.8.238/linux/129484/step_9.txt	2024-01-30 13:38:18 -05:00
Tianlan Zhou	30155fc0ef	[clang] static operators should evaluate object argument (#68485 ) ### Description clang don't evaluate the object argument of `static operator()` and `static operator[]` currently, for example: ```cpp #include <iostream> struct Foo { static int operator()(int x, int y) { std::cout << "Foo::operator()" << std::endl; return x + y; } static int operator[](int x, int y) { std::cout << "Foo::operator[]" << std::endl; return x + y; } }; Foo getFoo() { std::cout << "getFoo()" << std::endl; return {}; } int main() { std::cout << getFoo()(1, 2) << std::endl; std::cout << getFoo()[1, 2] << std::endl; } ``` `getFoo()` is expected to be called, but clang don't call it currently (17.0.2). This PR fixes this issue. Fixes #67976. ### Walkthrough - clang/lib/Sema/SemaOverload.cpp - `Sema::CreateOverloadedArraySubscriptExpr` & `Sema::BuildCallToObjectOfClassType` Previously clang generate `CallExpr` for static operators, ignoring the object argument. In this PR `CXXOperatorCallExpr` is generated for static operators instead, with the object argument as the first argument. - `TryObjectArgumentInitialization` `const` / `volatile` objects are allowed for static methods, so that we can call static operators on them. - clang/lib/CodeGen/CGExpr.cpp - `CodeGenFunction::EmitCall` CodeGen changes for `CXXOperatorCallExpr` with static operators: emit and ignore the object argument first, then emit the operator call. - clang/lib/AST/ExprConstant.cpp - `‎ExprEvaluatorBase::handleCallExpr‎` Evaluation of static operators in constexpr also need some small changes to work, so that the arguments won't be out of position. - clang/lib/Sema/SemaChecking.cpp - `Sema::CheckFunctionCall` Code for argument checking also need to be modify, or it will fail the test `clang/test/SemaCXX/overloaded-operator-decl.cpp`. ### Tests - Added: - clang/test/AST/ast-dump-static-operators.cpp Verify the AST generated for static operators. 
- clang/test/SemaCXX/cxx2b-static-operator.cpp Static operators should be able to be called on const / volatile objects. - Modified: - clang/test/CodeGenCXX/cxx2b-static-call-operator.cpp - clang/test/CodeGenCXX/cxx2b-static-subscript-operator.cpp Matching the new CodeGen. ### Documentation - clang/docs/ReleaseNotes.rst Update release notes. --------- Co-authored-by: Shafik Yaghmour <shafik@users.noreply.github.com> Co-authored-by: cor3ntin <corentinjabot@gmail.com> Co-authored-by: Aaron Ballman <aaron@aaronballman.com>	2024-01-30 13:09:05 -05:00
Timm Bäder	c61686e8ab	[clang][NFC] Use no-param version of skipRValueSubobjectAdjustments when possible.	2024-01-30 11:25:28 +01:00
Jonas Paulsson	34dd8ec8ae	[clang, SystemZ] Support -munaligned-symbols (#73511 ) When this option is passed to clang, external (and/or weak) symbols are not assumed to have the minimum ABI alignment normally required. Symbols defined locally that are not weak are however still given the minimum alignment. This is implemented by passing a new parameter to getMinGlobalAlign() named HasNonWeakDef that is used to return the right alignment value. This is needed when external symbols created from a linker script may not get the ABI minimum alignment and must therefore be treated as unaligned by the compiler.	2024-01-27 18:29:37 +01:00
cor3ntin	ad1a65fcac	[Clang][C++26] Implement Pack Indexing (P2662R3). (#72644 ) Implements https://isocpp.org/files/papers/P2662R3.pdf The feature is exposed as an extension in older language modes. Mangling is not yet supported and that is something we will have to do before release.	2024-01-27 10:23:38 +01:00
NAKAMURA Takumi	faef68bca8	Revert "[Coverage] Map regions from system headers (#76950 )" See #78920. This reverts commit ce3e767ac5ea1a1d1a166e88c152e2125ec7662b.	2024-01-27 15:11:37 +09:00
Fangrui Song	36b4a9ccd9	[Driver,CodeGen] Support -mtls-dialect= (#79256 ) GCC supports -mtls-dialect= for several architectures to select TLSDESC. This patch supports the following values * x86: "gnu". "gnu2" (TLSDESC) is not supported yet. * RISC-V: "trad" (general dynamic), "desc" (TLSDESC, see #66915) AArch64 toolchains seem to support TLSDESC from the beginning, and the general dynamic model has poor support. Nobody seems to use the option -mtls-dialect= at all, so we don't bother with it. There also seems very little interest in AArch32's TLSDESC support. TLSDESC does not change IR, but affects object file generation. Without a backend option the option is a no-op for in-process ThinLTO. There seems no motivation to have fine-grained control mixing trad/desc for TLS, so we just pass -mllvm, and don't bother with a modules flag metadata or function attribute. Co-authored-by: Paul Kirth <paulkirth@google.com>	2024-01-26 09:25:38 -08:00
Nemanja Ivanovic	67c1c1dbb6	[PowerPC][X86] Make cpu id builtins target independent and lower for PPC (#68919 ) Make __builtin_cpu_{init\|supports\|is} target independent and provide an opt-in query for targets that want to support it. Each target is still responsible for their specific lowering/code-gen. Also provide code-gen for PowerPC. I originally proposed this in https://reviews.llvm.org/D152914 and this addresses the comments I received there. --------- Co-authored-by: Nemanja Ivanovic <nemanjaivanovic@nemanjas-air.kpn> Co-authored-by: Nemanja Ivanovic <nemanja@synopsys.com>	2024-01-26 11:24:50 -05:00
Craig Topper	c92ad411f2	Recommit "[RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551 )" Test updated to expect i8 gep. Original message: This adopts a similar behavior to AArch64 SVE, where bool vectors are represented as a vector of chars with 1/8 the number of elements. This ensures the vector always occupies a power of 2 number of bytes. A consequence of this is that vbool64_t, vbool32_t, and vbool16_t can only be used with a vector length that guarantees at least 8 bits.	2024-01-25 10:20:29 -08:00
Craig Topper	51b25bad5e	Revert "[RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551 )" This reverts commit b0511419b3fd71fa8f8c3618b7e849aabd2ccf65. Test failure was reported.	2024-01-25 09:38:11 -08:00
Craig Topper	b0511419b3	[RISCV] Support __riscv_v_fixed_vlen for vbool types. (#76551 ) This adopts a similar behavior to AArch64 SVE, where bool vectors are represented as a vector of chars with 1/8 the number of elements. This ensures the vector always occupies a power of 2 number of bytes. A consequence of this is that vbool64_t, vbool32_t, and vbool16_t can only be used with a vector length that guarantees at least 8 bits.	2024-01-25 09:14:52 -08:00
Alexandre Ganea	419d6ea135	[clang] Silence warning when compiling with MSVC targeting x86 This fixes: ``` [3963/6996] Building CXX object tools\clang\lib\CodeGen\CMakeFiles\obj.clangCodeGen.dir\CGExpr.cpp.obj C:\git\llvm-project\clang\lib\CodeGen\CGExpr.cpp(3808): warning C4018: '<=': signed/unsigned mismatch ```	2024-01-25 09:34:17 -05:00
Vojislav Tomasevic	2a77d92e2e	[clang] Incorrect IR involving the use of bcopy (#79298 ) This patch addresses the issue regarding the call of bcopy function in a conditional expression. It is analogous to the already accepted patch which deals with the same problem, just regarding the bzero function [0]. Here is the testcase which illustrates the issue: ``` void bcopy(const void *, void *, unsigned long); void foo(void); void test_bcopy() { char dst[20]; char src[20]; int _sz = 20, len = 20; return (_sz ? ((_sz >= len) ? bcopy(src, dst, len) : foo()) : bcopy(src, dst, len)); } ``` When processing it with clang, the following issue occurs: Instruction does not dominate all uses! %arraydecay2 = getelementptr inbounds [20 x i8], ptr %dst, i64 0, i64 0, !dbg !38 %cond = phi ptr [ %arraydecay2, %cond.end ], [ %arraydecay5, %cond.false3 ], !dbg !33 fatal error: error in backend: Broken module found, compilation aborted! This happens because an incorrect phi node is created. It is created because bcopy function call is lowered to the call of llvm.memmove intrinsic and function memmove returns void *. Since llvm.memmove is called in two places in the same return statement, clang creates a phi node in the final basic block for the return value and that phi node is incorrect. However, bcopy function should return void in the first place, so this phi node is unnecessary. This is what this patch addresses. An appropriate test is also added and no existing tests fail when applying this patch. Also, this crash only happens when LLVM is configured with -DLLVM_ENABLE_ASSERTIONS=On option. [0] https://reviews.llvm.org/D39746	2024-01-24 09:39:36 -08:00
Mirko Brkušanin	7fdf608cef	[AMDGPU] Add GFX12 WMMA and SWMMAC instructions (#77795 ) Co-authored-by: Petar Avramovic <Petar.Avramovic@amd.com> Co-authored-by: Piotr Sobczak <piotr.sobczak@amd.com>	2024-01-24 13:43:07 +01:00
Paul Kirth	9d476e1e1a	[clang][FatLTO] Avoid UnifiedLTO until it can support WPD/CFI (#79061 ) Currently, the UnifiedLTO pipeline seems to have trouble with several LTO features, like SplitLTO units, which means we cannot use important optimizations like Whole Program Devirtualization or security hardening instrumentation like CFI. This patch reverts FatLTO to using distinct pipelines for Full LTO and ThinLTO. It still avoids module cloning, since that was error prone.	2024-01-23 14:04:52 -08:00
Sander de Smalen	1652d44d8d	[Clang] Amend SME attributes with support for ZT0. (#77941 ) This patch builds on top of #76971 and implements support for: * __arm_new("zt0") * __arm_in("zt0") * __arm_out("zt0") * __arm_inout("zt0") * __arm_preserves("zt0")	2024-01-23 12:35:16 +01:00
ManuelvOK	ce3e767ac5	[Coverage] Map regions from system headers (#76950 ) In 2155195131a57f2f01e7cfabb85bb027518c2dc6, the "system-headers-coverage" option has been added but not used in all necessary places. Potential reviewers: @gulfemsavrun @petrhosek Co-authored-by: Manuel Kalettka <manuel.kalettka@kernkonzept.com>	2024-01-22 21:41:49 -08:00
Eli Friedman	a6065f0fa5	Arm64EC entry/exit thunks, consolidated. (#79067 ) This combines the previously posted patches with some additional work I've done to more closely match MSVC output. Most of the important logic here is implemented in AArch64Arm64ECCallLowering. The purpose of the AArch64Arm64ECCallLowering is to take "normal" IR we'd generate for other targets, and generate most of the Arm64EC-specific bits: generating thunks, mangling symbols, generating aliases, and generating the .hybmp$x table. This is all done late for a few reasons: to consolidate the logic as much as possible, and to ensure the IR exposed to optimization passes doesn't contain complex arm64ec-specific constructs. The other changes are supporting changes, to handle the new constructs generated by that pass. There's a global llvm.arm64ec.symbolmap representing the .hybmp$x entries for the thunks. This gets handled directly by the AsmPrinter because it needs symbol indexes that aren't available before that. There are two new calling conventions used to represent calls to and from thunks: ARM64EC_Thunk_X64 and ARM64EC_Thunk_Native. There are a few changes to handle the associated exception-handling info, SEH_SaveAnyRegQP and SEH_SaveAnyRegQPX. I've intentionally left out handling for structs with small non-power-of-two sizes, because that's easily separated out. The rest of my current work is here. I squashed my current patches because they were split in ways that didn't really make sense. Maybe I could split out some bits, but it's hard to meaningfully test most of the parts independently. Thanks to @dpaoliello for extensive testing and suggestions. (Originally posted as https://reviews.llvm.org/D157547 .)	2024-01-22 21:28:07 -08:00
Alan Phipps	424b9cf41a	[Coverage][clang] Ensure bitmap for ternary condition is updated before visiting children (#78814 ) This is a fix for MC/DC issue https://github.com/llvm/llvm-project/issues/78453 in which a ConditionalOperator that evaluates a complex condition was incorrectly updating its global bitmap after visiting its LHS and RHS children. This was wrong because if the LHS or RHS also evaluate a complex condition, the MCDC temporary bitmap value will get corrupted. The fix is to ensure that the bitmap is updated prior to visiting the LHS and RHS.	2024-01-22 16:33:20 -06:00
Jan Patrick Lehr	fa4780fa6c	[OpenMP][USM] Introduces -fopenmp-force-usm flag (#76571 ) This flag forces the compiler to generate code for OpenMP target regions as if the user specified the #pragma omp requires unified_shared_memory in each source file. The option does not have a -fno-* friend since OpenMP requires the unified_shared_memory clause to be present in all source files. Since this flag does no harm if the clause is present, it can be used in conjunction. My understanding is that USM should not be turned off selectively, hence, no -fno- version. This adds a basic test to check the correct generation of double indirect access to declare target globals in USM mode vs non-USM mode, which I think is the only difference observable in code generation. This runtime test checks for the (non-)occurrence of data movement between host and device. It does one run without the flag and one with the flag to also see that both versions behave as expected. In the case w/o the new flag data movement between host and device is expected. In the case with the flag such data movement should not be present / reported.	2024-01-22 21:59:26 +01:00
Zahira Ammarguellat	364a5b5b85	Fix a bug in implementation of Smith's algorithm used in complex div. (#78330 ) This patch fixes a bug in Smith's algorithm (thanks to @andykaylor who detected it) and makes sure that last option in command line rules.	2024-01-22 15:50:24 -05:00
Dani	1be0d9d7d8	[AArch64][Clang] Fix linker error for function multiversioning (#74358 ) AArch64 part of https://github.com/llvm/llvm-project/pull/71706. Default version is now mangled with .default. Resolver for the TargetVersion needs to be emitted from the CodeGenModule::EmitMultiVersionFunctionDefinition.	2024-01-22 19:55:16 +01:00
Matthew Devereau	6ba62f4f25	[AArch64][SME2] Refine fcvtu/fcvts/scvtf/ucvtf (#77947 ) Rename intrinsics for fcvtu to fcvtzu and fcvts to fcvtzs. Use llvm_anyvector_ty for both multi vector returns and operands, therefore the return and operands can be specified in the intrinsic call, e.g. @llvm.aarch64.sve.scvtf.x4.nxv4f32.nxv4i32	2024-01-22 15:11:49 +00:00
Hana Dusíková	865e4a1f33	[coverage] skipping code coverage for 'if constexpr' and 'if consteval' (#78033 ) Code coverage for `if constexpr` and `if consteval` conditional statements should behave more like a preprocessor `#if` than a normal ConditionalStmt. This PR should fix that. --------- Co-authored-by: cor3ntin <corentinjabot@gmail.com>	2024-01-22 12:50:20 +01:00
David Chisnall	f36845d0c6	Enable direct methods and fast alloc calls for libobjc2. (#78030 ) These will be supported in the upcoming 2.2 release and so are gated on that version. Direct methods call `objc_send_initialize` if they are class methods that may not have called initialize. This is guarded by checking for the class flag bit that is set on initialisation in the class. This bit now forms part of the ABI, but it's been stable for 30+ years so that's fine as a contract going forwards.	2024-01-22 08:38:41 +00:00
Andrey Ali Khan Bolshakov	5518a9d767	[c++20] P1907R1: Support for generalized non-type template arguments of scalar type. (#78041 ) Previously committed as 9e08e51a20d0d2b1c5724bb17e969d036fced4cd, and reverted because a dependency commit was reverted, then committed again as 4b574008aef5a7235c1f894ab065fe300d26e786 and reverted again because "dependency commit" 5a391d38ac6c561ba908334d427f26124ed9132e was reverted. But it doesn't seem that 5a391d38ac6c was a real dependency for this. This commit incorporates 4b574008aef5a7235c1f894ab065fe300d26e786 and 18e093faf726d15f210ab4917142beec51848258 by Richard Smith (@zygoloid), with some minor fixes, most notably: - `UncommonValue` renamed to `StructuralValue` - `VK_PRValue` instead of `VK_RValue` as default kind in lvalue and member pointer handling branch in `BuildExpressionFromNonTypeTemplateArgumentValue`; - handling of `StructuralValue` in `IsTypeDeclaredInsideVisitor`; - filling in `SugaredConverted` along with `CanonicalConverted` parameter in `Sema::CheckTemplateArgument`; - minor cleanup in `TemplateInstantiator::transformNonTypeTemplateParmRef`; - `TemplateArgument` constructors refactored; - `ODRHash` calculation for `UncommonValue`; - USR generation for `UncommonValue`; - more correct MS compatibility mangling algorithm (tested on MSVC ver. 19.35; toolset ver. 
143); - IR emitting fixed on using a subobject as a template argument when the corresponding template parameter is used in an lvalue context; - `noundef` attribute and opaque pointers in `template-arguments` test; - analysis for C++17 mode is turned off for templates in `warn-bool-conversion` test; in C++17 and C++20 mode, array reference used as a template argument of pointer type produces template argument of UncommonValue type, and `BuildExpressionFromNonTypeTemplateArgumentValue` makes `OpaqueValueExpr` for it, and `DiagnoseAlwaysNonNullPointer` cannot see through it; despite of "These cases should not warn" comment, I'm not sure about correct behavior; I'd expect a suggestion to replace `if` by `if constexpr`; - `temp.arg.nontype/p1.cpp` and `dr18xx.cpp` tests fixed.	2024-01-21 21:28:57 +01:00
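
Commit 16d1a6486c in the list above describes HLSL shifts as shifting by the shift amount modulo the bit width of the shifted type, which the new IR expresses as `and i32 %1, 31` before the `shl`. A minimal C++ sketch of that masking for 32-bit values is given below. The helper names `masked_shl32` and `masked_shr32` are hypothetical and are not taken from the patch; this only illustrates the arithmetic and is not the Clang codegen.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): left shift whose amount is
// reduced modulo 32 by masking with 31, mirroring `and i32 %1, 31`.
constexpr std::uint32_t masked_shl32(std::uint32_t value, std::uint32_t amount) {
    constexpr std::uint32_t kBits = 32;      // bit width of the shifted type
    return value << (amount & (kBits - 1));  // the mask keeps the amount in [0, 31]
}

// Hypothetical helper: the same masking applied to a logical right shift.
constexpr std::uint32_t masked_shr32(std::uint32_t value, std::uint32_t amount) {
    constexpr std::uint32_t kBits = 32;
    return value >> (amount & (kBits - 1));
}

// Compile-time checks: shifting by 33 behaves like shifting by 1,
// and shifting by 32 behaves like shifting by 0.
static_assert(masked_shl32(1u, 33u) == 2u, "33 mod 32 == 1");
static_assert(masked_shl32(1u, 32u) == 1u, "32 mod 32 == 0");
```

Masking with `kBits - 1` is only equivalent to a modulo because 32 is a power of two; without the mask, a shift by 32 or more would be undefined behavior in C++.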

16683 Commits
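
Commit 364a5b5b85 in the list above refers to Smith's algorithm for complex division without spelling it out. As a point of reference, here is a textbook sketch of the algorithm in C++. The function name `smith_divide` is hypothetical, and this is not the code Clang emits or the code changed by that commit; it only shows the scaling idea, which divides through by the larger-magnitude component of the divisor to reduce the risk of intermediate overflow.

```cpp
#include <cmath>
#include <complex>

// Hypothetical illustration of Smith's algorithm for (a + bi) / (c + di).
// The branch on |c| >= |d| keeps the ratio r at magnitude <= 1.
std::complex<double> smith_divide(std::complex<double> num,
                                  std::complex<double> den) {
    const double a = num.real(), b = num.imag();
    const double c = den.real(), d = den.imag();
    if (std::fabs(c) >= std::fabs(d)) {
        const double r = d / c;            // |r| <= 1
        const double denom = c + d * r;    // (c^2 + d^2) / c
        return {(a + b * r) / denom, (b - a * r) / denom};
    }
    const double r = c / d;                // |r| < 1
    const double denom = c * r + d;        // (c^2 + d^2) / d
    return {(a * r + b) / denom, (b * r - a) / denom};
}
```

For example, `smith_divide({1.0, 2.0}, {3.0, 4.0})` takes the second branch and yields approximately 0.44 + 0.08i. This sketch does not handle a zero divisor or non-finite inputs.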