This is intended to solve a problem with lowering atomics in OpenMP and C++ common to AMDGPU and NVPTX.

In OpenCL and CUDA, it is undefined behavior for an atomic instruction to modify an object in thread-private memory. In OpenMP, it is defined. Correspondingly, the hardware does not handle this correctly. For AMDGPU, 32-bit atomics work and 64-bit atomics are silently dropped. We therefore need to codegen this by inserting a runtime address-space check, performing the private case without atomics, and falling back to issuing the real atomic otherwise.

Handle this by introducing metadata intended to be applied to atomicrmw, indicating that the instruction cannot access the forbidden address space. This metadata allows us to avoid the extra check and branch.
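As a hedged sketch of how such a marker could be attached (the metadata kind name below is made up for illustration; it is not the name this change introduces):

```c++
// Illustration only: a frontend that knows an atomicrmw cannot touch the
// forbidden address space could tag it, letting the expansion skip the
// runtime address-space check. The kind string is hypothetical.
#include "llvm/IR/Instructions.h"
#include "llvm/IR/Metadata.h"

using namespace llvm;

static void markAtomicAsNotPrivate(AtomicRMWInst &RMW) {
  LLVMContext &Ctx = RMW.getContext();
  // An empty MDNode is enough to act as a boolean marker on the instruction.
  RMW.setMetadata("amdgpu.no.private.atomic", MDNode::get(Ctx, {}));
}
```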
[APFloat::getSmallest](915df1ae41/llvm/include/llvm/ADT/APFloat.h (L1060))
(and similarly `APFloat::getLargest`)
```c++
APFloat getSmallest(const fltSemantics &Sem, bool Negative = false);
```
return a positive number when the default value for the second argument is used.
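For example, a minimal standalone check of that behaviour (the `IEEEsingle` semantics are chosen purely for illustration, not taken from the quantization code):

```c++
// With the default Negative=false, both getSmallest and getLargest return
// positive values, so a [minScale, maxScale] range built from them is
// strictly positive.
#include "llvm/ADT/APFloat.h"
#include <cassert>

int main() {
  const llvm::fltSemantics &Sem = llvm::APFloat::IEEEsingle();
  double MinScale = llvm::APFloat::getSmallest(Sem).convertToFloat();
  double MaxScale = llvm::APFloat::getLargest(Sem).convertToFloat();
  assert(MinScale > 0.0 && MaxScale > 0.0); // both positive by default
  (void)MinScale;
  (void)MaxScale;
  return 0;
}
```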
Consequently, the check
[QuantTypes.cpp#L325](96f37ae453/mlir/lib/Dialect/Quant/IR/QuantTypes.cpp (L325))
```c++
if (scale <= 0.0 || std::isinf(scale) || std::isnan(scale))
return emitError() << "illegal scale: " << scale;
```
is already covered by the check which follows
[QuantTypes.cpp#L327](96f37ae453/mlir/lib/Dialect/Quant/IR/QuantTypes.cpp (L327))
```c++
if (scale < minScale || scale > maxScale)
return emitError() << "scale out of expressed type range [" << minScale
<< ", " << maxScale << "]";
```
given that the range `[positive-smallest-finite-number, positive-largest-finite-number]` does not include `inf` or `nan`s.
I propose to remove the redundant check. Any suggestions for improving the error message are welcome.
Updated the `HeaderFilterRegex` description to reference
`--header-filter` instead of the incorrect `--header-filter-regex` in
the clang-tidy documentation.
1. This commit adds LLDB_TEST_PLATFORM_URL, LLDB_TEST_SYSROOT, LLDB_TEST_PLATFORM_WORKING_DIR, and LLDB_SHELL_TESTS_DISABLE_REMOTE CMake flags to pass arguments for cross-compilation and remote running of both Shell and API tests.
2. To run Shell tests remotely, it adds 'platform select' and 'platform connect' commands to the %lldb substitution.
3. A 'remote-linux' feature is added to lit to disable tests that fail with remote execution.
4. A separate working directory is assigned to each test to avoid
conflicts during parallel test execution.
5. Remote Shell testing is run only when LLDB_TEST_SYSROOT is set for
building test sources. The recommended compiler for that is Clang.
---------
Co-authored-by: Vladimir Vereschaka <vvereschaka@accesssoftek.com>
[Retry 110696 with a proper rebase.]
Seed collection will assemble instructions to be vectorized into
SeedBundles. This data structure is not intended to be used directly,
but will be the basis for load bundles, store bundles, and so on.
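As a rough sketch of the idea (names and shape are illustrative, not the actual interface): a common container of candidate instructions, with load/store flavours derived from it.

```c++
// Illustration only: a bare-bones shape for a seed bundle, a shared
// container of candidate instructions that load/store bundles build on.
#include <utility>
#include <vector>

struct Instruction; // stand-in for the vectorizer's instruction type

class SeedBundle {
public:
  explicit SeedBundle(std::vector<Instruction *> Seeds)
      : Seeds(std::move(Seeds)) {}
  const std::vector<Instruction *> &seeds() const { return Seeds; }

protected:
  std::vector<Instruction *> Seeds;
};

// The specialized bundles reuse the common storage and add their own logic.
class LoadSeedBundle : public SeedBundle { using SeedBundle::SeedBundle; };
class StoreSeedBundle : public SeedBundle { using SeedBundle::SeedBundle; };
```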
Found by inspecting AMDGPU assembly, so the arithmetic ops created there were definitely making their way into the target ISA. An `LLVM::BitcastOp` seems equivalent, and evaporates as expected in the target asm.
Along the way, I thought that this helper function `mfmaConcatIfNeeded`
could be renamed to `convertMFMAVectorOperand` to better convey its
contract; so I don't need to think about whether a bitcast is a
legitimate "concat" :-)
---------
Signed-off-by: Benoit Jacob <jacob.benoit.1@gmail.com>
We have a dedicated test to check the target-features for FMV (clang/test/CodeGen/aarch64-fmv-dependencies.c), so I am removing the autogenerated checks from irrelevant tests, since the noise makes it harder to review actual codegen changes.
Fix an obvious typo in these tests to get them passing, and also fix the
-Wimplicit-fallthrough warning that fires when trying to build.
Reverting #110616 was tricky because of dependencies, so I'm just doing
the easy fix directly here.
This adds support for the new EH instructions (`try_table` and `throw_ref`) to the type checker.
One thing I'd like to improve is the locations in the errors for `catch_***` clauses. Currently they just point to the starting column of the `try_table` instruction itself. To figure out where catch clauses start, you need to traverse the `OperandVector` and check `WebAssemblyOperand::isCatchList` on each operand to see which one is the catch-list operand, but the `WebAssemblyOperand` class is in AsmParser and AsmTypeCheck does not have access to it:
cdfdc857cb/llvm/lib/Target/WebAssembly/AsmParser/WebAssemblyAsmParser.cpp (L43-L204)
And even if AsmTypeCheck had access to it, it currently treats the list of catch clauses as a single `WebAssemblyOperand`, so there is no way to get the starting location of each `catch_***` clause in the current structure.
This also renames `valTypeToStackType` to `valTypesToStackTypes`, given
that it takes two type lists.
Recently, Solaris bootstrap got broken because Solaris uses a
non-standard mangling of `std::tm` and a few others. This was fixed with
a hack in PR #100724. The Solaris ABI requires mangling `std::tm` as
`tm` and similarly for `std::div_t`, `std::ldiv_t`, and `std::lconv`,
which is what this patch implements. The hack needs to stay in place to
allow building with older versions of `clang`.
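For illustration (not lifted from the patch), here is the kind of difference involved, assuming the platform headers declare `tm` directly inside namespace `std`:

```c++
// Illustration only: the first mangled string is the generic Itanium form
// for a type declared as a member of namespace std; the Solaris ABI instead
// requires std::tm to mangle as the global `tm`, which is what this patch
// teaches the mangler to do.
#include <ctime>

void use(std::tm *);
// Mangling if std::tm is treated as a member of namespace std: _Z3usePSt2tm
// Mangling required by the Solaris ABI (std::tm mangled as `tm`): _Z3useP2tm
```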
Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11` (2-stage
builds with both `clang-19` and `gcc-14` as build compiler), and
`x86_64-pc-linux-gnu`.
This is a part of #109477 that I'm making into its own patch. Here we remove logic from the DYLD that prevents it from running if the main executable already has a load address. Instead we let the DYLD fully determine what should be loaded and what shouldn't.
Reverts llvm/llvm-project#110800
`llvm\test\CodeGen\DirectX\radians.ll` is failing after this change.
@adam-yang please send a new PR with the issue resolved once you've had
time to investigate.
We've noticed that for large builds, executing the thin-link can take on the order of tens of minutes. We are only using a single thread to write the sharded indices and import files for each input bitcode file. While we need to ensure the index file produced lists modules in a deterministic order, that doesn't prevent us from executing the rest of the work in parallel.
In this change we use a thread pool to execute as much of the backend's
work as possible in parallel. In local testing on a machine with 80
cores, this change makes a thin-link for ~100,000 input files run in ~2
minutes. Without this change it takes upwards of 10 minutes.
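As a plain-C++ sketch of the pattern (not the actual LLVM code; the helper names are made up): the order-sensitive module list is still written serially, while the per-module work fans out to worker threads.

```c++
// Illustration only: keep the deterministic index writing serial, and run
// the per-module shard/import-file writes in parallel.
#include <future>
#include <iostream>
#include <string>
#include <vector>

// Stand-ins for the real per-module work.
static void recordModuleInIndex(const std::string &M) {
  std::cout << "index entry: " << M << "\n"; // deterministic, in input order
}
static void writeShardAndImports(const std::string &M) {
  // ... write the sharded summary and import list for M ...
}

void runThinLink(const std::vector<std::string> &Modules) {
  // Order-sensitive output first, on a single thread.
  for (const std::string &M : Modules)
    recordModuleInIndex(M);

  // Order-insensitive output in parallel.
  std::vector<std::future<void>> Jobs;
  for (const std::string &M : Modules)
    Jobs.push_back(std::async(std::launch::async, writeShardAndImports, M));
  for (auto &J : Jobs)
    J.get();
}
```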
---------
Co-authored-by: Nuri Amari <nuriamari@fb.com>
Previously this would fail if the default target enabled the loop
terminator folding pass (currently just RISC-V), as it runs after loop
strength reduction.
This seems to be an artifact from the initial import in 2012, but even if not, folks are better off reading the LICENSE.TXT file for the full details if they need them.

Fixes #109968
Hack around the register scavenger doing the wrong thing. It does not find the result register as available in the case where the frame index add isn't also reading the dest register. This is the quick fix for a regression where the scavenger would create a broken spill of an SGPR to memory. I believe this is still broken for cases where we cannot use the result register.

I'm confused about what position the scavenger iterator is supposed to be in, and what RestoreAfter is for. The scavenger is missing a full set of forward/backward APIs, and there seems to be an off-by-one somewhere.
According to https://developer.arm.com/documentation/102105/latest (Arm Architecture Reference Manual for A-profile architecture: Known issues), issue 2.206 D22789:
In section C5.2.25 "SSBS, Speculative Store Bypass Safe", under the
heading 'Configurations', the text that reads:
"This register is present only when FEAT_SSBS is implemented. Otherwise,
direct accesses to SSBS are UNDEFINED."
is changed to read:
"This register is present only when FEAT_SSBS2 is implemented.
Otherwise, direct accesses to SSBS are UNDEFINED."
This suggests that it's not worth splitting FEAT_SSBS2 from FEAT_SSBS in
the compiler, since FEAT_SSBS cannot be used for predicating the MRS/MSR
instructions. Those can access PSTATE.SSBS only when FEAT_SSBS2 is
available. Moreover, there are no hardware implementations which
implement FEAT_SSBS without FEAT_SSBS2, therefore unifying these
features in the specification should not be a regression for feature
detection.
Approved in ACLE as https://github.com/ARM-software/acle/pull/350
Here I'm splitting up the existing "if" statement into two. Mixing
hasDefinition() and insert() in one "if" condition would be extremely
confusing as hasDefinition() doesn't change anything while insert()
does.
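A hedged, self-contained illustration of the split (all names are hypothetical, not the actual code):

```c++
// Illustration only: the side-effect-free query and the mutating insert()
// no longer share one condition.
#include <set>

struct Decl {
  bool Defined = false;
  bool hasDefinition() const { return Defined; } // pure query, no mutation
};

static std::set<const Decl *> Seen;

void before(const Decl *D) {
  // Confusing: one `if` mixes a query with a call that mutates `Seen`.
  if (!D->hasDefinition() && Seen.insert(D).second) {
    // handle a not-yet-seen declaration without a definition
  }
}

void after(const Decl *D) {
  // Clearer: the query stands alone; the mutation is its own `if`.
  if (!D->hasDefinition()) {
    if (Seen.insert(D).second) {
      // handle a not-yet-seen declaration without a definition
    }
  }
}
```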
This caused assertion failures:
```
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp:7736:
SDValue getMemsetValue(SDValue, EVT, SelectionDAG &, const SDLoc &):
Assertion `C->getAPIntValue().getBitWidth() == 8' failed.
```
See comment on the PR for a reproducer.
> repstosb and repstosd are the same size, but stosd is only done for 0
> because the process of multiplying the constant so that it is copied
> across the bytes of the 32-bit number adds extra instructions that cause
> the size to increase. For 0, we do not need to do that at all.
>
> For memcpy, the same goes, and as a result the minsize check was moved
> ahead because a jmp to memcpy encoded takes more bytes than repmovsb.
This reverts commit 6de5305b3d7a4a19a29b35d481a8090e2a6d3a7e.
This is an initial patch to enable constexpr support on the more basic SSE1 intrinsics, such as initialization, arithmetic, logic and fixed shuffles.

The plan is to incrementally extend this for SSE2/AVX etc., initially for the equivalent basic intrinsics, but we can add support for some of the ia32 builtins as well, as the need arises.
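For example, with a clang that includes this change, usage along these lines should now be accepted (a sketch, not taken from the tests):

```c++
// Illustrative usage: basic SSE1 initialization and arithmetic evaluated at
// compile time. Assumes a clang with this change and SSE enabled.
#include <xmmintrin.h>

constexpr __m128 V = _mm_set_ps(4.0f, 3.0f, 2.0f, 1.0f); // lanes: 1,2,3,4
constexpr __m128 W = _mm_add_ps(V, V);                    // lanes: 2,4,6,8
static_assert(W[0] == 2.0f && W[3] == 8.0f, "constexpr SSE1 arithmetic");
```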
Reverts llvm/llvm-project#108939
When `AVX` is available but `-mprefer-vector-width=128` is set, some of the `mov` instructions turn into the x86 `rep;movsb` instruction, leading to poor performance on "old" architectures (sandybridge, haswell). The possible solutions are: get rid of the `-mprefer-vector-width` option, or use smaller static copy sizes in `inline_memcpy_x86_sse2_ge64_sw_prefetching`. Right now a copy size of 3 cache lines (192B) relying exclusively on xmm registers gets turned into `rep;movsb`.