All non-existing local times in a contiguous range should map to the
same time point. This fixes a bug where the times inside the range were
mapped to the wrong time.
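As a hedged illustration (not from the patch) of the intended behaviour, assuming a tzdb-enabled C++20 `<chrono>` implementation; the zone and date below are arbitrary examples:
```
#include <cassert>
#include <chrono>

int main() {
  using namespace std::chrono;
  // The EU spring-forward gap on 2025-03-30 removes local times in
  // [02:00, 03:00); all of them should resolve to the same sys_time.
  const time_zone *berlin = locate_zone("Europe/Berlin");
  auto a = berlin->to_sys(local_days{2025y / March / 30} + 2h + 1min,
                          choose::earliest);
  auto b = berlin->to_sys(local_days{2025y / March / 30} + 2h + 59min,
                          choose::earliest);
  assert(a == b); // every non-existing time in the gap maps to one point
}
```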
Fixes: #113654
Fixes MLIR lowering passes in x86vector integration tests.
The tests are refactored with a lowering pass bundle which ensures that all
dialects are lowered into the LLVM dialect.
This simplifies the test pipelines and addresses missing arith lowering.
This change folds (setcc ne (lshr x c) 0) for 64-bit types and constants
c >= 32. This fold already existed for other types or smaller constants
but was not applicable to 64-bit types and constants >= 32 due to a
comparison of the constant c with the bit size of the setcc operation.
The type of this operation is legalized to i32, which does not
necessarily match the type of the lshr operation. Use the bit size of
the type of the lshr operation instead for the comparison.
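For context, a rough C++ source-level illustration (not taken from the patch) of where this pattern arises for 64-bit integers:
```
#include <cstdint>

// A 64-bit right shift by a constant >= 32 compared against zero produces the
// (setcc ne (lshr x c) 0) pattern that the fold now also handles.
bool hasBitsAbove40(uint64_t x) {
  return (x >> 40) != 0;
}
```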
Fixes #122380.
In #127378 it was reported that builds without clspv targets enabled
were failing after #124727, as all targets had a dependency on a file
that only clspv targets generated.
A quick fix was merged in #127315 which wasn't correct. It moved the
dependency on those generated files to the spirv targets, instead of
onto the clspv targets. This means a build with spirv targets and
without clspv targets would see the same problems as #127378 reported.
I tried simply removing the requirement to explicitly add dependencies
to the custom command, relying instead on the file-level dependencies.
This didn't seem reliable enough; in some cases on a Makefiles build,
the clang command compiling, e.g., convert.cl would begin before the
file was fully written.
Instead, we keep the target-level dependency but automatically infer it
based on the generated file name, to avoid manual book-keeping of pairs
of files and targets.
This commit also fixes what looks like an unintended bug where, when
ENABLE_RUNTIME_SUBNORMAL was enabled, the OpenCL conversions weren't
being compiled.
- `CodeGen/AMDGPU/spill_more_than_wavesize_csr_sgprs.ll`
- `CodeGen/AMDGPU/call-preserved-registers.ll`
- `CodeGen/AMDGPU/stack-realign.ll`
This is in preparation for another PR.
Concat/unpack the src subvectors together in the bottom 128-bit vector and then extend with a single EXTEND/EXTEND_VECTOR_INREG instruction.
This required the getEXTEND_VECTOR_INREG helper to be tweaked to accept EXTEND_VECTOR_INREG opcodes as well, to avoid having to remap the opcode between both types.
This is a fix for https://github.com/llvm/llvm-project/issues/126949
There are two issues being fixed here.
First, in some cases, OMPIRBuilder generates empty target task proxy functions. This happens
when the target kernel doesn't use any stack-allocated data (either no data or only globals).
The second problem is encountered when the target task, i.e. the code that makes the target call,
spans a single basic block. This usually happens when we do not generate a target or device kernel
launch and instead fall back to the host. In such cases, we end up not outlining the target task entirely.
This can cause us to call the target kernel twice: once via the target task proxy function and a second time
via the host fallback.
This PR fixes both of these problems and updates some tests to catch these problems should this patch fail.
Update VPWidenPHIRecipe to use the predecessors in VPlan to determine
the incoming blocks instead of tracking them separately. This brings
VPWidenPHIRecipe in line with the other phi recipes.
PR: https://github.com/llvm/llvm-project/pull/126388
After #117558 landed, this code would assert "Value is not an N-bit
unsigned value" in getConstant(), from a test case in zig.
Co-authored-by: Craig Topper <craig.topper@sifive.com>
Fixes #127296
This is necessary to enable composing subregisters in peephole-opt.
For now use a brute force table to find the return value. The worst
case target is AMDGPU with a 399 x 399 entry table.
This PR closes #124474.
Calling the `read` or `recv` function on a non-blocking file descriptor
or an invalid file descriptor (`-1`) does not block inside a
critical section.
This commit checks for non-blocking file descriptors assigned by the `open`
function with the `O_NONBLOCK` flag.
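A hedged C++ sketch of the kind of code the checker should now accept without a report (function names and locking style are illustrative only):
```
#include <fcntl.h>
#include <unistd.h>

#include <mutex>

std::mutex m;
char buf[256];

void pollDevice(const char *path) {
  // The descriptor is explicitly non-blocking, so the read below cannot block
  // while the mutex is held; the checker should not warn about it.
  int fd = open(path, O_RDONLY | O_NONBLOCK);
  if (fd < 0)
    return;
  m.lock();                          // enter critical section
  (void)read(fd, buf, sizeof(buf));  // non-blocking read: no report expected
  m.unlock();                        // leave critical section
  close(fd);
}
```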
---------
Co-authored-by: Balazs Benics <benicsbalazs@gmail.com>
Extends generic `loop` directive support by adding support for the `bind`
clause. Since semantic checking does the heavy lifting of verifying the
proper usage of the clause modifier, we can simply enable code-gen for
`teams loop bind(...)` without the need to differentiate between the
values the clause can accept.
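For reference, a hedged C/C++-style OpenMP sketch of the directive shape this enables; the patch itself targets Flang's Fortran lowering, so this is only an illustration of the clause:
```
void saxpy(int n, float a, const float *x, float *y) {
  // Combined construct with an explicit bind clause; semantic checks verify
  // the binding value, so code-gen does not need to special-case it.
  #pragma omp target teams loop bind(teams)
  for (int i = 0; i < n; ++i)
    y[i] = a * x[i] + y[i];
}
```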
These cannot be static compile errors, and should be treated as
poison. Invalid casts may be introduced which are dynamically dead.
For example:
```
void foo(volatile generic int* x) {
  __builtin_assume(is_shared(x));
  *x = 4;
}

void bar() {
  private int y;
  foo(&y); // violation, wrong address space
}
```
This could produce a compile-time backend error or not, depending on
the optimization level. Similarly, the new test demonstrates a failure
on a lowered atomicrmw which required inserting runtime address
space checks. The invalid cases are dynamically dead, so we should not
error, and the AtomicExpand pass shouldn't have to consider the details
of the incoming pointer to produce valid IR.
This should go to the release branch. This fixes broken -O0 compiles
with 64-bit atomics which would have started failing in
1d0370872f28ec9965448f33db1b105addaf64ae.
This fixes a false positive caused by #114044.
For `GSLPointer*` types, it's less clear whether the lifetime issue is
about the GSLPointer object itself or the owner it points to. To avoid
false positives, we take a conservative approach in our heuristic.
Fixes #127195
(This will be backported to release 20).
On non-Windows platforms, also create a dynamic library version of
the runtime. Building either version of the library can be switched on
using FLANG_RT_ENABLE_STATIC=ON and FLANG_RT_ENABLE_SHARED=ON,
respectively. The default is to build only the static library, consistent
with previous behaviour. This is because, given the way the flang driver
invokes the linker, most linkers choose the dynamic library by default
if it is available. Building the dynamic library therefore causes
flang-built executables to depend on `libflang_rt.so`, unless explicitly
told otherwise.
In the soft floating-point ABI, this function takes the double argument as a
pair of registers, r0 and r1.
The ordering of these two registers follows the endianness rules,
therefore the register on which the bit flipping must happen depends on
the endianness.
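A minimal stand-alone C++ sketch of the underlying idea (not the builtin's actual implementation): the IEEE 754 sign bit sits in the most significant 32-bit half, so which half to touch depends on the byte order.
```
#include <cstdint>
#include <cstring>

// Illustration only: flip the sign bit of a double via its two 32-bit halves,
// mirroring how the value sits in the r0/r1 register pair.
double flipSign(double d) {
  uint32_t w[2];
  std::memcpy(w, &d, sizeof d);
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
  w[0] ^= 0x80000000u; // big endian: the high (sign-carrying) word comes first
#else
  w[1] ^= 0x80000000u; // little endian: the high word comes second
#endif
  std::memcpy(&d, w, sizeof d);
  return d;
}
```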
This patch adds support for describing per-write resource cycle counts
for ReadAdvance records via a new optional field called `tunables`.
This makes it possible to declare ReadAdvance records such as:
def : ReadAdvance<Read_C, 1, [Write_A, Write_B], [2]>;
The above will effectively declare two entries in the ReadAdvance
table for Read_C, one for Write_A with a cycle count of 1+2, and one for
Write_B with a cycle count of 1+0 (omitted values are assumed 0).
The field `tunables` provides a list of deltas relative to the base
`cycle` count of the ReadAdvance. Since the field is optional and
defaults to a list of 0's, this change doesn't affect current targets.
The pattern was returning success() by default, which made the greedy
pattern application act as if the IR was modified even though nothing
had changed; this can prevent it from converging for no legitimate
reason.
The patch makes the rewrite pattern return failure() by default, and
success() if and only if the IR changed.
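A hedged C++ sketch of the convention the patch restores; the pattern and helpers below are illustrative, not the actual code:
```
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Illustrative only: a rewrite pattern should return success() if and only if
// it changed the IR; otherwise the greedy driver believes progress was made
// on every iteration and may never converge.
struct SpecializeIfPossible : RewritePattern {
  SpecializeIfPossible(MLIRContext *ctx)
      : RewritePattern(MatchAnyOpTypeTag(), /*benefit=*/1, ctx) {}

  LogicalResult matchAndRewrite(Operation *op,
                                PatternRewriter &rewriter) const override {
    if (!canSpecialize(op))
      return failure();                     // IR untouched: report no match
    rewriteToSpecializedForm(op, rewriter); // mutate the IR
    return success();                       // IR actually changed
  }

  // Hypothetical helpers standing in for the real specialization logic.
  static bool canSpecialize(Operation *op) { return false; }
  static void rewriteToSpecializedForm(Operation *op, PatternRewriter &rewriter) {}
};
```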
An example of the unexpected behavior: running `mlir-opt input.mlir
--linalg-specialize-generic-ops` produces an empty mlir output, with
`input.mlir` as follows:
```
#map = affine_map<(d0) -> (d0)>
func.func @f(%arg0: tensor<8xi32>, %arg1: tensor<8xi32>) -> tensor<8xi32> {
%0 = tensor.empty() : tensor<8xi32>
%1 = linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel"]} ins(%arg0, %arg1: tensor<8xi32>, tensor<8xi32>) outs(%0: tensor<8xi32>) {
^bb0(%in: i32, %in_0: i32, %out: i32):
%2 = arith.addi %in, %in_0: i32
linalg.yield %2: i32
} -> tensor<8xi32>
return %1 : tensor<8xi32>
}
```
This PR adds `cmd-options` to the `gpu-lower-to-nvvm-pipeline` pipeline
and the `nvvm-attach-target` pass, allowing users to pass flags to the
downstream compiler, *ptxas*.
Example:
```
mlir-opt -gpu-lower-to-nvvm-pipeline="cubin-chip=sm_80 ptxas-cmd-options='-v --register-usage-level=8'"
```
Moves `PackOp` and `UnPackOp` from the Tensor dialect to Linalg. This change
was discussed in the following RFC:
* https://discourse.llvm.org/t/rfc-move-tensor-pack-and-tensor-unpack-into-linalg
This change involves significant churn but only relocates existing code - no new
functionality is added.
**Note for Downstream Users**
Downstream users must update references to `PackOp` and `UnPackOp` as follows:
* Code: `s/tensor::(Un)PackOp/linalg::(Un)PackOp/g`
* Tests: `s/tensor.(un)pack/linalg.(un)pack/g`
No other modifications should be required.
`computeStaticLoopSizes()` is functionally identical to `getStaticLoopRanges()`.
Replace all uses of `computeStaticLoopSizes()` by `getStaticLoopRanges()` and remove the former.
My previous attempt (#126913) at fixing the flaky case was on a good
track when I used the begin locations as a stable ordering. However, I
forgot to consider the case when the begin locations are the same among
the Exprs.
In an `EXPENSIVE_CHECKS` build, arrays are randomly shuffled prior to
sorting them. This exposed the flaky behavior much more often, basically
breaking the "stability" of the vector - as it should.
Because of this, I had to revert the previous fix attempt in #127034.
To fix this, this time I use `Expr::getID` as a stable ID for an Expr.
Hopefully fixes #126619
Hopefully fixes #126804
These relocations apply to 16-bit Thumb instructions, so reading 16 bits
rather than 32 bits ensures the correct bits are masked and written
back. This fixes the incorrect masking and aligns the relocation logic
with the instruction encoding.
Before this patch, 32 bits were read from the ELF object. This did not
align with the instruction size of 16 bits, but the masking incidentally
made it all work nonetheless. However, this was the case only in
little-endian mode.
In big endian mode, the read 32-bit word had to have its bytes reversed.
With this byte reordering, the masking would be applied to the wrong
bits, hence causing the incorrect encoding to be produced as a result of
the relocation resolution.
The added test checks the result for both little and big endian modes.
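A simplified C++ sketch (not the actual relocation-resolution code) of the halfword-sized read/modify/write this change switches to, using LLVM's endian helpers:
```
#include "llvm/Support/Endian.h"

#include <cstdint>

using namespace llvm;

// Sketch only: apply a relocation to a 16-bit Thumb instruction in place.
// Reading exactly one halfword keeps the mask aligned with the encoding for
// both little- and big-endian objects; a 32-bit read only happened to work
// in little-endian mode.
static void patchThumb16(uint8_t *loc, uint16_t mask, uint16_t bits,
                         endianness e) {
  uint16_t insn = support::endian::read<uint16_t>(loc, e);
  insn = static_cast<uint16_t>((insn & ~mask) | (bits & mask));
  support::endian::write<uint16_t>(loc, insn, e);
}
```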
`let constructor` is deprecated since the TableGen backend emits most
of the glue logic to build a pass. This PR retires the td method for
most passes in the Conversion directory (I need another pass for the rest).
This patch adds initial support for vectorizing literal struct return
values. Currently, this is limited to the case where the struct is
homogeneous (all elements have the same type) and not packed. The users
of the call also must all be `extractvalue` instructions.
The intended use case for this is vectorizing intrinsics such as:
```
declare { float, float } @llvm.sincos.f32(float %x)
```
Mapping them to structure-returning library calls such as:
```
declare { <4 x float>, <4 x float> } @Sleef_sincosf4_u10advsimd(<4 x float>)
```
Or their widened form (such as `@llvm.sincos.v4f32` in this case).
Implementing this required two main changes:
1. Supporting widening `extractvalue`
2. Adding support for vectorized struct types in LV
* This is mostly limited to parts of the cost model and scalarization
Since the supported use case is narrow, the required changes are
relatively small.
The last use was removed in:
commit ac9e67756e0157793d565c2cceaf82e4403f58ba
Author: Yingwei Zheng <dtcxzyw2333@gmail.com>
Date: Mon Feb 26 01:53:16 2024 +0800
Enable the vectorizer to access interleaved memory. This means that,
when it's decided to be profitable, the memory accesses can be
vectorized instead of the value being built up by a sequence of
load_lane instructions. This will often increase the vectorization
factor of the loop, leading to significantly better performance.
I ran a reasonably large collection of benchmarks and most are not
affected by this change, with most performance changes <1%. But I see a
2.5% speedup for the total run time of TSVC, 1% speedup for SPEC2017
x265, 28% speedup for a ResNet workload and 95% for libyuv. This is
running V8 on an AArch64 box.
When lowering EXTEND_VECTOR_INREG, check whether the operand is a
shuffle that is moving the top half of a vector into the lower half. If
so, we can EXTEND_HIGH the input to the shuffle instead.