llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 23:36:35 +00:00

Author	SHA1	Message	Date
Michael Kruse	4c4fc4650f	[Flang-RT] Build libflang_rt.so (#121782 ) Under non-Windows platforms, also create a dynamic library version of the runtime. Build of either version of the library can be switched on using FLANG_RT_ENABLE_STATIC=ON respectively FLANG_RT_ENABLE_SHARED=ON. Default is to build only the static library, consistent with previous behaviour. This is because the way the flang driver invokes the linker, most linkers choose the dynamic library by default, if available. Building the dynamic library therefore causes flang-built executables to depend on `libflang_rt.so`, unless explicitly told otherwise.	2025-02-17 12:53:12 +01:00
Victor Campos	43d308dd0d	[compiler-rt] Add support for big endian for Arm's __negdf2vfp (#127096 ) In soft floating-point ABI, this function takes the double argument as a pair of registers r0 and r1. The ordering of these two registers follow the endianness rules, therefore the register on which the bit flipping must happen depends on the endianness.	2025-02-17 11:43:36 +00:00
Ricardo Jesus	80b08d1bb8	[TableGen] Add support for per-write cycle tunables (#125870 ) This patch adds support for describing per-write resource cycle counts for ReadAdvance records via a new optional field called `tunables`. This makes it possible to declare ReadAdvance records such as: def : ReadAdvance<Read_C, 1, [Write_A, Write_B], [2]>; The above will effectively declare two entries in the ReadAdvance table for Read_C, one for Write_A with a cycle count of 1+2, and one for Write_B with a cycle count of 1+0 (omitted values are assumed 0). The field `tunables` provides a list of deltas relative to the base `cycle` count of the ReadAdvance. Since the field is optional and defaults to a list of 0's, this change doesn't affect current targets.	2025-02-17 11:32:47 +00:00
gdehame	a177be5528	[mlir][Linalg] Bugfix in decompose generic by unfolding permutation (#126737 ) The pattern was returning success() by default which made the greedy pattern application act as if the IR was modified and even though nothing was changed and thus it can prevent it from converging for no legitimate reason. The patch makes the rewrite pattern return failure() by default and success() if and only if the IR changed. An example of unexpected behavior is by running `mlir-opt input.mlir --linalg-specialize-generic-ops`, we obtain an empty mlir as output with `input.mlir` as follows: ``` #map = affine_map<(d0) -> (d0)> func.func @f(%arg0: tensor<8xi32>, %arg1: tensor<8xi32>) -> tensor<8xi32> { %0 = tensor.empty() : tensor<8xi32> %1 = linalg.generic {indexing_maps = [#map, #map, #map], iterator_types = ["parallel"]} ins(%arg0, %arg1: tensor<8xi32>, tensor<8xi32>) outs(%0: tensor<8xi32>) { ^bb0(%in: i32, %in_0: i32, %out: i32): %2 = arith.addi %in, %in_0: i32 linalg.yield %2: i32 } -> tensor<8xi32> return %1 : tensor<8xi32> } ```	2025-02-17 11:20:58 +00:00
Guray Ozen	837b89fc0f	[MLIR][NVVM] Add `ptxas-cmd-options` to pass flags to the downstream compiler (#127457 ) This PR adds `cmd-options` to the `gpu-lower-to-nvvm-pipeline` pipeline and the `nvvm-attach-target` pass, allowing users to pass flags to the downstream compiler, ptxas. Example: ``` mlir-opt -gpu-lower-to-nvvm-pipeline="cubin-chip=sm_80 ptxas-cmd-options='-v --register-usage-level=8'" ```	2025-02-17 12:09:27 +01:00
Jonas Paulsson	02c44ce6c6	Reformat reglists in SystemZMCTargetDesc.cpp (NFC) (#127472 )	2025-02-17 11:55:09 +01:00
Andrzej Warzyński	517800e37e	[mlir][tensor][linalg] Move Pack/UnPack Ops to Linalg (#123902 ) Moves `PackOp` and `UnPackOp` from the Tensor dialect to Linalg. This change was discussed in the following RFC: * https://discourse.llvm.org/t/rfc-move-tensor-pack-and-tensor-unpack-into-linalg This change involves significant churn but only relocates existing code - no new functionality is added. Note for Downstream Users Downstream users must update references to `PackOp` and `UnPackOp` as follows: * Code: `s/tensor::(Up)PackOp/linalg::(Un)PackOp/g` * Tests: `s/tensor.(un)pack/linalg.(un)pack/g` No other modifications should be required.	2025-02-17 10:44:27 +00:00
Simon Pilgrim	9d24f94379	[X86] combineConcatVectorOps - remove duplicate DAG.getContext() call. NFC.	2025-02-17 10:38:23 +00:00
Timm Baeder	f09fd94d6b	[clang][bytecode] Restructure Program::CurrentDeclaration handling (#127456 ) Properly reset to the last ID and return the current ID from getCurrentDecl().	2025-02-17 11:24:43 +01:00
josel-amd	b3028295e7	[mlir][linalg] Remove `computeStaticLoopSizes` (#124778 ) `computeStaticLoopSizes()` is functionally identical to `getStaticLoopRanges()`. Replace all uses of `computeStaticLoopSizes()` by `getStaticLoopRanges()` and remove the former.	2025-02-17 11:23:27 +01:00
Balazs Benics	f378e52ed3	[clang][analysis] Fix flaky clang/test/Analysis/live-stmts.cpp test (2nd attempt) (#127406 ) In my previous attempt (#126913) of fixing the flaky case was on a good track when I used the begin locations as a stable ordering. However, I forgot to consider the case when the begin locations are the same among the Exprs. In an `EXPENSIVE_CHECKS` build, arrays are randomly shuffled prior to sorting them. This exposed the flaky behavior much more often basically breaking the "stability" of the vector - as it should. Because of this, I had to revert the previous fix attempt in #127034. To fix this, I use this time `Expr::getID` for a stable ID for an Expr. Hopefully fixes #126619 Hopefully fixes #126804	2025-02-17 11:12:55 +01:00
Victor Campos	501c77da60	[LLD][ELF][ARM] Fix resolution of R_ARM_THM_JUMP8 and R_ARM_THM_JUMP11 for big endian (#126933 ) These relocations apply to 16-bit Thumb instructions, so reading 16 bits rather than 32 bits ensures the correct bits are masked and written back. This fixes the incorrect masking and aligns the relocation logic with the instruction encoding. Before this patch, 32 bits were read from the ELF object. This did not align with the instruction size of 16 bits, but the masking incidentally made it all work nonetheless. However, this was the case only in little endian. In big endian mode, the read 32-bit word had to have its bytes reversed. With this byte reordering, the masking would be applied to the wrong bits, hence causing the incorrect encoding to be produced as a result of the relocation resolution. The added test checks the result for both little and big endian modes.	2025-02-17 10:10:35 +00:00
lorenzo chelini	c1a2292526	[MLIR][NFC] Retire `let constructor` for passes in Conversion directory (part1) (#127403 ) `let constructor` is deprecated since the table gen backend emits most of the glue logic to build a pass. This PR retires the td method for most (I need another pass) passes in the Conversion directory.	2025-02-17 10:55:27 +01:00
Benjamin Maxwell	e0e67a6207	[LV] Add initial support for vectorizing literal struct return values (#109833 ) This patch adds initial support for vectorizing literal struct return values. Currently, this is limited to the case where the struct is homogeneous (all elements have the same type) and not packed. The users of the call also must all be `extractvalue` instructions. The intended use case for this is vectorizing intrinsics such as: ``` declare { float, float } @llvm.sincos.f32(float %x) ``` Mapping them to structure-returning library calls such as: ``` declare { <4 x float>, <4 x float> } @Sleef_sincosf4_u10advsimd(<4 x float>) ``` Or their widened form (such as `@llvm.sincos.v4f32` in this case). Implementing this required two main changes: 1. Supporting widening `extractvalue` 2. Adding support for vectorized struct types in LV * This is mostly limited to parts of the cost model and scalarization Since the supported use case is narrow, the required changes are relatively small.	2025-02-17 09:51:35 +00:00
Haojian Wu	262e4c1987	Revert "[clang][Modules] Remove a resloved issue from StandardCPlusPlusModules.rst" This reverts commit 82dc2d403066a84ef0051b06f1d179e00331f319. The fix has been reverted in f63e8ed16ef1fd2deb80cd88b5ca9d5b631b1c36	2025-02-17 10:47:43 +01:00
Haojian Wu	82dc2d4030	[clang][Modules] Remove a resloved issue from StandardCPlusPlusModules.rst The issue has been fixed in https://github.com/llvm/llvm-project/pull/122726	2025-02-17 10:46:11 +01:00
Kazu Hirata	d49776634e	[Hexagon] Avoid repeated map lookups (NFC) (#127447 )	2025-02-17 01:32:47 -08:00
Kazu Hirata	fb14638817	[DebugInfo] Avoid repeated hash lookups (NFC) (#127446 )	2025-02-17 01:32:25 -08:00
Kazu Hirata	ff4e21fccc	[clang-tidy] Avoid repeated hash lookups (NFC) (#127444 )	2025-02-17 01:31:52 -08:00
Kazu Hirata	b9c6d3ed26	[clang-linker-wrapper] Avoid repeated hash lookups (NFC) (#127443 )	2025-02-17 01:31:24 -08:00
Kazu Hirata	e0545b5c6d	[Analysis] Remove getGuaranteedWellDefinedOps (#127453 ) The last use was removed in: commit ac9e67756e0157793d565c2cceaf82e4403f58ba Author: Yingwei Zheng <dtcxzyw2333@gmail.com> Date: Mon Feb 26 01:53:16 2024 +0800	2025-02-17 01:22:39 -08:00
Sam Parker	ea7897a617	[WebAssembly] Enable interleaved memory accesses (#125696 ) Enable the vectorizer to access interleaved memory. This means that, when it's decided to be profitable, the memory accesses can be vectorized instead of the value being built up by a sequence of load_lane instructions. This will often increase the vectorization factor of the loop, leading to significantly better performance. I run a reasonably large collection of benchmarks and most are not affected by this change, with most performance changes <1%. But I see a 2.5% speedup for the total run time of TSVC, 1% speedup for SPEC2017 x265, 28% speedup for a ResNet workload and 95% for libyuv. This is running V8 on an AArch64 box.	2025-02-17 09:09:52 +00:00
Sam Parker	948a8477c6	[WebAssembly] Recognise EXTEND_HIGH (#123325 ) When lowering EXTEND_VECTOR_INREG, check whether the operand is a shuffle that is moving the top half of a vector into the lower half. If so, we can EXTEND_HIGH the input to the shuffle instead.	2025-02-17 09:04:29 +00:00
Simon Pilgrim	94585dc59d	[X86] Add test coverage for #116931	2025-02-17 09:03:26 +00:00
Donát Nagy	6684a5970e	[analyzer][NFC] Trivial cleanup in ArrayBoundChecker (#126941 ) Two small stylistic improvements in code that I wrote ~a year ago: 1. fix a typo in a comment; and 2. simplify the code of `tryDividePair` by swapping the true and the false branches.	2025-02-17 09:37:29 +01:00
Simon Pilgrim	b16ce8fc24	[X86] getFauxShuffleMask - match 256-bit CONCAT(SUB0, SUB1) 64-bit elt patterns as well as 512-bit (#127392 ) The 512-bit filter was to prevent AVX1/2 regressions, but most of that is now handled by canonicalizeShuffleWithOp Ideally we need to support smaller element widths as well. Noticed while triaging #116931	2025-02-17 08:20:25 +00:00
Kazu Hirata	153dd19e30	[SelectionDAG] Remove lowerCallToExternalSymbol (#127408 ) The last use was removed in: commit 05e6bb40ebfd285cc87f7ce326b7ba76c3c7f870 Author: Roger Ferrer Ibáñez <rofirrim@gmail.com> Date: Thu May 30 14:55:32 2024 +0200	2025-02-17 00:06:48 -08:00
Kazu Hirata	86d82228a5	[dsymutil] Avoid repeated hash lookups (NFC) (#127449 )	2025-02-16 23:44:26 -08:00
Kazu Hirata	de06978ebc	[AMDGPU] Avoid repeated hash lookups (NFC) (#127445 )	2025-02-16 23:24:15 -08:00
Timm Baeder	36f8c8b438	[clang][bytecode] Fix rejecting non-constexpr array ctors (#127448 ) We shouldn't abort here when compiling, this is happening (and properly diagnosed) when interpreting the bytecode.	2025-02-17 08:06:12 +01:00
Kazu Hirata	02d4aac55c	[AMDGPU] Remove materializeImmediate (#127420 ) The lase use was removed in: commit cbf34a5f7701148d68951320a72f483849b22eaf Author: Juan Manuel Martinez Caamaño <jmartinezcaamao@gmail.com> Date: Fri Aug 23 14:06:17 2024 +0200	2025-02-16 22:47:14 -08:00
Timm Baeder	f1627e1a9e	[clang][bytecode][NFC] Move reduced libcxx tests to a subdir (#127438 )	2025-02-17 07:02:54 +01:00
Vikram Hegde	06a3abd9e8	[AMDGPU][NewPM] Port "SIFormMemoryClauses" to NPM (#127181 )	2025-02-17 11:07:17 +05:30
Timm Baeder	c3cae9d6fc	[clang][bytecode] Fix const-ness of local primitive temporary (#127405 ) This used to cause certain std::range tests in libc++ to be diagnosed as modifying a const-qualified field, because we set the IsConst flag to true unconditionally. Check the type instead.	2025-02-17 06:24:30 +01:00
Craig Topper	9b7282e545	[RISCV] Recognize de-interleave shuffles with 2 sources. (#127272 ) We can use vnsrl+trunc on each source and concatenate the results with vslideup. For low LMUL it would be better to concat first, but I'm leaving this for later.	2025-02-16 20:40:09 -08:00
Kazu Hirata	5d62a79bb7	[Serialization] Remove getMacroID (#127413 ) The last use was removed in: commit ee977933f7df9cef13cc06ac7fa3e4a22b72e41f Author: Richard Smith <richard-llvm@metafoo.co.uk> Date: Fri May 1 21:22:17 2015 +0000	2025-02-16 20:03:34 -08:00
Owen Pan	885382f437	[clang-format] Fix a bug in annotating braces (#127306 ) Fixes #107616.	2025-02-16 19:30:33 -08:00
Uday Bondhugula	69f3e003bf	[MLIR] NFC. Refactor IntegerRelation getSliceBounds (#127308 ) Refactor FlatLinearConstraints getSliceBounds. The method was too long and nested. NFC.	2025-02-17 08:58:06 +05:30
Craig Topper	9e8cd733c2	[Mips] Use MCRegister. NFC Use id() to get rid of some implicit conversions.	2025-02-16 19:10:49 -08:00
Longsheng Mou	ecb7f5aaee	[mlir][linalg] Update docs for `linalg.generic`(NFC) (#127178 ) The mixed tensor/buffer semantics has been disallowed in #80660. Closes #124090.	2025-02-17 09:29:56 +08:00
Florian Mayer	a7a02083ac	[flang] Assert the Options fit into the storage bits (#126169 )	2025-02-16 15:03:59 -08:00
Craig Topper	26fc2e90fc	[Mips] Use MCRegisterClass::getRegister() instead of begin()+RegNo. NFC	2025-02-16 14:00:51 -08:00
Craig Topper	d150101160	[Hexagon] Use MCRegister. NFC	2025-02-16 14:00:51 -08:00
Alex Richardson	01b7e65c91	[FreeBSD] Fix comparison in f75126eeabba13ce2aab53c2e4296fca12b9da0d We have to compare the string contents and not the const char* pointer. This happened to work in my testing but is not reliable.	2025-02-16 13:45:50 -08:00
dong-miao	7817045e5c	[RISCV] Support [mh]edelegh CSRs (#121634 ) These RV32-only CSRs are defined in privileged spec v1.13.	2025-02-16 13:41:46 -08:00
Alexander Richardson	f75126eeab	[FreeBSD] Support -stdlib=libstdc++ The experimental-library-flag.cpp test was failing on FreeBSD builders, which turned to be caused by missing support for -stdlib=libcstdc++ (and just using a hardcoded libc++ in all cases). Simplify FreeBSD::AddCXXStdlibLibArgs() by deferring to the parent class and dealing with the FreeSBD < 14 profiling support as a special case. While touching the test file also drop the unnecessary `-o %t.o`. This is not needed since the RUN lines use -### and don't produce any output. Reviewed By: DimitryAndric, MaskRay Pull Request: https://github.com/llvm/llvm-project/pull/126302	2025-02-16 12:32:51 -08:00
Florian Hahn	b4f91b007f	[LV] Use IRBuilder::insert to insert VPWidenRecipe (NFC).	2025-02-16 21:25:07 +01:00
Maksim Levental	6273877224	[lld] enable installing lld headers and libraries as part of distribution (#127123 ) This patch allows `lld-headers` and `lld-libraries` in `LLVM_DISTRIBUTION_COMPONENTS` to be specified and thus enable piecewise installation of `lld/*/.h` headers and/or lld libraries (both in shared and static builds). This is similar to use cases such as `clang;clang-headers;clang-libraries`. Note when `lld-libraries` is present, `llvm-libraries` must be present as well because various lld libraries depend on various llvm libraries.	2025-02-16 15:18:09 -05:00
Fangrui Song	c22d84f7bb	[ELF] Refine ctx.arg.exportDynamic condition --export-dynamic should be a no-op when ctx.hasDynsym is false. * Drop unneeded ctx.hasDynsym checks. * Static linking with --export-dynamic does not prevent devirtualization.	2025-02-16 12:12:00 -08:00
Florian Hahn	f5d63ccb22	[LICM] Add test with deref assumption of GEP.	2025-02-16 20:55:22 +01:00

1 2 3 4 5 ...

527569 Commits