Update the `generate-tests.py` script to create new tests for `atomicrmw
{fadd,fmin,fmax}` and test these with `half`, `float`, `bfloat` and
`double`.
Generate the fp auto-tests to check codegen both with and without
`+lsfe`, so that when #125686 is merged, the `+lsfe` runs will use a
single atomic floating-point instruction.
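For reference, the operations under test look like this when emitted through the C++ IRBuilder API (a minimal sketch; the actual tests are textual IR produced by the script, and the names below are illustrative):

```cpp
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

// Emit the three atomicrmw variants the generated tests cover; Ptr and
// Val stand in for a pointer operand and an fp value of the tested type
// (half, float, bfloat or double).
void emitAtomicFPOps(IRBuilder<> &B, Value *Ptr, Value *Val) {
  B.CreateAtomicRMW(AtomicRMWInst::FAdd, Ptr, Val, MaybeAlign(),
                    AtomicOrdering::SequentiallyConsistent);
  B.CreateAtomicRMW(AtomicRMWInst::FMin, Ptr, Val, MaybeAlign(),
                    AtomicOrdering::SequentiallyConsistent);
  B.CreateAtomicRMW(AtomicRMWInst::FMax, Ptr, Val, MaybeAlign(),
                    AtomicOrdering::SequentiallyConsistent);
}
```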
A collection of fixes to the mesh dialect
- allow constants in sharding propagation/spmdization
- fix tensor replication (e.g. 0d tensors)
- improve canonicalization
- fix sharding propagation incorrectly generating too many ShardOps
A new operation, `mesh.GetShardOp`, enables exchanging sharding
information (e.g. on function boundaries)
`arith.bitcast` is allowed on memrefs, and such code can actually be
generated by IREE's `ConvertBf16ArithToF32Pass`.
`LLVM::detail::vectorOneToOneRewrite` doesn't properly check its types
and will generate a bitcast between structs, which is illegal.
With opaque pointers the bitcast is a no-op for memrefs, so we can just
add a type check in `LLVM::detail::vectorOneToOneRewrite` and add a
separate pattern which removes the op if the converted types are the
same.
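The extra cleanup pattern could look roughly like this (a sketch against the MLIR C++ pattern API; the class name is illustrative, not the patch's actual code):

```cpp
#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
#include "mlir/IR/PatternMatch.h"

using namespace mlir;

// Removes an LLVM dialect bitcast whose converted source and result
// types are identical, i.e. a no-op under opaque pointers.
struct EraseTrivialBitcast : OpRewritePattern<LLVM::BitcastOp> {
  using OpRewritePattern::OpRewritePattern;

  LogicalResult matchAndRewrite(LLVM::BitcastOp op,
                                PatternRewriter &rewriter) const override {
    if (op.getArg().getType() != op.getType())
      return failure();
    rewriter.replaceOp(op, op.getArg());
    return success();
  }
};
```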
This commit adds the new analyzer option
`assume-at-least-one-iteration`, which is `false` by default, but can be
set to `true` to ensure that the analyzer always assumes at least one
iteration in loops.
In some situations this "loop is skipped" execution path is an important
corner case that may evade the notice of the developer and hide
significant bugs -- however, there are also many situations where it's
guaranteed that at least one iteration will happen (e.g. some data
structure is always nonempty), but the analyzer cannot realize this and
will produce false positives when it assumes that the loop is skipped.
This commit refactors some logic around the implementation of the new
feature, but the only functional change is introducing the new analyzer
option. If the new option is left in its default state (false), then the
analysis is functionally equivalent to an analysis done with a version
before this commit.
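As a hypothetical illustration (not from the patch), this is the shape of code where the skipped-loop assumption produces a false positive:

```cpp
#include <vector>

// Callers guarantee `v` is nonempty, but the analyzer cannot know that:
// it also explores the path where the loop body never runs, on which
// `denom` stays 0 and the division below is flagged as a division by
// zero -- a false positive.
int average(const std::vector<int> &v) {
  int sum = 0;
  int denom = 0;
  for (int x : v) {
    sum += x;
    ++denom;
  }
  return sum / denom;
}
```

With the option enabled (e.g. via `-analyzer-config assume-at-least-one-iteration=true`), the zero-iteration path is not explored and the report disappears.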
lowerShuffleAsBroadcast only matches a known-splat shuffle mask, but we
can use the isShuffleEquivalent/IsElementEquivalent helpers to attempt
to find a hidden broadcastable shuffle pattern.
This requires an extension to IsElementEquivalent to peek through
bitcasts to match against wider shuffles - these typically appear
during shuffle lowering, where we've widened a preceding shuffle, often
into a vector concatenation etc.
Amazingly I hit this while yak shaving #126033...
We currently handle sequences of fixed-length arrays properly by **not**
emitting length parameters for `embox` ops inside the `omp.private` op.
However, we do not handle the scalar case. This PR extends
`getLengthParameters` defined in `PrivateReductionUtils.cpp` to handle
such cases.
Fixes issue reported in #125732.
The command already supported disassembling multiple ranges, among
other reasons because inline functions can be discontinuous. The main
thing that was missing was the ability to retrieve the function ranges
from the top-level function object.
The output of the command for the case where the function entry point is
not its lowest address is somewhat confusing (we're showing negative
offsets), but it is correct.
The LLVM dialect lowers globals using IRBuilder, relying on it creating
constant expressions where possible. As we remove support for more
constant expressions (per
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179),
this can cause issues for cases where the constant expression is no
longer supported, and the operation cannot be constant folded without
DataLayout being available. In particular, I ran into this issue with
flang and the removal of mul constant expressions.
Address this by using TargetFolder when creating globals, which will
perform DL-aware constant folding. I think it would make sense to also
do this in general, but I'm starting with globals where not doing this
can result in translation failures.
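Concretely, the change amounts to constructing the builder with a DataLayout-aware folder; a minimal sketch of the technique (not the exact translation code):

```cpp
#include "llvm/Analysis/TargetFolder.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"

using namespace llvm;

// Instead of the default ConstantFolder, instantiate IRBuilder with
// TargetFolder so that expressions like mul-of-constants are folded
// using the module's DataLayout rather than emitted as constant exprs.
void buildWithTargetFolder(Module &M) {
  const DataLayout &DL = M.getDataLayout();
  IRBuilder<TargetFolder> Builder(M.getContext(), TargetFolder(DL));
  // ... use Builder to build global initializers as usual ...
}
```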
Ideally, globals with these problematic expressions would never be
generated in the first place, but there has been little movement on
fixing this (https://github.com/llvm/llvm-project/issues/96047).
This is typically handled by SimplifyDemandedVectorElts, but that will
fail when there are multiple uses of the subv_broadcast_load node. If
there's just one use of the load result (and the rest are uses of the
memory chain), we can still replace it with a load and update the chain
accordingly.
Noticed on #126517
This commit moves the implementations of conversion builtins to the CLC
library. It keeps the dichotomy of regular vs. clspv implementations of
the conversions. However, for the sake of a consistent interface all CLC
conversion routines are built, even the ones that clspv opts out of in
the user-facing OpenCL layer.
It simultaneously updates the Python script to use f-strings for
formatting.
Add pretty printer/parser for fir.call argument/result attributes and
propagate them to llvm.call.
This will allow implementing the TODO about ABI-relevant argument
attributes in indirect calls.
Use LLVM's getMainExecutable() helper instead of rolling our own. This
will result in standard behavior across platforms, such as making sure
that symlinks are always resolved.
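For reference, the usual calling pattern looks like this (a sketch; the surrounding function is illustrative):

```cpp
#include "llvm/Support/FileSystem.h"

#include <cstdint>
#include <string>

// getMainExecutable needs the address of some symbol defined in the
// main executable to locate it; the enclosing function works.
static std::string getExecutablePath(const char *Argv0) {
  void *MainAddr = (void *)(intptr_t)&getExecutablePath;
  return llvm::sys::fs::getMainExecutable(Argv0, MainAddr);
}
```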
- This reverts commit
0c6c4a9993.
- Add '-mcode-object-version=5' to explicitly use code object version
5, matching the 'FAIL' diagnostic.
- Add a REQUIRES directive so the lit test runs on platforms registered
with x86_64 and amdgpu.
They have the same semantics. NonUniqueID is friendlier for the
isUnique implementation in MCSectionELF.
History: 97837b7 added support for unique IDs in sections and added
GenericSectionID. Later, 1dc16c7 added NonUniqueID.
This has been deprecated since a479be0f39a3301e9ca634d37cf6454b6d3865c6
from September 2023, before LLVM 18. Surely enough release cycles have
now passed that it can be removed upstream.
This pull request is the second part of an ongoing effort to extend PGO
instrumentation to GPU device code and depends on #76587. This PR makes
the following changes:
- Introduces `__llvm_write_custom_profile` to PGO compiler-rt library.
This is an external function that can be used to write profiles with
custom data to target-specific files.
- Adds `__llvm_write_custom_profile` as weak symbol to libomptarget so
that it can write the collected data to a profraw file.
- Adds a `PGODump` debug flag and only displays the dump when that
flag is set
This is another attempt at #88496 to keep mask operands in SSA after
instruction selection.
Previously we selected the mask operands into vmv0, a singleton register
class with exactly one register, V0.
But the register allocator doesn't really support singleton register
classes and we ran into errors like "ran out of registers during
register allocation in function".
This patch avoids the problem by introducing a pass just before
register allocation that converts any use of vmv0 to a copy to $v0,
i.e. what isel does today.
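In MachineFunction terms, the conversion amounts to something like the following (a simplified sketch with assumed helper parameters, not the actual pass):

```cpp
#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/TargetInstrInfo.h"

using namespace llvm;

// For every use of a vmv0-class virtual register, insert a COPY into
// the physical register V0 right before the user and rewrite the
// operand to read V0 directly, i.e. what isel used to emit itself.
static void eliminateVMV0Uses(MachineFunction &MF, const TargetInstrInfo &TII,
                              const TargetRegisterClass &VMV0RC,
                              MCRegister V0) {
  MachineRegisterInfo &MRI = MF.getRegInfo();
  for (MachineBasicBlock &MBB : MF)
    for (MachineInstr &MI : MBB)
      for (MachineOperand &MO : MI.uses()) {
        if (!MO.isReg() || !MO.getReg().isVirtual())
          continue;
        if (MRI.getRegClass(MO.getReg()) != &VMV0RC)
          continue;
        BuildMI(MBB, MI, MI.getDebugLoc(), TII.get(TargetOpcode::COPY), V0)
            .addReg(MO.getReg());
        MO.setReg(V0);
      }
}
```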
That way the register allocator doesn't need to deal with the singleton
register class, but we get the benefits of having the mask registers in
SSA throughout the backend:
- This allows RISCVVLOptimizer to reduce the VLs of instructions that
define mask registers
- It enables CSE and code sinking in more places
- It removes the need to peek through mask copies in RISCVISelDAGToDAG
and keep track of V0 defs in RISCVVectorPeephole
This patch initially eliminates uses of vmv0s after RISCVVectorPeephole
to keep the diff to a minimum, and a follow-up patch will move it past
the other MachineInstr SSA passes.
Note that it doesn't try to remove any defs of vmv0, since no
instructions should produce vmv0 outputs.
As a further follow up, we can move the elimination pass to after phi
elimination and outside of SSA, which would unblock the pre-RA scheduler
around masked pseudos. This might also help the issue that
RISCVVectorMaskDAGMutation tries to solve.
This patch adds a function, handlePairwiseShadowOrIntrinsic, that ORs
pairs of adjacent shadow values; this is suitable for propagating shadow
for 1- or 2-vector intrinsics that combine adjacent fields. It then
applies handlePairwiseShadowOrIntrinsic to the Arm NEON pairwise adds:
llvm.aarch64.neon.{addhn, raddhn} (currently incorrectly handled) and
llvm.aarch64.neon.{saddlp, uaddlp} (currently suboptimally handled).
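The shadow computation itself boils down to splitting the operand's shadow into even and odd lanes and OR-ing them; a simplified sketch of that idea (the helper name and signature are illustrative, not the patch's actual code):

```cpp
#include "llvm/IR/IRBuilder.h"

#include <vector>

using namespace llvm;

// Given the shadow of a <2N x iK> operand, compute the shadow of a
// result whose element i combines operand elements 2i and 2i+1:
// shadow(result[i]) = shadow(op[2i]) | shadow(op[2i+1]).
Value *pairwiseShadowOr(IRBuilder<> &B, Value *Shadow, unsigned NumPairs) {
  std::vector<int> EvenMask, OddMask;
  for (unsigned i = 0; i != NumPairs; ++i) {
    EvenMask.push_back(2 * i);
    OddMask.push_back(2 * i + 1);
  }
  Value *Even = B.CreateShuffleVector(Shadow, EvenMask);
  Value *Odd = B.CreateShuffleVector(Shadow, OddMask);
  return B.CreateOr(Even, Odd);
}
```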
Updates the tests from https://github.com/llvm/llvm-project/pull/125820.
Follow up of #125168.
This patch adds endian-related macros to `endian.h`. We use compiler
built-ins for the byte-swap functions, which are already available in
our minimum supported compiler versions.
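For illustration, such macros can be built directly on the GCC/Clang builtins that the minimum supported compilers guarantee (names here are illustrative, not necessarily those added to `endian.h`):

```cpp
#include <cstdint>

// Endianness detection via predefined compiler macros.
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#define IS_LITTLE_ENDIAN 1
#else
#define IS_LITTLE_ENDIAN 0
#endif

// Byte swaps map directly onto compiler builtins; no fallback is
// needed once the minimum compiler version guarantees them.
inline uint16_t byteswap16(uint16_t v) { return __builtin_bswap16(v); }
inline uint32_t byteswap32(uint32_t v) { return __builtin_bswap32(v); }
inline uint64_t byteswap64(uint64_t v) { return __builtin_bswap64(v); }
```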
[mlir][vector] Fix typos in tests (nfc)
Fix typos in `{insert|extract}_scalar_from_vec_2d_f32_dynamic_idxs_compile_time_constant` - the intention was to use `f32` rather than `i32`.
When querying the lower and upper bounds of a loop to update the value
range of a loop iteration variable, the program point to depend on
should be the block corresponding to the iteration variable rather than
the loop operation.
Update the error message to be explicit that this is likely due to
memory corruption.
In addition, check if the chunk header is all zero, which could mean
corruption or an attempt to free a pointer after the memory has been
released to the kernel. This case results in a slightly different error
message that also indicates this could still be a double free.
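A rough sketch of the added check, with hypothetical types and reporting hooks (the real allocator's header layout and report machinery differ):

```cpp
#include <cstring>

// Hypothetical chunk-header type; the real allocator's layout differs.
struct ChunkHeader {
  unsigned char Raw[16];
};

// An all-zero header can mean corruption, or a free() after the memory
// containing the chunk was already released back to the kernel; it
// could also still be a double free, so the message mentions both.
inline void checkHeaderOnFree(const ChunkHeader &H,
                              void (*Report)(const char *)) {
  ChunkHeader Zero;
  std::memset(&Zero, 0, sizeof(Zero));
  if (std::memcmp(&H, &Zero, sizeof(H)) == 0)
    Report("invalid chunk state: header is zero, likely memory corruption, "
           "a double free, or a free after the memory was released");
}
```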
The lowering for GEP didn't properly support the case where the pointer
argument was being implicitly broadcast by a vector of indices. Fix
that.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>