llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 23:36:35 +00:00

Author	SHA1	Message	Date
Theo de Magalhaes	76fa9530c9	[clang] add support for -Wpadded on Windows (#130182 ) Implements the -Wpadded warning for --target=x86_64-windows-msvc etc. Fixes #61702 .	2025-04-02 14:46:58 -07:00
Florian Hahn	380defd4b3	[VPlan] Update VPInterleaveRecipe to take debug loc directly as arg (NFC)	2025-04-02 22:46:38 +01:00
Chris B	81601cf3ab	[Docs] Clarify that `reassoc` isn't just for reassociation (#133168 ) The `reassoc` fast-math flag allows a much wider array of algebraic transformations than just strictly reassociations. In some cases it does commutations, distributions, and folds away redundant inverse operations... While it might make sense to fix the flag naming at some point, in the meantime we should at least have the docs be accurate to avoid confusion.	2025-04-02 16:43:10 -05:00
Nirvedh Meshram	42b3f91fd6	[mlir] Vectorize tensor.pad with low padding for unit dims (#133808 ) We currently do not have masked vectorization support for tensor.pad with low padding. However, we can allow this in the special case where the result dimension after padding is a unit dim. The reason is when we actually have a low pad on a unit dim, the input size of that dimension will be (or should be for correct IR) dynamically zero and hence we will create a zero mask which is correct. If the low pad is dynamically zero then the lowering is correct as well. --------- Signed-off-by: Nirvedh <nirvedh@gmail.com>	2025-04-02 16:32:36 -05:00
Valentin Clement (バレンタインクレメン)	db21ae7803	[flang][cuda] Support any_sync and ballot_sync (#134135 )	2025-04-02 14:26:09 -07:00
Brox Chen	066787b9bd	[AMDGPU][True16][CodeGen] fold clamp update for true16 (#128919 ) Check through COPY for possible clamp folding for v_mad_mixhi_f16 isel	2025-04-02 17:10:53 -04:00
Craig Topper	38937ac24c	[RISCV] Check line and column for errors in rv(32/64)zcmp-invalid.s. NFC Same for the Xqccmp version.	2025-04-02 14:00:08 -07:00
Florian Hahn	4b67c53e20	[VPlan] Use recipe debug loc instead of instr DLs in more cases (NFC) Update both VPInterleaveRecipe and VPReplicateRecipe codegen to use debug location directly from the recipe, not the underlying instruction. This removes another dependency on underlying instructions.	2025-04-02 21:51:17 +01:00
vporpo	a1b0b4997e	[SandboxVec][NFC] Replace std::regex with llvm::Regex (#134110 )	2025-04-02 13:46:56 -07:00
George Burgess IV	a8585654c2	[llvm][utils] skip revert-checking reverts across branches (#134108 ) e2ba1b6ffde4ec607342b1b746d1b57f0f04390a references that it reverts a commit that's not a parent of e2ba1b6ffde4ec607342b1b746d1b57f0f04390a. Functionally, this can (and demonstrably does) work(*), but from the standpoint of the revert checker, it's nonsense. Print a `logging.error` when it's detected. Tested by running the revert checker against a commit range that includes the aforementioned commit; the logging.error was fired appropriately. (*) - the specifics here are: - the _SHA_ that was referenced was on a non-main branch, but - the commit from the non-main branch was merged into the non-main branch from main - ...so the _functional_ commit being reverted was originally landed on main, but the _SHA_ referenced from main was from a branch that was cut before the reverted-commit was landed on main	2025-04-02 13:44:18 -07:00
Abhinav Kumar	51d1c72886	[libc] Added support for fixed-points in ``is_signed`` and ``is_unsigned``. (#133371 ) Fixes #133365 ## Changes Done - Changed the signed checking to ```cpp struct is_signed : bool_constant<((is_fixed_point<T> || is_arithmetic_v<T>) && (T(-1) < T(0)))> ``` in ``/libc/src/__support/CPP/type_traits/is_signed.h``. Added check for fixed-points. - But, got to know that this will fail for ``unsigned _Fract`` or any unsigned fixed-point because ``unsigned _Fract`` can’t represent -1 in T(-1), while ``unsigned int`` can handle it via wrapping. - That's why I explicitly added ``is_signed`` check for ``unsigned`` fixed-points. - Same changes to ``/libc/src/__support/CPP/type_traits/is_unsigned.h``. - Added tests for ``is_signed`` and ``is_unsigned``.	2025-04-02 16:41:47 -04:00
Krzysztof Drewniak	554859c736	[TTI] Make isLegalMasked{Load,Store} take an address space (#134006 ) In order to facilitate targets that only support masked loads/stores on certain address spaces (AMDGPU will support them in an upcoming patch, but only for address space 7), add an AddressSpace parameter to isLegalMaskedLoad and isLegalMaskedStore	2025-04-02 15:38:10 -05:00
erichkeane	bb8a7a7349	[OpenACC] Implement 'pqr-list' has at least 1 item. OpenACC Github PR#499 defines the pqr-list as having at least 1 item. We already handle that for all but 'wait', so this patch just does the work to add it for 'wait', plus adds tests.	2025-04-02 13:33:18 -07:00
Andrzej Warzyński	2bee24632f	[mlir][bugfix] Fix erroneous condition in `getEffectsOnResource` (#133638 ) This patch corrects an invalid condition in `getEffectsOnResource` used to identify relevant "resources": ```cpp return it.getResource() != resource; ``` The current implementation assumes that only one instance of each resource will exist, so comparing raw pointers is both safe and sufficient. This assumption stems from constructs like: ```cpp static DerivedResource *get() { static DerivedResource instance; return &instance; } ``` i.e., resource instances returned via static singleton methods. However, as discussed in https://github.com/llvm/llvm-project/issues/129216, this assumption breaks in practice — notably on macOS (Apple Silicon) when built with: * `-DBUILD_SHARED_LIBS=On`. In such cases, multiple instances of the same logical resource may exist across shared library boundaries, leading to incorrect behavior and causing failures in tests like: * test/Dialect/Transform/check-use-after-free.mlir This patch replaces the pointer comparison with a comparison based on resource identity: ```cpp return it.getResource()->getResourceID() != resource->getResourceID(); ``` This approach aligns better with the intent of `getEffectsOnResource`, which is to: ```cpp /// Collect all of the effect instances that operate on the provided /// resource (...) ``` Fixes #129216	2025-04-02 21:26:41 +01:00
Luke Lau	df9e5ae5b4	[InstCombine] Match scalable splats in m_ImmConstant (#132522 ) #118806 fixed an infinite loop in FoldShiftByConstant that could occur when the shift amount was a ConstantExpr. However this meant that FoldShiftByConstant no longer kicked in for scalable vectors because scalable splats are represented by ConstantExprs. This fixes it by allowing scalable splats of non-ConstantExprs in m_ImmConstant, which also fixes a few other test cases where scalable splats were being missed. But I'm also hoping that UseConstantIntForScalableSplat will eventually remove the need for this. I noticed this when trying to reverse a combine on RISC-V in #132245, and saw that the resulting vector and scalar forms were different. --------- Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>	2025-04-02 21:21:52 +01:00
Krzysztof Parzyszek	564e04b703	[flang][OpenMP] Use function symbol on DECLARE TARGET (#134107 ) Consider: ``` function foo() !$omp declare target(foo) ! This `foo` was a function-result symbol ... end ``` When resolving symbols, for this case use the symbol corresponding to the function instead of the symbol corresponding to the function result. Currently, this will result in an error: ``` error: A variable that appears in a DECLARE TARGET directive must be declared in the scope of a module or have the SAVE attribute, either explicitly or implicitly ```	2025-04-02 15:16:33 -05:00
David Peixotto	2026873fb8	Add enable/disable api for SystemRuntime plugins (#133794 ) This commit adds support for enabling and disabling plugins by name. The changes are made generically in the `PluginInstances` class, but currently we only expose the ability for SystemRuntime plugins. Other plugin types can be added easily. We had a few design goals for how disabled plugins should work: 1. Plugins that are disabled should still be visible to the system. This allows us to dynamically enable and disable plugins and report their state to the user. 2. Plugin order should be stable across disable and enable changes. We want to avoid changing the order of plugin lookup. When a plugin is re-enabled it should return to its original slot in the creation order. 3. Disabled plugins should not appear in PluginManager operations. Clients should be able to assume that only enabled plugins will be returned from the PluginManager. For the implementation we modify the plugin instance to maintain a bool of its enabled state. Existing clients external to the Instances class expect to iterate over only enabled instances, so we skip over disabled instances in the query and snapshot APIs. This way the client does not have to manually check which instances are enabled.	2025-04-02 13:15:31 -07:00
Nikolas Klauser	c59d3a2684	[libc++] Add visibility annotations to the std namespace with GCC (#133233 ) This allows us to remove the need for `_LIBCPP_TEMPLATE_VIS` and fixes a bunch of missing annotations for RTTI when used across dylib boundaries. `_LIBCPP_TEMPLATE_VIS` itself will be removed in a separate patch, since it touches a lot of code. This patch is a no-op for Clang. Only GCC is affected.	2025-04-02 22:12:59 +02:00
Brox Chen	fb0e7b5f16	[AMDGPU][True16][CodeGen] Implement sgpr folding in true16 (#128929 ) We haven't implemented 16 bit SGPRs. Currently allow 32-bit SGPRs to be folded into True16 bit instructions taking 16 bit values. Also use sgpr_32 when Imm is copied to sgpr_lo16 so it could be further folded. This improves generated code quality.	2025-04-02 16:08:26 -04:00
ofri frishman	6f1347d57b	[MLIR] Bubble up tensor.extract_slice through tensor.collapse_shape (#131982 ) Add a pattern that bubbles up tensor.extract_slice through tensor.collapse_shape. The pattern is registered in a pattern population function that is used by the transform op transform.apply_patterns.tensor.bubble_up_extract_slice and by the transform op transform.structured.fuse as a cleanup pattern. This pattern enables tiling and fusing op chains which contain tensor.collapse_shape if added as a cleanup pattern of tile and fuse utility. Without this pattern that would not be possible, as tensor.collapse_shape does not implement the tiling interface. This is an additional pattern to the one added in PR #126898	2025-04-02 21:06:43 +01:00
Jonas Devlieghere	c87dc2b7d4	[lldb-dap] Speed up TestDAP_Progress (#134048 ) While trying to make progress on #133782, I noticed that TestDAP_Progress was taking 90 seconds to complete. This patch brings that down to 10 seconds by making the following changes: 1. Don't call `wait_for_event` with a 15 second timeout. By the time we call this, all progress events have been emitted, which means that we're just sitting there until we hit the timeout. 2. Don't use 10 steps (= 10 seconds) for indeterminate progress. We have two indeterminate progress tests so that's 6 seconds instead of 20. 3. Don't launch the process over and over. Once we have a dap session, we can clear the progress vector and emit new progress events.	2025-04-02 12:41:47 -07:00
Florian Hahn	3bdf9a0880	[EquivalenceClasses] Use SmallVector for deterministic iteration order. (#134075 ) Currently iterators over EquivalenceClasses will iterate over std::set, which guarantees the order specified by the comparator. Unfortunately in many cases, EquivalenceClasses are used with pointers, so iterating over std::set of pointers will not be deterministic across runs. There are multiple places that explicitly try to sort the equivalence classes before using them to try to get a deterministic order (LowerTypeTests, SplitModule), but there are others that do not at the moment and this can result at least in non-deterministic value naming in Float2Int. This patch updates EquivalenceClasses to keep track of all members via an extra SmallVector and removes code from LowerTypeTests and SplitModule to sort the classes before processing. Overall it looks like compile-time slightly decreases in most cases, but close to noise: https://llvm-compile-time-tracker.com/compare.php?from=7d441d9892295a6eb8aaf481e1715f039f6f224f&to=b0c2ac67a88d3ef86987e2f82115ea0170675a17&stat=instructions PR: https://github.com/llvm/llvm-project/pull/134075	2025-04-02 20:27:43 +01:00
Sarah Spall	60efed3f20	[HLSL] Update __builtin_hlsl_dot builtin Sema Checking to fix error when passed an array literal 1u.xxxx (#133941 ) update dot builtin sema checking and codegen new test fix tests Closes #133659	2025-04-02 12:27:01 -07:00
Alexey Bataev	843ef77dc2	[SLP]Update mapping between values and their matching entries upon selection Need to update the mapping between gathered values and their matching entries, if the list of the entries is updated and only some of them are selected for final shuffling. Fixes #134085	2025-04-02 11:59:32 -07:00
Amy Huang	f475ccd379	Fix to the libc BUILD.bazel file after changing atan_utils.h deps. (#134128 ) Additional fix for libc BUILD.bazel after commit 8741412 (#133980) This seems to match libc/src/math/generic/CMakeLists.txt.	2025-04-02 11:05:36 -07:00
Arvind Sudarsanam	32dff27060	[clang-sycl-linker] Fix flaky failure and add REQUIRES (Try #2 ) (#134130 ) This should fix failures caused by https://github.com/llvm/llvm-project/pull/133967 Attn: @sarnex Thanks Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>	2025-04-02 18:05:17 +00:00
Juan Manuel Martinez Caamaño	beae0e9f1a	[AMDGPU] Use a target feature to enable __builtin_amdgcn_global_load_lds on gfx9/10 (#133055 ) This patch introduces the `vmem-to-lds-load-insts` target feature, which can be used to enable builtins `__builtin_amdgcn_global_load_lds` and `__builtin_amdgcn_raw_ptr_buffer_load_lds` on platforms which have this feature. This feature is only available on gfx9/10. A limitation of using a common target feature for both builtins is that we could have made `__builtin_amdgcn_raw_ptr_buffer_load_lds` available on gfx6,7,8.	2025-04-02 20:00:09 +02:00
Juan Manuel Martinez Caamaño	0375ef07c3	[Clang][AMDGPU] Add __builtin_amdgcn_cvt_off_f32_i4 (#133741 ) This built-in maps to `V_CVT_OFF_F32_I4` which treats its input as a 4-bit signed integer and returns `0.0625f * src`. SWDEV-518861	2025-04-02 19:51:40 +02:00
Nick Sarnie	540dd89778	Revert "[clang-sycl-linker] Fix flaky failure and add REQUIRES" (#134127 ) Reverts llvm/llvm-project#134125	2025-04-02 17:44:23 +00:00
Jorge Gorbe Moya	a57023b6a0	Add missing include for llvm::Error after 74ec038ffb34575ee93fa313cd0ea0db0c0a7e0a	2025-04-02 10:43:05 -07:00
Arvind Sudarsanam	4688719cf5	[clang-sycl-linker] Fix flaky failure and add REQUIRES (#134125 ) This should fix failures caused by https://github.com/llvm/llvm-project/pull/133967 Attn: @sarnex Thanks Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>	2025-04-02 17:38:11 +00:00
Adrian Prantl	a3ac318e5f	[lldb] Skip test with older version of clang	2025-04-02 10:31:44 -07:00
Kazu Hirata	aa33c09561	[flang] Fix a warning This patch fixes: flang/lib/Optimizer/OpenMP/DoConcurrentConversion.cpp:184:18: error: unused variable 'loc' [-Werror,-Wunused-variable]	2025-04-02 10:14:50 -07:00
Snehasish Kumar	c18994c7cd	[Metadata] Preserve MD_prof when merging instructions when one is missing. (#132433 ) Preserve branch weight metadata when merging instructions if one of the instructions is missing metadata. This is similar in behaviour to what we do today for other types of metadata such as mmra, memprof and callsite metadata.	2025-04-02 11:13:45 -06:00
Snehasish Kumar	dde0be9d97	[Metadata] Handle memprof, callsite merging when one is missing. (#132106 ) For memprof and callsite metadata we want to pick one deterministically and keep that even if one of them may be missing.	2025-04-02 11:10:02 -06:00
erichkeane	d7724c8ea3	[OpenACC] allow 'if' clause on 'atomic' construct This was added in OpenACC PR #511 in the 3.4 branch. From an AST/Sema perspective this is pretty trivial as the infrastructure for 'if' already exists, however the atomic construct needed to be taught to take clauses. This patch does that and adds some testing to do so.	2025-04-02 10:03:24 -07:00
Fangrui Song	b6e2df54c4	[MC] Move some member variables from AsmParser to MCAsmParser to eliminate some virtual functions and avoid duplication between AsmParser/MasmParser.	2025-04-02 09:59:18 -07:00
Luke Lau	711b15d179	[RISCV] Mark subvector extracts from index 0 as cheap (#134101 ) Previously we only marked fixed length vector extracts as cheap, so this extends it to any extract at index 0 which should just be a subreg extract. This allows extracts of i1 vectors to be considered for DAG combines, but also scalable vectors too. This causes some slight improvements with large legalized fixed-length vectors, but the underlying motivation for this is to actually prevent an unprofitable DAG combine on a scalable vector in an upcoming patch.	2025-04-02 17:57:13 +01:00
Ryan Buchner	fa2a6d68c6	[CodeGenPrepare][RISCV] Combine (X ^ Y) and (X == Y) where appropriate (#130922 ) Fixes #130510. In RISCV, modify the folding of (X ^ Y == 0) -> (X == Y) to account for cases where the (X ^ Y) will be re-used. If a constant is being used for the XOR before a branch, ensure that it is small enough to fit within a 12-bit immediate field. Otherwise, the equality check is more efficient than the check against 0, see the following: ``` # %bb.0: lui a1, 5 addiw a1, a1, 1365 xor a0, a0, a1 beqz a0, .LBB0_2 # %bb.1: ret .LBB0_2: ``` ``` # %bb.0: lui a1, 5 addiw a1, a1, 1365 beq a0, a1, .LBB0_2 # %bb.1: xor a0, a0, a1 ret .LBB0_2: ``` Similarly, if the XOR is between 1 and a size one integer, we should still fold away the XOR since that comparison can be optimized as a comparison against 0. ``` # %bb.0: slt a0, a0, a1 xor a0, a0, 1 beqz a0, .LBB0_2 # %bb.1: ret .LBB0_2: ``` ``` # %bb.0: slt a0, a0, a1 bnez a0, .LBB0_2 # %bb.1: xor a0, a0, 1 ret .LBB0_2: ``` One question about my code is that I used a hard-coded value for the width of a RISCV ALU immediate. Do you know of a way that I can gather this from the `context`, I was unable to devise one.	2025-04-02 09:56:09 -07:00
Nikita Popov	74ec038ffb	[OMPIRBuilder] Don't include MemorySSAUpdater.h (NFC) This header does not use MemorySSA in any way -- it was just using a typedef from it. Write out the type instead.	2025-04-02 18:48:51 +02:00
AdityaK	340f06a8d4	Fix: bail out when divisor is zero (#133518 ) Fixes: #131279	2025-04-02 09:44:18 -07:00
Matt Arsenault	efca37fda5	llvm-reduce: Change exit code for uninteresting inputs (#134021 ) This makes it easier to reduce llvm-reduce with llvm-reduce to filter cases where the input reduced too much. Not sure if it's possible to test the exit code in lit.	2025-04-02 23:43:06 +07:00
Fraser Cormack	ddc48fefe3	[libclc] Move native_(exp10|powr|tan) to CLC library (#134080 ) These are the three remaining native builtins not yet ported. There are elementwise versions of exp10 and tan which correspond to the intrinsics, which may be preferable to the current versions which route through other native builtins. Those could be changed in a follow-up if desired.	2025-04-02 17:37:17 +01:00
Arvind Sudarsanam	4a4d41e723	[SYCL][SPIR-V Backend][clang-sycl-linker] Add SPIR-V backend support inside clang-sycl-linker (#133967 ) This PR does the following: 1. Use SPIR-V backend to do LLVM to SPIR-V translation inside clang-sycl-linker 2. Remove llvm-spirv translator from clang-sycl-linker Currently, no SPIR-V extensions are enabled for SYCL compilation flow. This will be updated in subsequent commits. Thanks Note: This is one of the many PRs being introduced to add SYCL programming model support to LLVM ([RFC](https://discourse.llvm.org/t/rfc-add-sycl-programming-model-support/50812)). --------- Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>	2025-04-02 16:29:41 +00:00
Fehr Mathieu	8b67f36258	[mlir] [arith] Fix ceildivsi lowering in arith-expand (#133774 ) This fixes the current lowering of `arith.ceildivsi` in the arith-expand pass, which was previously incorrect. The new version is based on the lowering of `arith.floordivsi`, and will not introduce new undefined behavior or poison during the lowering. It also replaces one division with a multiplication. The previous lowering of `ceildivsi(n, m)` was the following: ``` x = (m > 0) ? -1 : 1 (n*m>0) ? ((n+x) / m) + 1 : - (-n / m) ``` This caused two problems: * In the case where `n` is INT_MIN and `m` is positive, the result would be poison instead of an actual value * In the case where `n` is INT_MAX and `m` is `-1`, this would trigger undefined behavior, while the original code wouldn't. This is because `n+x` would be equal to `INT_MIN` (`INT_MAX + 1`), so the `(n+x) / m` division would overflow and trigger UB.	2025-04-02 17:26:58 +01:00
Daniel Thornburgh	e84b57dfbf	[LLD][ELF] Support OVERLAY NOCROSSREFS (#133807 ) This allows NOCROSSREFS to be specified in OVERLAY linker script descriptions. This is a particularly useful part of the OVERLAY syntax, since it's very rarely possible for one overlay section to sensibly reference another. Closes #128790	2025-04-02 09:25:18 -07:00
Jonas Devlieghere	6f959a46c0	[lldb] uint8_t -> unsigned short in std::independent_bits_engine According to [1], the template parameter must be cv-unqualified and one of unsigned short, unsigned int, unsigned long, or unsigned long long. Should fix the following MSVC error: error: static assertion failed due to requirement '_Is_any_of_v<unsigned char, unsigned short, unsigned int, unsigned long, unsigned long long>': invalid template argument for independent_bits_engine: N4659 [1] https://en.cppreference.com/w/cpp/numeric/random/independent_bits_engine	2025-04-02 09:23:35 -07:00
Ramkumar Ramachandra	00122bb56b	[LV] Regen a test with UTC (#134076 )	2025-04-02 17:23:00 +01:00
Ramkumar Ramachandra	f7591ee161	[LV] Exercise type-mismatch with RT-check conflict rdx (#130295 ) The test suite of LoopVectorize suffers from a coverage hole when types mismatch, and runtime checks are needed, with a conflict redux. Fix this coverage hole by adding tests.	2025-04-02 17:22:40 +01:00
Shilei Tian	84cb08e118	[MLIR][AMDGPU] Bump to COV6 (#133849 ) We already bump to COV6 by default in the front-end and backend. This PR is for MLIR. Note that COV6 requires ROCm 6.3+.	2025-04-02 12:14:24 -04:00
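
The `arith.ceildivsi` entry above (8b67f36258) describes how the previous lowering could overflow. The following is a minimal Python sketch of that reasoning, not code from the patch. It models 32-bit signed values with Python's unbounded integers and flags inputs where the old formula's `n+x` or `-n` intermediates leave the 32-bit range. `ceildiv_via_trunc` is one overflow-free formulation assumed here for illustration; it may differ from the lowering that actually landed, and all helper names are hypothetical.

```python
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def fits_i32(v):
    """True if v is representable as a 32-bit signed integer."""
    return INT_MIN <= v <= INT_MAX


def trunc_div(a, b):
    """Signed division rounding toward zero (Python's // rounds down instead)."""
    q = abs(a) // abs(b)
    return q if (a < 0) == (b < 0) else -q


def old_intermediate_overflows(n, m):
    """Check the old formula's n+x and -n intermediates against the i32 range.

    Old formula, from the commit message:
        x = (m > 0) ? -1 : 1
        (n*m > 0) ? ((n+x) / m) + 1 : -(-n / m)
    Both arms are inspected here regardless of which one would be selected.
    """
    x = -1 if m > 0 else 1
    return not (fits_i32(n + x) and fits_i32(-n))


def ceildiv_via_trunc(n, m):
    """Assumed alternative: truncating quotient, plus one when the division
    is inexact and the operands have the same sign."""
    z = trunc_div(n, m)
    not_exact = z * m != n
    same_sign = (n < 0) == (m < 0)
    return z + 1 if not_exact and same_sign else z


if __name__ == "__main__":
    for n, m in [(7, 2), (-7, 2), (INT_MAX, -1), (INT_MIN, 1)]:
        print(n, m, ceildiv_via_trunc(n, m), old_intermediate_overflows(n, m))
```

In this model the two problem inputs named in the commit message (`INT_MAX / -1` and `INT_MIN` with a positive divisor) are flagged, while ordinary inputs such as `7 / 2` are not.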


532831 Commits