llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 01:36:05 +00:00

Author	SHA1	Message	Date
Aaron Ballman	9cf46fb230	[C2y] Add octal prefixes, deprecate unprefixed octals (#131626 ) WG14 N3353 added support for 0o and 0O as octal literal prefixes. It also deprecates use of octal literals without a prefix, except for the literal 0. This feature is being exposed as an extension in older C language modes as well as in all C++ language modes.	2025-03-18 07:28:59 -04:00
Simon Pilgrim	31e98c7037	[CostModel][X86] merge abs costs tests using -cost-kind=all (#131619 ) Now that we have #130490 - merge the cost test files to avoid bitrot Lots more set of files to do - but this is give an example	2025-03-18 11:19:05 +00:00
quic_hchandel	5d53a88416	[RISCV] Change RISCVMCExpr::VK_RISCV_None to RISCVMCExpr::VK_None (#131774 ) Fix RISCVMCExpr::VK_RISCV_None which were added in #130779	2025-03-18 16:43:42 +05:30
Diana Picus	0a21ef9536	[AMDGPU] Add SubtargetFeature for dynamic VGPR mode (#130030 ) This represents a hardware mode supported only for wave32 compute shaders. When enabled, we set the `.dynamic_vgpr_en` field of `.compute_registers` to true in the PAL metadata. This will be changed to use an attribute after downstream consumers have been migrated.	2025-03-18 11:48:01 +01:00
Balazs Benics	5865807421	Reapply "[analyzer] Delay the checker constructions after parsing" (#128369 ) Reapply "[analyzer] Delay the checker constructions after parsing" (#128350) This reverts commit db836edf47f36ed04cab919a7a2c4414f4d0d7e6, as-is. Depends on #128368	2025-03-18 11:40:39 +01:00
Matt Arsenault	c180fc80dc	AMDGPU: Replace unused permlane inputs with poison instead of undef (#131288 )	2025-03-18 17:37:44 +07:00
Matt Arsenault	052eca9ff7	AMDGPU: Replace unused update.dpp inputs with poison instead of undef (#131287 )	2025-03-18 17:33:58 +07:00
Matt Arsenault	8392573469	AMDGPU: Replace unused export inputs with poison instead of undef (#131286 )	2025-03-18 17:30:42 +07:00
Matt Arsenault	c5fe075eaf	AMDGPU: Use freeze poison instead of undef in alloca promotion (#131285 ) Previously the value created to represent the uninitialized memory of the alloca was undef. Use freeze poison instead. Enables some optimization improvements (which need defeating in the limit tests), but also a few regressions. Seems to leave behind dead code in some cases too.	2025-03-18 17:27:02 +07:00
cor3ntin	f7716047c6	[Clang][NFC] Cleanup UnaryExprOrTypeTraitExpr itanium mangling code (#131764 ) Just removing some code duplication. Extracted from #131515	2025-03-18 11:26:48 +01:00
Benjamin Maxwell	f406b28f8b	[AArch64][SVE] Fold integer lane extract and store to FPR store (#129756 ) This helps avoid pointless fmovs to GPRs, which may be slow, especially in streaming mode.	2025-03-18 10:10:23 +00:00
Valery Pykhtin	4ad0aa73b7	[SSAUpdaterBulk] Add expectedly failing loop tests. (#131761 ) These tests demonstrate the issue in SSAUpdaterBulk when it calculates incoming values from loop back edges. The failures are marked with `EXPECT_NONFATAL_FAILURE`, which is the way to designate an "expected fail" in the Google Test suite.	2025-03-18 10:55:56 +01:00
Kareem Ergawy	1094ffcafb	[flang][fir] Add MLIR op for `do concurrent` (#130893 ) Adds new MLIR ops to model `do concurrent`. In order to make `do concurrent` representation self-contained, a loop is modeled using 2 ops, one wrapper and one that contains the actual body of the loop. For example, a 2D `do concurrent` loop is modeled as follows: ```mlir fir.do_concurrent { %i = fir.alloca i32 %j = fir.alloca i32 fir.do_concurrent.loop (%i_iv, %j_iv) = (%i_lb, %j_lb) to (%i_ub, %j_ub) step (%i_st, %j_st) { %0 = fir.convert %i_iv : (index) -> i32 fir.store %0 to %i : !fir.ref<i32> %1 = fir.convert %j_iv : (index) -> i32 fir.store %1 to %j : !fir.ref<i32> } } ``` The `fir.do_concurrent` wrapper op encapsulates both the actual loop and the allocations required for the iteration variables. The `fir.do_concurrent.loop` op is a multi-dimensional op that contains the loop control and body. See the ops' docs for more info.	2025-03-18 10:53:44 +01:00
quic_hchandel	036c6cb37c	[RISCV] Add Qualcomm uC Xqcibi (Branch Immediate) extension (#130779 ) This extension adds twelve conditional branch instructions that use an immediate operand for the source. The current spec can be found at: https://github.com/quic/riscv-unified-db/releases/tag/Xqci-0.7.0 This patch adds assembler only support. Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>	2025-03-18 15:18:43 +05:30
David Sherwood	194eceff43	update_test_checks: add new --filter-out-after option (#129739 ) Whilst trying to clean up some loop vectoriser IR tests (see test/Transforms/LoopVectorize/AArch64/partial-reduce-chained.ll for example) a reviewer on PR #129047 suggested it would be nice to have an option to stop generating CHECK lines after a certain point. Typically when performing a transformation with the loop vectoriser we don't usually care about any CHECK lines generated for the scalar tail of the loop, since the scalar loop is kept intact. Previously if you wanted to eliminate such unwanted CHECK lines you had to run the update script, then manually delete all the lines corresponding to the scalar loop. This can be very time consuming if the tests ever need changing. What I've tried to do here is add a new --filter-out-after option alongside the existing --filter* options that provides support for stopping the generation of any CHECK lines beyond the line that matches the filter. With the existing filter options we never generate CHECK-NEXT lines, but we still care about ordering with --filter-out-after so I've amended the code to ensure we treat this filter differently.	2025-03-18 09:46:43 +00:00
Srinivasa Ravi	c42952a782	[MLIR][NVVM] Add support for match.sync Op (#130718 ) This change adds the `match.sync` Op to the MLIR NVVM dialect to generate the `match.sync` PTX instruction. PTX Spec Reference: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#parallel-synchronization-and-communication-instructions-match-sync	2025-03-18 14:54:24 +05:30
Kareem Ergawy	49b8d8472f	[OpenMP][MLIR] Support LLVM translation for `distribute` with delayed privatization (#131564 ) Adds support for tranlating delayed privatization (`private` and `firstprivate`) for `omp.distribute` ops.	2025-03-18 10:14:42 +01:00
Lucas Duarte Prates	44e4b27aec	[clang] Fix darwin-related tests' REQUIRES annotation (#130138 ) The tests updated by this commit were designed to check features in the clang's driver and index that require clang to be targgeting a darwin platform while running on a darwin host. For that, their execution is currently gated by the `REQUIRES: system-darwin` annotation. This approach becomes a problem when trying to run such tests on a cross-compiling build of clang on a darwin platform. When the default target is not darwin (e.g. via `LLVM_DEFAULT_TARGET_TRIPLE `), the tests will still run on a darwin host and fail spuriously because of the mismatch with the target detection. To fix this issue, this patch introduces an extra condition to the tests' REQUIRES annotation, `target={{.}}-{{darwin\|macos}}{{.}}`, ensuring they only run when the relevant target is present.	2025-03-18 09:11:43 +00:00
David Green	bd1be8a242	[CodeGen][GlobalISel] Add a getVectorIdxWidth and getVectorIdxLLT. (#131526 ) From #106446, this adds a variant of getVectorIdxTy that returns an LLT. Many uses only look at the width, so a getVectorIdxWidth was added as the common base.	2025-03-18 08:31:11 +00:00
Matthias Springer	e614e840bc	[mlir][memref] Add runtime verification for `memref.dim` (#130410 ) Add runtime verification for `memref.dim`: check that the index is in bounds. Also simplify the pass pipeline for all memref runtime verification checks.	2025-03-18 09:10:49 +01:00
Mel Chen	489d1e764e	[LV][NFC] Pre-commit test for supporting strided accesses. (#130563 ) Duplicate riscv-vector-reverse.ll as riscv-vector-reverse-output.ll to verify all generated IR, not just debug output. Pre-commit for #128718.	2025-03-18 16:08:42 +08:00
lorenzo chelini	57dc71352c	[MLIR][Bufferization] Retire `enforce-aliasing-invariants` (#130929 ) Why? This option can lead to incorrect IR if used in isolation, for example, consider the IR below: ```mlir func.func @loop_with_aliasing(%arg0: tensor<5xf32>, %arg1: index, %arg2: index) -> tensor<5xf32> { %c1 = arith.constant 1 : index %cst = arith.constant 1.000000e+00 : f32 %0 = tensor.empty() : tensor<5xf32> %1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<5xf32>) -> tensor<5xf32> // The BufferizableOpInterface says that %2 alias with %arg0 or be a newly // allocated buffer %2 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (tensor<5xf32>) { scf.yield %1 : tensor<5xf32> } %cst_0 = arith.constant 1.000000e+00 : f32 %inserted = tensor.insert %cst_0 into %1[%c1] : tensor<5xf32> return %2 : tensor<5xf32> } ``` If we bufferize with: enforce-aliasing-invariants=false, we get: ``` func.func @loop_with_aliasing(%arg0: memref<5xf32, strided<[?], offset: ?>>, %arg1: index, %arg2: index) -> memref<5xf32, strided<[?], offset: ?>> { %c1 = arith.constant 1 : index %cst = arith.constant 1.000000e+00 : f32 %alloc = memref.alloc() {alignment = 64 : i64} : memref<5xf32> linalg.fill ins(%cst : f32) outs(%alloc : memref<5xf32>) %0 = scf.for %arg3 = %arg1 to %arg2 step %c1 iter_args(%arg4 = %arg0) -> (memref<5xf32, strided<[?], offset: ?>>) { %cast = memref.cast %alloc : memref<5xf32> to memref<5xf32, strided<[?], offset: ?>> scf.yield %cast : memref<5xf32, strided<[?], offset: ?>> } %cst_0 = arith.constant 1.000000e+00 : f32 memref.store %cst_0, %alloc[%c1] : memref<5xf32> return %0 : memref<5xf32, strided<[?], offset: ?>> } ``` Which is not correct IR since the loop yields the allocation. I am using this option. What do I need to do now? If you are using this option in isolation, you are possibly generating incorrect IR, so you need to revisit your bufferization strategy. If you are using it together with `copyBeforeWrite,` you simply need to retire the `enforceAliasingInvariants` option. Co-authored-by: Matthias Springer <mspringer@nvidia.com>	2025-03-18 08:42:43 +01:00
Kazu Hirata	58dd3eda4e	[Utils] Avoid repeated hash lookups (NFC) (#131723 )	2025-03-18 00:27:23 -07:00
Kazu Hirata	62204482c0	[CodeGen] Avoid repeated hash lookups (NFC) (#131722 )	2025-03-18 00:26:59 -07:00
Kazu Hirata	f6ad65a824	[ADT] Add SmallPtrSet::insert_range (#131716 ) This pach adds SmallPtrSet::insert_range for consistency with DenseSet::insert_range and std::set::insert_range from C++23.	2025-03-18 00:21:07 -07:00
Kazu Hirata	2df0254828	[ADT] Add SmallSet::insert_range (#131717 ) This patch adds SmallSet::insert_range for consistency with DenseSet::insert_range and std::set::insert_range from C++23.	2025-03-18 00:20:15 -07:00
Kazu Hirata	fc38982e93	[ADT] Add SetVector::insert_range (#131715 ) This patch adds SetVector::insert_range for consistency with DenseSet::insert_range and std::set::insert_range from C++23.	2025-03-18 00:19:48 -07:00
Petr Hosek	1fbfef9b8a	[libc] Templatize the scanf Reader interface (#131037 ) This allows specializing the implementation for different targets without including unnecessary logic and is similar to #111559 which did the same for printf Writer interface.	2025-03-17 23:51:24 -07:00
Ryosuke Niwa	4781941160	[alpha.webkit.UncountedCallArgsChecker] os_log functions should be treated as safe. (#131500 ) …os_log functions should be treated as safe in call arguments checkers. Also treat __builtin_* functions and __libcpp_verbose_abort functions as "trivial" for the purpose in call argument checkers.	2025-03-17 23:47:10 -07:00
Timm Baeder	2f808dd070	[clang][bytecode] Compile most recent function decl (#131730 ) We used to always do this because all calls went through the code path that calls getMostRecentDecl(). Do it now, too.	2025-03-18 07:29:38 +01:00
Akshat Oke	6be6400848	[LiveDebugValues][NFC] Remove TargetPassConfig from LDVImpl (#131562 ) TPC is only used to access the option `ShouldEmitDebugEntryValues`.	2025-03-18 11:04:54 +05:30
Sushant Gokhale	0f34eba48a	[NFC][AArch64] test for fixed-width vector signed division with pow2-divisor and SVE enabled (#130252 ) With SVE enabled, this should generate asrd instruction. Subsequent patch will address this.	2025-03-17 22:31:22 -07:00
Vikash Gupta	bdb63208b4	[AMDGPU][CodeGen] Using MBB's liveIn check in tandem with MCRegAliasIterator in SILowerSGPRSpills (#129848 ) This patch replaces use of MachineRegisterInfo's liveIn check with the machine basicBlock's liveIn. As the MRI's liveIn is inconsistent with the entry MBB liveIns, when it comes to the machine verifier checks. PS: Its an alternative solution with respect to #126926.	2025-03-18 10:51:07 +05:30
Craig Topper	0813c5cf5f	[RISCV] Accept '0(reg)' in addition to '(reg)' for vle1.v/vse1.v	2025-03-17 20:51:16 -07:00
Fangrui Song	b9d27ac252	[MC] Fix formatting of a comment	2025-03-17 20:24:08 -07:00
Kazu Hirata	c72f7958b0	[BOLT] Fix the build This is a follow-up for: commit 3c4b9317916ccd2e18c30b1540589518a4c7c88a Author: Fangrui Song <i@maskray.me> Date: Mon Mar 17 20:05:28 2025 -0700	2025-03-17 20:18:34 -07:00
Fangrui Song	e758237352	[docs] Mention --discard-locals/--discard-all change for llvm-strip PR #130704 updated llvm-strip as well. Suggested by @nga888 Pull Request: https://github.com/llvm/llvm-project/pull/131491	2025-03-17 20:09:52 -07:00
Fangrui Song	3c4b931791	Rename RISCVMCExpr::VK_RISCV_ to VK_. NFC They implement relocation operators and are named VK_RISCV_ probably to avoid confusion with `MCSymbolRefExpr::VariantKind`. `MCSymbolRefExpr::VariantKind` is discouraged (https://discourse.llvm.org/t/error-expected-relocatable-expression-with-mctargetexpr/84926/2) and targets are migrating away from `MCSymbolRefExpr::VariantKind`. Therefore, there is no need to make the name long in the presence of the clear `RISCVMCExpr::` prefix. Pull Request: https://github.com/llvm/llvm-project/pull/131489	2025-03-17 20:05:28 -07:00
Louis Dionne	297f6d9f6b	[libc++] Fix check for _LIBCPP_HAS_NO_WIDE_CHARACTERS in features.py (#131675 ) The patch that added the new locale Lit features was created before we switched to a 0-1 macro for _LIBCPP_HAS_WIDE_CHARACTERS, leading to that patch referring to the obsolete _LIBCPP_HAS_NO_WIDE_CHARACTERS macro that is never defined nowadays.	2025-03-17 22:13:51 -04:00
Elvis Wang	ed19620b8c	[VPlan] Make VPReductionRecipe a VPRecipeWithIRFlags. NFC (#130881 ) This patch change the parent of the VPReductionRecipe from VPSingleDefRecipe to VPRecipeWithIRFlags and also print/get/drop/control flags by the VPRecipeWithIRFlags. This will remove the dependency of the underlying instruction. This patch also add a new function `setFastMathFlags()` to the VPRecipeWithIRFlags because the entire reduction chain may contains multiple instructions. And the underlying instruction may not contains the corresponding flags for this reduction. Split from #113903.	2025-03-18 10:08:23 +08:00
Jim Lin	00cad3ed22	[SDAG] Handle extract_subvector in isKnownNeverNaN (#131581 ) Propagate nnan across extract_subvector.	2025-03-18 09:37:16 +08:00
Tim Gymnich	a5107be031	[NFC][AMDGPU][GlobalISel] Make LLTs constexpr (#131673 ) - static const -> constexpr	2025-03-18 08:30:17 +07:00
Longsheng Mou	4cb1430c1c	[mlir][spirv] Fix a crash in `spirv::ISubOp::fold` (#131570 ) This PR fixes a crash if `spirv.ISub` is not integer type. Fixes #131283.	2025-03-18 09:18:49 +08:00
Nikolay Panchenko	745e16753f	[JSON][NFC] Move print method out of NDEBUG \|\| DUMP (#131639 )	2025-03-17 21:12:07 -04:00
William Moses	d9c65af626	[MLIR][GPUToNVVM] Support 32-bit isfinite (#131699 ) Co-authored-by: Ivan Radanov Ivanov <ivanov.i.aa@m.titech.ac.jp>	2025-03-18 02:11:38 +01:00
Alexander Kornienko	d1156fcb56	Revert "[libc++] Optimize num_put integral functions" (#131613 ) Reverts llvm/llvm-project#120859 This change breaks formatting of `0` with `std::showbase` + `std::hex` or `std::oct`, as well as `+0` with `std::showpos`. I believe the new behavior is violating the standard. See https://github.com/llvm/llvm-project/pull/120859#issuecomment-2723970242 and later comments for details and explanation.	2025-03-18 02:05:51 +01:00
Craig Topper	50f8adb5c0	[RISCV] Accept '0(reg)' in addition to '(reg)' ifor vl1r.v/vl2r.v/vl4r.v/vl8r.v This matches vl1re8.v, vl2re8.v, vl4re8.v, vl8re8.v.	2025-03-17 17:39:56 -07:00
Cyndy Ishida	cb1d640b03	[clang][DepScan] resolve dangling reference to lambda that goes out of scope. Fixes buildbots.	2025-03-17 17:33:58 -07:00
Farzon Lotfi	a2fbc9a8e3	[DirectX] Start the creation of a DXIL Instruction legalizer (#131221 ) - Legalize i8 truncation back to original types - remove sext and truncs - Legalize i64 indicies for insert\extract elements to i32 indicies - fixes https://github.com/llvm/llvm-project/issues/126323 - fixes https://github.com/llvm/llvm-project/issues/129757	2025-03-17 20:33:02 -04:00
Matt Arsenault	092e25571c	AMDGPU: Add REQUIRES: asserts to machine pass violation test We should promote this to a proper error and not llvm_unreachable	2025-03-18 07:31:50 +07:00

... 4 5 6 7 8 ...

531239 Commits