llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 18:16:35 +00:00

Author	SHA1	Message	Date
Alexander Yermolovich	4f902d2425	[llvm-dwarfdump] Make --verify for .debug_names multithreaded. (#127281 ) This PR makes verification of .debug_names acceleration table multithreaded. In local testing it improves verification of clang .debug_names from four minutes to under a minute. This PR relies on a current mechanism of extracting DIEs into a vector. Future improvements can include creating API to extract one DIE at a time, or grouping Entries into buckets by CUs and extracting before parallel step. Single Thread 4:12.37 real, 246.88 user, 3.54 sys, 0 amem, 10232004 mmem Multi Thread 0:49.40 real, 612.84 user, 515.73 sys, 0 amem, 11226292 mmem	2025-04-03 14:02:27 -07:00
Henry Jiang	7d3dfc862d	[JITLink][XCOFF] Setup initial build support for XCOFF (#127266 ) This patch starts the initial implementation of JITLink for XCOFF (Object format for AIX).	2025-04-03 17:01:18 -04:00
Jonas Devlieghere	5f99e0d4b9	[lldb] Use the "reverse video" effect when colors are disabled. (#134203 ) When you run lldb without colors (`-X`), the status line looks weird because it doesn't have a background. You end up with what appears to be floating text at the bottom of your terminal. This patch changes the statusline to use the reverse video effect, even when colors are off. The effect doesn't introduce any new colors and just inverts the foreground and background color. I considered an alternative approach which changes the behavior of the `-X` option, so that turning off colors doesn't prevent emitting non-color related control characters such as bold, underline, and reverse video. I decided to go with this more targeted fix as (1) nobody is asking for this more general change and (2) it introduces significant complexity to plumb this through using a setting and driver flag so that it can be disabled when running the tests. Fixes #134112.	2025-04-03 13:51:17 -07:00
Florian Hahn	cdff7f0b6e	[LV] Retrieve middle VPBB via scalar ph to fix epilogue resumephis (NFC) If ScalarPH has predecessors, we may need to update its reduction resume values. If there is a middle block, it must be the first predecessor. Note that the first predecessor may not be the middle block, if the middle block doesn't branch to the scalar preheader. In that case, fixReductionScalarResumeWhenVectorizingEpilog will be a no-op. In preparation for https://github.com/llvm/llvm-project/pull/106748.	2025-04-03 21:46:48 +01:00
Mircea Trofin	61768b3528	[ctxprof] Don't import roots elsewhere (#134012 ) Block a context root from being imported by its callers. Suppose that happened. Its caller - usually a message pump - inlines its copy of the root. Then it (the root) and whatever it calls will be the non-contextually optimized callee versions.	2025-04-03 13:21:39 -07:00
Hristo Hristov	b93376f899	[libc++][type_traits] `reference_{constructs\|converts}_from_temporary` with `-Winvalid-specialization` tests (#133946 ) Addresses comment: https://github.com/llvm/llvm-project/pull/128649/files#r2022341035 --------- Co-authored-by: Hristo Hristov <zingam@outlook.com>	2025-04-03 23:18:04 +03:00
Alexey Bataev	daab7d0807	[SLP]Initial support for (masked)loads + compress and (masked)interleaved Added initial support for (masked)loads + compress and (masked)interleaved loads. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/132099	2025-04-03 13:17:40 -07:00
Florian Hahn	012e574d4d	[LV] Add FindLastIV test with truncated IV and epilogue vectorization. This adds missing test coverage for https://github.com/llvm/llvm-project/pull/132691.	2025-04-03 21:01:58 +01:00
Alexey Bataev	7c4013d591	Revert "[SLP]Initial support for (masked)loads + compress and (masked)interleaved" This reverts commit 0bec0f5c059af5f920fe22ecda469b666b5971b0 to fix a crash reported in https://lab.llvm.org/buildbot/#/builders/143/builds/6668.	2025-04-03 12:58:49 -07:00
zcfh	229ca7dbcb	[memprof] Report an error when buildid and profile do not match (#132504 ) ## Problem When the build ids of the profile and binary do not match, the error reported by llvm-profdata is `no entries in callstack map after symbolization`, but the root cause of this problem is the build id mismatch. ## Trigger scenario For example, when performing `memprof` optimization on `clang`, `rawprofile` is collected through `ninja clang`. In addition to running clang, some other programs will also be executed, and these programs will also generate rawprofile. When `no entries in callstack map after symbolization` appears during `llvm-profdata merge`, users may mistakenly think that the instrumentation failed or other reasons, and will not directly realize that the binary and profile do not match. ## Changed Currently, when the build id does not match, an assert error is triggered only in debug mode. Change it to directly return an error when the build id does not match.	2025-04-03 12:48:27 -07:00
Valentin Clement (バレンタインクレメン)	7288f1bc32	[flang][cuda] Use nvvm operation for match any (#134283 ) The string used for intrinsic was not the correct one "llvm.nvvm.match.any.sync.i32p". There was an extra `p` at the end. Use the NVVM operation instead so we don't duplicate it.	2025-04-03 12:08:30 -07:00
Rahul Joshi	b393ca6026	[NFC][LLVM][RISCV] Cleanup pass initialization for RISCV (#134279 ) - Move calls to pass initialization functions to RISCV target initialization and remove them from pass constructors.	2025-04-03 11:28:45 -07:00
Jorge Gorbe Moya	158684a80f	[bazel] Add missing dep after 586c5e3083428e7473e880dafd5939e8707bc1c9	2025-04-03 11:25:44 -07:00
Slava Zakharin	b8b752db2b	[flang][NFC] Create required Source dir for flang-doc. (#134000 )	2025-04-03 10:43:49 -07:00
Slava Zakharin	3f6ae3f0a8	[flang] Added driver options for arrays repacking. (#134002 ) Added options: * -f[no-]repack-arrays * -f[no-]stack-repack-arrays * -frepack-arrays-contiguity=whole/innermost	2025-04-03 10:43:28 -07:00
Valentin Clement (バレンタインクレメン)	3e59ff27e5	[flang][cuda] Fix pred type for vote functions (#134166 )	2025-04-03 10:33:09 -07:00
Matheus Izvekov	cfee056b4e	[clang] NFC: introduce UnsignedOrNone as a replacement for std::optional<unsigned> (#134142 ) This introduces a new class 'UnsignedOrNone', which models a lite version of `std::optional<unsigned>`, but has the same size as 'unsigned'. This replaces most uses of `std::optional<unsigned>`, and similar schemes utilizing 'int' and '-1' as sentinel. Besides the smaller size advantage, this is simpler to serialize, as its internal representation is a single unsigned int as well.	2025-04-03 14:27:18 -03:00
Amr Hesham	262b9b5153	[CIR][Upstream] Local initialization for ArrayType (#132974 ) This change adds local initialization for ArrayType Issue #130197	2025-04-03 19:25:25 +02:00
Sterling-Augustine	7514225052	Use a more proper idiom for "the output file doesn't matter". NFC. (#134280 ) As in the description. Follow up to PR #134179.	2025-04-03 10:24:10 -07:00
zhijian lin	1a540c3b8b	[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#133155 ) ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated in favor of ISD::UADDO_CARRY and ISD::USUBO_CARRY. This patch lowers UADDO, UADDO_CARRY, USUBO and USUBO_CARRY.	2025-04-03 13:22:49 -04:00
Alexey Bataev	0bec0f5c05	[SLP]Initial support for (masked)loads + compress and (masked)interleaved Added initial support for (masked)loads + compress and (masked)interleaved loads. Reviewers: RKSimon, hiraditya Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/132099	2025-04-03 13:21:22 -04:00
Austin Schuh	2abcdd8cf0	[CUDA] Add support for CUDA surfaces (#132883 ) This adds support for all the surface read and write calls to clang. It extends the pattern used for textures to surfaces too. I tested this by generating all the various permutations of the calls and argument types in a python script, compiling them with both clang and nvcc, and comparing the generated ptx for equivalence. They all agree, ignoring register allocation, and some places where Clang picks different memory write instructions. An example kernel is: ``` __global__ void testKernel(cudaSurfaceObject_t surfObj, int x, float2* result) { *result = surf1Dread<float2>(surfObj, x, cudaBoundaryModeZero); } ``` --------- Signed-off-by: Austin Schuh <austin.linux@gmail.com>	2025-04-03 10:08:02 -07:00
Luke Lau	9a5b0f302b	Reapply "[InstCombine] Match scalable splats in m_ImmConstant (#132522 )" (#134262 ) This reapplies #132522. Previously casts of scalable m_ImmConstant splats weren't being folded by ConstantFoldCastOperand, triggering the "Constant-fold of ImmConstant should not fail" assertion. There are no changes to the code in this PR, instead we just needed #133207 to land first. A test has been added for the assertion in llvm/test/Transforms/InstSimplify/vec-icmp-of-cast.ll @icmp_ult_sext_scalable_splat_is_true. <hr/> #118806 fixed an infinite loop in FoldShiftByConstant that could occur when the shift amount was a ConstantExpr. However this meant that FoldShiftByConstant no longer kicked in for scalable vectors because scalable splats are represented by ConstantExprs. This fixes it by allowing scalable splats of non-ConstantExprs in m_ImmConstant, which also fixes a few other test cases where scalable splats were being missed. But I'm also hoping that UseConstantIntForScalableSplat will eventually remove the need for this. I noticed this when trying to reverse a combine on RISC-V in #132245, and saw that the resulting vector and scalar forms were different.	2025-04-03 18:03:16 +01:00
Matt Arsenault	a54736afd5	CloneFunction: Do not delete blocks with address taken (#134209 ) If a block with a single predecessor also had its address taken, it was getting deleted in this post-inline cleanup step. This would result in the blockaddress in the resulting function getting deleted and replaced with inttoptr 1. This fixes one bug required to permit inlining of functions with blockaddress uses. At the moment this is not testable (at least without an annoyingly complex unit test), and is a pre-bug fix for future patches. Functions with blockaddress uses are rejected in isInlineViable, so we don't get this far with the current InlineFunction uses (some of the existing cases seem to reproduce this part of the rejection logic, like PartialInliner). This will be tested in a pending llvm-reduce change. Prerequisite for #38908	2025-04-03 23:52:25 +07:00
Christian Sigg	6ddf7cf780	[mlir][bazel] Allow `gentbl_cc_library(tbl_outs)` to be a dict. (#134271 ) This makes the BUILD file shorter and more readable. I will follow up with converting the other instances.	2025-04-03 18:47:56 +02:00
MaheshRavishankar	a1bc979aa8	[mlir][Bufferization] Do not have read semantics for destination of `tensor.parallel_insert_slice`. (#134169 ) `tensor.insert_slice` needs to have read semantics on its destination operand. Since it has a return value, its semantics are - Copy dest to result - Copy source to subview of destination. `tensor.parallel_insert_slice` though has no result. So it does not need to have read semantics. The op description [here](`a3ac318e5f/mlir/include/mlir/Dialect/Tensor/IR/TensorOps.td (L1524)`) also says that it is expected to lower to a `memref.subview`, that does not have read semantics on the destination (its just a view). This patch drops the read semantics for destination of `tensor.parallel_insert_slice` but also makes the `shared_outs` operands of `scf.forall` have read semantics. Earlier it would rely indirectly on read semantics of destination operand of `tensor.parallel_insert_slice` to propagate the read semantics for `shared_outs`. Now that is specified more directly. Fixes #133964 --------- Signed-off-by: MaheshRavishankar <mahesh.ravishankar@gmail.com>	2025-04-03 09:47:36 -07:00
John Harrison	bc6cd825ec	[lldb-dap] Creating a common configuration structure for launch and attach requests. (#133960 ) This moves all the common settings of the launch and attach operations into the `lldb_dap::protocol::Configuration`. These common settings can be in both `launch` and `attach` requests and allows us to isolate the DAP configuration operations into a single common location. This is split out from #133624.	2025-04-03 09:45:00 -07:00
Finn Plummer	73e8d67a20	Revert "[HLSL][RootSignature] Define and integrate `HLSLRootSignatureAttr`" (#134273 ) Reverts llvm/llvm-project#134124 The build is failing again due to a linking error: [here](https://github.com/llvm/llvm-project/pull/134124#issuecomment-2776370486). Again the error was not present locally or in any of the pre-merge builds, so the missing dependency must have been transitively linked in these build environments...	2025-04-03 09:40:50 -07:00
Simon Pilgrim	2190808f5d	[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV/VPERMV3 nodes if the upper elements are not demanded (REAPPLIED) (#134263 ) With AVX512VL targets, use 128/256-bit VPERMV/VPERMV3 nodes when we only need the lower elements. Reapplied version of #133923 with fix for typo in the VPERMV3 mask adjustment	2025-04-03 17:39:38 +01:00
Connector Switch	b738b82699	[libc] Combine the function prototype `int (*compar)(const void *, const void *)` (#134238 ) Closes #134118.	2025-04-04 00:36:23 +08:00
Finn Plummer	65fa57bdcc	[HLSL][RootSignature] Define and integrate `HLSLRootSignatureAttr` (#134124 ) - Defines HLSLRootSignature Attr in `Attr.td` - Define and implement handleHLSLRootSignature in `SemaHLSL` - Adds sample test case to show AST Node is generated in `RootSignatures-AST.hlsl` This commit will "hook-up" the separately defined RootSignature parser and invoke it to create the RootElements, then store them on the ASTContext and finally store the reference to the Elements in RootSignatureAttr Resolves https://github.com/llvm/llvm-project/issues/119011 --------- Co-authored-by: Finn Plummer <finnplummer@microsoft.com>	2025-04-03 09:27:54 -07:00
Brox Chen	bf388f8a43	[AMDGPU][True16][CodeGen] legalize operands when move16bit SALU to VALU (#133985 ) This is a follow up PR from https://github.com/llvm/llvm-project/pull/132089. When a V2S copy and its useMI are lowered to VALU, this patch checks whether the newly generated VALU instruction is a true16 instruction and, if so, adds subreg access on all operands where necessary. An example MIR looks like: ``` %1:vgpr_32 = V_CVT_F32_U32_e64 %0:vgpr_32, 0, 0 ... %2:sreg_32 = COPY %1:vgpr_32 %3:sreg_32 = S_FLOOR_F16 %2:sreg_32, ... ``` It is currently lowered to ``` %1:vgpr_32 = V_CVT_F32_U32_e64 %0:vgpr_32, 0, 0 ... %2:vgpr_16 = V_FLOOR_F16_t16_e64 0, %1:vgpr_32, 0, 0, 0 ... ``` After this patch it is lowered to ``` %1:vgpr_32 = V_CVT_F32_U32_e64 %0:vgpr_32, 0, 0 ... %2:vgpr_16 = V_FLOOR_F16_t16_e64 0, %1.lo16:vgpr_32, 0, 0, 0 ... ```	2025-04-03 12:26:41 -04:00
Jonas Devlieghere	bec5cfd970	[lldb-dap] Protect SetBreakpoint with the API mutex (#134030 ) Protect the various SetBreakpoint functions with the API mutex. This fixes a race condition between the breakpoint being created and the DAP label getting added. This was causing `TestDAP_breakpointEvents.py` to be flaky. Fixes #131242.	2025-04-03 09:08:23 -07:00
Rahul Joshi	3801bf6164	[NFC] Cleanup pass initialization for SPIRV passes (#134189 ) - Do not call pass initialization functions from pass constructors. - Instead, call them from SPIRV target initialization. - https://github.com/llvm/llvm-project/issues/111767	2025-04-03 08:50:31 -07:00
Asher Mancinelli	d7d91500b6	[flang][nfc] Initial changes needed to use llvm intrinsics instead of regular calls (#134170 ) Flang uses `fir.call <llvm intrinsic>` in a few places. This means consumers of the IR need to strcmp every fir.call if they want to find a particular LLVM intrinsic. Emit LLVM memcpy intrinsics instead.	2025-04-03 08:37:40 -07:00
Matheus Izvekov	49fd0bf35d	[clang] support pack expansions for trailing requires clauses (#133190 )	2025-04-03 12:36:15 -03:00
Julian Brown	c1ada72b09	[OpenMP] Mark 'map-type modifiers in arbitrary position' done (#133906 ) I think #90499 already implements support for the listed OpenMP 6.0 feature mentioned in the title. This patch just marks it done (for C/C++).	2025-04-03 16:34:35 +01:00
Simon Pilgrim	12f75bba41	Revert "[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV/VPERMV3 nodes if the upper elements are not demanded" (#134256 ) Found a typo in the VPERMV3 mask adjustment - I'm going to revert and re-apply the patch with a fix Reverts llvm/llvm-project#133923	2025-04-03 16:28:24 +01:00
Luke Lau	79435de8a5	[ConstantFold] Support scalable constant splats in ConstantFoldCastInstruction (#133207 ) Previously only fixed vector splats were handled. This adds support for scalable vectors too by allowing ConstantExpr splats. We need to add the extra V->getType()->isVectorTy() check because a ConstantExpr might be a scalar to vector bitcast. Allowing ConstantExprs also allows fixed vector ConstantExprs to be folded, which causes the diffs in llvm/test/Analysis/ValueTracking/known-bits-from-operator-constexpr.ll and llvm/test/Transforms/InstSimplify/ConstProp/cast-vector.ll. I can remove them from this PR if reviewers would prefer. Fixes #132922	2025-04-03 16:24:56 +01:00
Daniel Chen	2080334574	[flang-rt] Pass the whole path of libflang_rt.runtime.a to linker on AIX and LoP (#131041 ) This PR is to improve the driver code to build `flang-rt` path by re-using the logic and code of `compiler-rt`. 1. Moved `addFortranRuntimeLibraryPath` and `addFortranRuntimeLibs` to `ToolChain.h` and made them virtual so that they can be overridden if customization is needed. The current implementation of those two procedures is moved to `ToolChain.cpp` as the base implementation to default to. 2. Both AIX and PPCLinux now override `addFortranRuntimeLibs`. The overriding function of `addFortranRuntimeLibs` for both AIX and PPCLinux calls `getCompilerRTArgString` => `getCompilerRT` => `buildCompilerRTBasename` to get the path to `flang-rt`. This code handles `LLVM_ENABLE_PER_TARGET_RUNTIME_DIR` setting. As shown in `PPCLinux.cpp`, `FT_static` is the default. If not found, it will search and build for `FT_shared`. To differentiate `flang-rt` from `clang-rt`, a boolean flag `IsFortran` is passed to the chain of functions in order to reach `buildCompilerRTBasename`.	2025-04-03 11:21:19 -04:00
Krzysztof Drewniak	f23bb530cf	[AMDGPULowerBufferFatPointers] Use InstSimplifyFolder during rewrites (#134137 ) This PR updates AMDGPULowerBufferFatPointers to use the InstSimplifyFolder when creating IR during buffer fat pointer lowering. This shouldn't cause any large functional changes and might improve the quality of the generated code.	2025-04-03 10:12:18 -05:00
Stephen Tozer	2334fd2ea3	[Dexter] Update Dexter tests to use new dexter test substitutions Following commit b8fc288, which changed some dexter test substitutions to be specific to C and C++, some tests that had been added since the original patch was written were still using the old substitution; this patch updates them to use the new.	2025-04-03 16:05:42 +01:00
gbMattN	61ef286506	Fix signed/unsigned mismatch warning (#134255 )	2025-04-03 15:56:33 +01:00
Felipe de Azevedo Piovezan	c14b6e90bd	[lldb][NFC] Move ShouldShow/ShouldSelect logic into Stopinfo (#134160 ) This NFC patch simplifies the main loop in HandleProcessStateChanged event by moving duplicated code into the StopInfo class, also allowing StopInfo subclasses to override behavior. More specifically, two functions are created: * ShouldShow: should a Thread with such a StopInfo be printed when the debugger stops? Currently, no StopInfo subclasses override this, but a subsequent patch will fix a bug by making StopInfoBreakpoint check whether the breakpoint is internal. * ShouldSelect: should a Thread with such a StopInfo be selected? This is currently overridden by StopInfoUnixSignal but will, in the future, be overridden by StopInfoBreakpoint.	2025-04-03 07:41:29 -07:00
Sergio Afonso	f59b5b8d59	[MLIR][OpenMP] Fix standalone distribute on the device (#133094 ) This patch updates the handling of target regions to set trip counts and kernel execution modes properly, based on clang's behavior. This fixes a race condition on `target teams distribute` constructs with no `parallel do` loop inside. This is how kernels are classified, after changes introduced in this patch: ```f90 ! Exec mode: SPMD. ! Trip count: Set. !$omp target teams distribute parallel do do i=... end do ! Exec mode: Generic-SPMD. ! Trip count: Set (outer loop). !$omp target teams distribute do i=... !$omp parallel do private(idx, y) do j=... end do end do ! Exec mode: Generic-SPMD. ! Trip count: Set (outer loop). !$omp target teams distribute do i=... !$omp parallel ... !$omp end parallel end do ! Exec mode: Generic. ! Trip count: Set. !$omp target teams distribute do i=... end do ! Exec mode: SPMD. ! Trip count: Not set. !$omp target parallel do do i=... end do ! Exec mode: Generic. ! Trip count: Not set. !$omp target ... !$omp end target ``` For the split `target teams distribute + parallel do` case, clang produces a Generic kernel which gets promoted to Generic-SPMD by the openmp-opt pass. We can't currently replicate that behavior in flang because our codegen for these constructs results in the introduction of calls to the `kmpc_distribute_static_loop` family of functions, instead of `kmpc_distribute_static_init`, which currently prevent promotion of the kernel to Generic-SPMD. For the time being, instead of relying on the openmp-opt pass, we look at the MLIR representation to find the Generic-SPMD pattern and directly tag the kernel as such during codegen. This is what we were already doing, but incorrectly matching other kinds of kernels as such in the process.	2025-04-03 15:41:00 +01:00
Jonas Devlieghere	51c2750599	[lldb] Update examples in docs/use/python-reference.rst to work with Python 3 (#134204 ) The examples on this page were using the Python 2-style print. I ran the updated code examples under Python 3 to confirm they are still up-to-date.	2025-04-03 07:40:00 -07:00
Jake Egan	50fe5b90e7	[sanitizer_common][NFC] Fix sanitizer_symbolizer_libcdep.cpp formatting (#133930 )	2025-04-03 10:39:49 -04:00
Stephen Tozer	b8fc288c46	[Dexter] Replace clang with clang++ in various cross project tests (#65987 ) This patch replaces invocations of clang with clang++ for a set of c++ files in the dexter cross-project tests. As a small additional change, this patch removes -lstdc++ from a test that did not appear to require it.	2025-04-03 15:37:43 +01:00
Nick Sarnie	008040482b	[clang] Add SPIR-V to some OpenMP clang tests (#133503 ) Just to get some more coverage. Some of the behavior might be weird and change in the future, but let's lock down what happens today to at least prevent regressions. Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>	2025-04-03 14:36:46 +00:00
gbMattN	59074a3760	[ASan] Add metadata to renamed instructions so ASan doesn't use the incorrect name (#119387 ) Clang needs variables to be represented with unique names. This means that if a variable shadows another, it is given a different name internally to ensure it has a unique name. If ASan tries to use this name when printing an error, it will print the modified unique name, rather than the variable's source code name. Fixes #47326	2025-04-03 15:27:14 +01:00
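
Commit cfee056b4e above describes 'UnsignedOrNone' as a lite version of `std::optional<unsigned>` that has the same size as 'unsigned' and serializes as a single unsigned int. The following is only a rough sketch of how such a type could be built, not the class added in that commit: the name `UnsignedOrNoneSketch`, the `toRaw`/`fromRaw` helpers, and the choice of storing `value + 1` with 0 meaning "none" are all assumptions made for illustration.

```cpp
#include <cassert>
#include <optional>

// Hypothetical sketch, NOT the actual clang class from cfee056b4e.
// Assumed encoding: Rep == 0 means "none", otherwise Rep holds value + 1.
// Consequence of this assumed encoding: the maximum unsigned value cannot
// be stored, because value + 1 would wrap around to 0.
class UnsignedOrNoneSketch {
  unsigned Rep = 0;

public:
  UnsignedOrNoneSketch() = default;               // empty state
  UnsignedOrNoneSketch(std::nullopt_t) {}         // empty state, optional-like
  UnsignedOrNoneSketch(unsigned Val) : Rep(Val + 1) {
    assert(Rep != 0 && "value too large for this encoding");
  }

  // True when a value is present.
  explicit operator bool() const { return Rep != 0; }

  // Access the stored value; only valid when a value is present.
  unsigned operator*() const {
    assert(Rep != 0 && "dereferencing an empty UnsignedOrNoneSketch");
    return Rep - 1;
  }

  // Serialization helpers: the whole state is one unsigned int.
  unsigned toRaw() const { return Rep; }
  static UnsignedOrNoneSketch fromRaw(unsigned Raw) {
    UnsignedOrNoneSketch Result;
    Result.Rep = Raw;
    return Result;
  }
};

// The point of the design: no extra "has value" flag, so no extra storage.
static_assert(sizeof(UnsignedOrNoneSketch) == sizeof(unsigned),
              "sketch must be exactly the size of unsigned");
```

Reserving one bit pattern as the sentinel gives up one representable value in exchange for dropping the separate flag that `std::optional<unsigned>` carries, which is what lets the type round-trip through a single unsigned int.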

532960 Commits