llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-24 03:06:06 +00:00

Author	SHA1	Message	Date
lntue	44b87e4206	[libc] Reduce the range of hypotf exhaustive test to be run automatically. (#133944 ) The current setup of `hypotf` exhaustive tests might take days to finish.	2025-04-01 13:29:28 -04:00
Fraser Cormack	f14ff59da7	[libclc] Move exp, exp2 and expm1 to the CLC library (#133932 ) These all share the use of a common helper function so are handled in one go. These builtins are also now vectorized.	2025-04-01 18:15:37 +01:00
Matt Arsenault	602d05fbe8	llvm-reduce: Make myself maintainer (#133919 )	2025-04-02 00:11:46 +07:00
Arvind Sudarsanam	7003f7d23a	[clang-sycl-linker] Replace llvm-link with API calls (#133797 ) This PR has the following changes: Replace llvm-link with calls to linkInModule to link device files Add -print-linked-module option to dump linked module for testing Added a test to verify that linking is working as expected. We will eventually move to using thin LTO for linking device inputs. Thanks --------- Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>	2025-04-01 17:09:45 +00:00
Valentin Clement (バレンタインクレメン)	01889de8e9	[flang][device] Enable Stop functions on device build (#133803 ) Update `StopStatement` and `StopStatementText` to be built for the device.	2025-04-01 10:06:45 -07:00
Paul Bowen-Huggett	75242a8a1d	[RISCV] Fix the c_slli disassembler test (NFC) (#133921 ) This change fixes the exhaustive test for the c.slli instruction so that each hex pattern now appears only once. The problem was spotted [here](https://github.com/llvm/llvm-project/pull/133713#discussion_r2021577609) by @topperc (for which, thank you).	2025-04-01 10:05:30 -07:00
Matt Arsenault	f60eed9344	llvm-reduce: Add target-features-attr reduction (#133887 ) Try to reduce individual subtarget features in the "target-features" attribute. This attempts a textual removal of the fields in the string, not a semantic removal. Typically there's a lot of redundant feature spam in the feature list implied by the target-cpu (which I really wish clang would stop emitting). If we could parse these out, we could easily drop the fields without testing anything.	2025-04-02 00:03:43 +07:00
Krzysztof Drewniak	25622aa745	[mlir][AMDGPU] Add gfx950 MFMAs to the amdgpu.mfma op (#133553 ) This commit extends the lowering of amdgpu.mfma to handle the new double-rate MFMAs in gfx950 and adds tests for these operations. It also adds support for MFMAs on small floats (f6 and f4), which are implemented using the "scaled" MFMA intrinsic with a scale value of 0 in order to have an unscaled MFMA. This commit does not add a `amdgpu.scaled_mfma` operation, as that is future work. --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>	2025-04-01 11:59:09 -05:00
Nirvedh Meshram	69c5049826	[NFC][mlir] Update generate script for conv_3d_ncdhw_fcdhw (#133927 ) https://github.com/llvm/llvm-project/pull/129547 changed the IR directly without updating the auto generate script. Signed-off-by: Nirvedh <nirvedh@gmail.com>	2025-04-01 11:55:40 -05:00
Matt Arsenault	5c4302442b	llvm-reduce: Reduce global variable code model (#133865 ) The current API doesn't have a way to unset it. The query returns an optional, but the set doesn't. Alternatively I could switch the set to also use optional.	2025-04-01 23:54:10 +07:00
jimingham	347c5a7af5	Add a new affordance that the Python module in a dSYM (#133290 ) So the dSYM can be told what target it has been loaded into. When lldb is loading modules, while creating a target, it will run "command script import" on any Python modules in Resources/Python in the dSYM. However, this happens WHILE the target is being created, so it is not yet in the target list. That means that these scripts can't act on the target that they are a part of when they get loaded. This patch adds a new python API that lldb will call: __lldb_module_added_to_target if it is defined in the module, passing in the Target the module was being added to, so that code in these dSYMs doesn't have to guess.	2025-04-01 09:54:06 -07:00
Matt Arsenault	ec290a43f6	llvm-reduce: Reduce externally_initialized (#133859 ) Not sure this is the right place to put it. This is a property of GlobalVariable, not GlobalValue. But the ReduceGlobalVars reduction tries to delete the value entirely.	2025-04-01 23:51:45 +07:00
Brox Chen	dd1d41f833	[AMDGPU][True16][CodeGen] fix moveToVALU with proper subreg access in true16 (#132089 ) There are V2S copies between vgpr16 and sgpr32 in true16 mode. This is caused by vgpr16 and sgpr32 both being selectable by a 16-bit src in ISel. When a V2S copy and its useMI are lowered to VALU, this patch: 1. Checks whether the generated new VALU is used by a true16 inst, and adds subreg access if necessary. 2. Legalizes the V2S copy by replacing it with subreg_to_reg. An example MIR looks like: ``` %2:sgpr_32 = COPY %1:vgpr_16 %3:sgpr_32 = S_OR_B32 %2:sgpr_32, ... %4:vgpr_16 = V_ADD_F16_t16 %3:sgpr_32, ... ``` currently lowered to ``` %2:vgpr_32 = COPY %1:vgpr_16 %3:vgpr_32 = V_OR_B32 %2:vgpr_32, ... %4:vgpr_16 = V_ADD_F16_t16 %3:vgpr_32, ... ``` after this patch ``` %2:vgpr_32 = SUBREG_TO_REG 0, %1:vgpr_16, lo16 %3:vgpr_32 = V_OR_B32 %2:vgpr_32, ... %4:vgpr_16 = V_ADD_F16_t16 %3.lo16:vgpr_32, ... ```	2025-04-01 12:40:18 -04:00
Petr Hosek	4b19db6db9	Revert "AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function" (#133935 ) Reverts llvm/llvm-project#132684	2025-04-01 09:39:07 -07:00
Fraser Cormack	00e6d4fe06	[libclc][NFC] Delete three unused .inc files	2025-04-01 17:36:01 +01:00
Ivan Butygin	1f194ff34e	[mlir] Expose `simplifyAffineExpr` through python api (#133926 )	2025-04-01 19:28:53 +03:00
Matt Arsenault	7e25b24073	IRNormalizer: Replace cl::opts with pass parameters (#133874 ) Not sure why the "fold-all" option naming didn't match the variable "FoldPreOutputs", but I've preserved the difference. More annoyingly, the pass name "normalize" does not match the pass name IRNormalizer and should probably be fixed one way or the other. Also the existing test coverage for the flags is lacking. I've added a test that shows they parse, but we should have tests that they do something.	2025-04-01 23:27:20 +07:00
lorenzo chelini	105c8c38dc	[MLIR][NFC] Retire let constructor for EmitC (#133732 ) `let constructor` is legacy (do not use in tree!) since the tableGen backend emits most of the glue logic to build a pass.	2025-04-01 18:22:40 +02:00
Jonathan Thackray	558ce50ebc	[Clang][LLVM] Implement multi-single vectors MOP4{A/S} (#129226 ) Implement all multi-single {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the ACLE in https://github.com/ARM-software/acle/pull/381/files	2025-04-01 17:04:59 +01:00
Arthur Eubanks	e8711436b3	[SCEV] Remove EqCacheSCEV (#133186 ) This was added in https://reviews.llvm.org/D26389 to help with extremely deep SCEV expressions. However, this is wrong since we may cache sub-SCEVs to be equivalent that CompareValueComplexity returned 0 due to hitting the max comparison depth. This also improves compile time in some compiles: https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u Similar to #100721. Fixes #130688.	2025-04-01 09:02:33 -07:00
Jeremy Kun	179062b2dc	[mlir][bazel] add alwayslink=True to mlir-runner utils (#133787 ) MacOS platforms using mlir-runner in lit tests consistently hit the following error: ``` # .---command stderr------------ # \| JIT session error: Symbols not found: [ __mlir_ciface_printMemrefI32 ] # \| Error: Failed to materialize symbols: { (main, { __mlir_printMemrefI32, ... }) } # `----------------------------- ``` https://github.com/google/heir/issues/1521#issuecomment-2751303404 confirms the issue is fixed by using `alwayslink` on these two targets, and I confirmed on a separate Apple M1 (OSX version Sequoia 15.3.2.). I'm not an expert on the mlir runner internals, but given the mlir-runner is purely for testing, and alwayslink at worst adds some overhead by not removing symbols, it seems low risk.	2025-04-01 08:58:32 -07:00
Doeke Wartena	a03fce4e20	Update README.md - fixed invalid json in example (#133890 ) A comma (`,`) is required or you get an error.	2025-04-01 08:49:36 -07:00
Aaron Ballman	66b540d861	[C11] Claim conformance to WG14 N1518 (#133749 ) This paper introduced ranges of valid start and continuation characters for identifiers. C23 made further changes to these sets.	2025-04-01 11:45:56 -04:00
Slava Zakharin	58551faaf1	[flang] Inline fir.is_contiguous_box in some cases. (#133812 ) Added inlining for `rank == 1` and `innermost` cases.	2025-04-01 08:41:11 -07:00
lntue	65ad6267e8	[libc] Fix atan2f128 test for aarch64. (#133924 )	2025-04-01 11:37:57 -04:00
Rahul Joshi	a8a33bab69	[NFC][SPIRV] Misc code cleanup in SPIRV Target (#133764 ) - Use static instead of anonymous namespace for file local functions. - Enclose file-local classes in anonymous namespace. - Eliminate `llvm::` qualifier when file has `using namespace llvm`. - Eliminate namespace surrounding entire code in SPIRVConvergenceRegionAnalysis.cpp file. - Eliminate call to `initializeSPIRVStructurizerPass` from the pass constructor (https://github.com/llvm/llvm-project/issues/111767)	2025-04-01 08:35:06 -07:00
David Green	4cb41d136c	[AArch64] Prefer zip over ushll for anyext. (#133433 ) Many CPUs have a higher throughput of ZIP instructions vs USHLL. This adds some tablegen patterns for preferring zip in anyext patterns.	2025-04-01 16:24:54 +01:00
Matt Arsenault	ac55688482	llvm-reduce: Add test for token handling in operands-skip (#133857 ) Seems to work correctly but wasn't tested.	2025-04-01 22:17:44 +07:00
Matt Arsenault	664e847916	llvm-reduce: Fix invalid reduction on tokens in operands-to-args (#133855 )	2025-04-01 22:14:47 +07:00
Zahira Ammarguellat	aa73124e51	Fix complex long double division with -mno-x87. (#133152 ) The combination of `-fcomplex-arithmetic=promoted` and `-mno-x87` for `double` complex division is leading to a crash. See https://godbolt.org/z/189G957oY This patch fixes that.	2025-04-01 11:10:51 -04:00
Slava Zakharin	1ab3a4f234	[flang-rt][NFC] Work around CTK12.8 compilation failure. (#133833 ) It happened in https://lab.llvm.org/buildbot/#/builders/152/builds/1131 when the buildbot was switched from CTK12.3 to CTK12.8. The logs are gone by now, so the above link is useless. The error was: error: ‘auto’ not permitted in template argument This workaround helps, but I also reported the issue to NVCC devs.	2025-04-01 08:04:45 -07:00
Craig Topper	4e6c48f1e7	[RISCV] Merge RegStart with RegEnd in parseRegListCommon. NFC (#133867 ) We only need to keep track of the last register seen. We never need the first register once we've parsed. Currently if s0/x8 is used RegStart will point to that and not ra/s1 so it already isn't the start.	2025-04-01 08:04:32 -07:00
Craig Topper	19fb4b04a6	[RISCV] Validate the end of register ranges in Zcmp register lists. (#133866 ) We were only checking that the last register was a register, not that it was a legal register for a register list. This caused the encoder function to hit an llvm_unreachable. The error messages are not good, but this is only one of multiple things that need to be fixed in this function. I'll focus on error messages later once I have the other issues fixed.	2025-04-01 08:04:05 -07:00
lntue	8741412bdf	[libc][math] Implement a fast pass for atan2f128 with 1ULP error using DyadicFloat<128>. (#133150 ) Part of https://github.com/llvm/llvm-project/issues/131642.	2025-04-01 10:57:32 -04:00
Simon Pilgrim	664745cf38	[X86] avx512-vselect.ll - regenerate VPTERNLOG comments	2025-04-01 15:50:07 +01:00
Kazu Hirata	c30776ab9a	[AArch64] Use ArrayRef::slice (NFC) (#133862 )	2025-04-01 07:28:18 -07:00
Kazu Hirata	173eb32b75	[X86] Construct SmallVector with ArrayRef (NFC) (#133860 )	2025-04-01 07:27:23 -07:00
Virginia Cangelosi	e92ff64bad	[Clang][LLVM] Implement single-multi vectors MOP4{A/S} (#128854 ) Implement all single-multi {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the acle in https://github.com/ARM-software/acle/pull/381/files. This PR depends on https://github.com/llvm/llvm-project/pull/127797 This patch updates the semantics of template arguments in intrinsic names for clarity and ease of use. Previously, template argument numbers indicated which character in the prototype string determined the final type suffix, which was confusing—especially for intrinsics using multiple prototype modifiers per operand (e.g., intrinsics operating on arrays of vectors). The number had to reference the correct character in the prototype (e.g., the ‘u’ in “2.u”), making the system cumbersome and error-prone. With this patch, template argument numbers now refer to the operand number that determines the final type suffix, providing a more intuitive and consistent approach.	2025-04-01 15:05:30 +01:00
Fraser Cormack	c1efd8b663	[libclc][NFC] Delete two unused headers These should have been deleted when the respective builtins were moved to the CLC library.	2025-04-01 14:54:50 +01:00
Jean-Didier PAILLEUX	15cfe4a774	[MLIR] Adding 'no_inline' and 'always_inline' attributes on LLVM::CallOp (#133726 ) Addition of `no_inline` and `always_inline` attributes for CallOps in MLIR, so that an individual call can be inlined or not without having the attribute on the `FuncOp`. These attributes will be used in a future PR in Flang (`[NO]INLINE` directive).	2025-04-01 15:48:25 +02:00
Jean-Didier PAILLEUX	513a91a5f1	[flang/flang-rt] Implement PERROR intrinsic from GNU Extension (#132406 ) Add the implementation of the `PERROR(STRING)` intrinsic from the GNU Extension, which prints to stderr a newline-terminated error message corresponding to the last system error, prefixed by `STRING`. (https://gcc.gnu.org/onlinedocs/gfortran/PERROR.html)	2025-04-01 15:47:54 +02:00
Fraser Cormack	bcf0f8d8aa	[libclc] Move exp10 to the CLC library (#133899 ) The builtin was already nominally in the CLC library; this commit just moves it over. It also vectorizes the builtin on its way.	2025-04-01 14:39:17 +01:00
Jeremy Morse	1ebc308bba	[DebugInfo][RemoveDIs] Remove debug-intrinsic printing cmdline options (#131855 ) During the transition from debug intrinsics to debug records, we used several different command line options to customise handling: the printing of debug records to bitcode and textual could be independent of how the debug-info was represented inside a module, whether the autoupgrader ran could be customised. This was all valuable during development, but now that totally removing debug intrinsics is coming up, this patch removes those options in favour of a single flag (experimental-debuginfo-iterators), which enables autoupgrade, in-memory debug records, and debug record printing to bitcode and textual IR. We need to do this ahead of removing the experimental-debuginfo-iterators flag, to reduce the amount of test-juggling that happens at that time. There are quite a number of weird test behaviours related to this -- some of which I simply delete in this commit. Things like print-non-instruction-debug-info.ll , the test suite now checks for debug records in all tests, and we don't want to check we can print as intrinsics. Or the update_test_checks tests -- these are duplicated with write-experimental-debuginfo=false to ensure file writing for intrinsics is correct, but that's something we're imminently going to delete. A short survey of curious test changes: * free-intrinsics.ll: we don't need to test that debug-info is a zero cost intrinsic, because we won't be using intrinsics in the future. * undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory mode while we sorted something out; it works now either way. * salvage-cast-debug-info.ll: was testing intrinsics-in-memory get salvaged, isn't necessary now * localize-constexpr-debuginfo.ll: was producing "dead metadata" intrinsics for optimised-out variable values, dbg-records takes the (correct) representation of poison/undef as an operand. Looks like we didn't update this in the past to avoid spurious test differences. * Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing that debug-info affected codegen, and we deferred updating the tests until now. This is just one of those silent gnochange issues that get fixed by RemoveDIs. Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc, that checks we can autoupgrade debug intrinsics that are in bitcode into the new debug records.	2025-04-01 14:27:11 +01:00
Samuel Tebbs	a1e041b646	[NFC][AArch64] Pre-commit high register pressure dot product test	2025-04-01 14:13:30 +01:00
Ramkumar Ramachandra	3a66760d9b	[LV] Improve a test, regen with UTC (#130092 )	2025-04-01 14:11:20 +01:00
Simon Pilgrim	2c0b888359	[X86] combineX86ShuffleChain - prefer combining to X86ISD::SHUF128 if PERMQ operands are splittable (#133900 ) If the 512-bit unary shuffle is a concatenation of 128/256-bit subvectors then we're better off using a X86ISD::SHUF128 node so we can fold the concatenation into the shuffle as well.	2025-04-01 13:47:52 +01:00
Pablo Antonio Martinez	a338f80ddc	[mlir][Linalg] Add transform to convert linalg.copy into memref.copy (#132422 ) Targeted rewrite of a linalg.copy on memrefs to a memref.copy. This is useful when bufferizing copies to a linalg.copy, applying some transformations, and then rewriting the copy into a memref.copy. If the element types of the source and destination differ, or if the source is a scalar, the transform produces a silenceable failure.	2025-04-01 13:39:33 +01:00
Virginia Cangelosi	6892d54286	[Clang][LLVM] Implement single-single vectors MOP4{A/S} (#127797 ) Implement all single-single {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the acle in https://github.com/ARM-software/acle/pull/381/files	2025-04-01 13:35:09 +01:00
Anutosh Bhat	8f56394487	[clang-repl] Implement LoadDynamicLibrary for clang-repl wasm use cases (#133037 ) Currently we don't make use of the JIT for the wasm use cases, so the approach using the execution engine won't work in these cases. Rather, if we use dlopen, we should be able to do the following (demonstrating through a toy project): 1) Make use of LoadDynamicLibrary through the given implementation ``` extern "C" EMSCRIPTEN_KEEPALIVE int load_library(const char *name) { auto Err = Interp->LoadDynamicLibrary(name); if (Err) { llvm::logAllUnhandledErrors(std::move(Err), llvm::errs(), "load_library error: "); return -1; } return 0; } ``` 2) Add a button to call load_library once the library has been added in our MEMFS (currently we have symengine built as a SIDE MODULE and we are loading it)	2025-04-01 15:33:45 +03:00
Paul Walker	c192737009	[LLVM][InstCombine][AArch64] Refactor common SVE intrinsic combines. (#126928 ) Introduce SVEIntrinsicInfo to store properties common across SVE intrinsics. This allows a separation between intrinsic IDs and the transformations that can be applied to them, which reduces the layering problems we hit when adding new combines. This PR is mostly refactoring to bring in the concept and port the most common combines (e.g. dead code when all false). This will be followed up with new combines where I plan to reuse much of the existing instruction simplification logic to significantly improve our ability to constant fold SVE intrinsics.	2025-04-01 13:27:46 +01:00
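
Commit f60eed9344 describes the target-features reduction as a textual removal of fields from the "target-features" string rather than a semantic one. A minimal sketch of that idea follows, assuming a simple one-field-at-a-time greedy loop; it is not the llvm-reduce implementation. The names `reduce_target_features` and `needs_avx2` are hypothetical, and the oracle stands in for whatever interestingness test a user supplies.

```python
from typing import Callable


def reduce_target_features(features: str,
                           is_interesting: Callable[[str], bool]) -> str:
    """Greedily drop comma-separated fields while the result stays interesting.

    Hypothetical helper: purely textual, it never interprets what a
    feature means, it only tries removing each field in turn.
    """
    fields = [f for f in features.split(",") if f]
    i = 0
    while i < len(fields):
        candidate = fields[:i] + fields[i + 1:]
        if is_interesting(",".join(candidate)):
            fields = candidate  # the field was redundant; keep it removed
        else:
            i += 1  # the field is needed; move on to the next one
    return ",".join(fields)


# Hypothetical oracle: the failure reproduces only while +avx2 is present.
def needs_avx2(features: str) -> bool:
    return "+avx2" in features.split(",")


reduced = reduce_target_features("+sse,+sse2,+avx,+avx2,-sse4a", needs_avx2)
print(reduced)
```

Because the removal is textual, every candidate string has to be re-tested; the commit message notes that fields implied by the target-cpu could be dropped without testing if they could be parsed out.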

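
Commit 19fb4b04a6 distinguishes between the end of a Zcmp register range being "a register" and being "a legal register for a register list". The sketch below illustrates that distinction under stated assumptions: it handles only ABI register names, the function `is_legal_reg_list` is hypothetical, and it is not the LLVM assembler parser. It relies on the Zcmp encoding having register lists `{ra}`, `{ra, s0}`, `{ra, s0-s1}` through `{ra, s0-s9}`, and `{ra, s0-s11}`, with no encoding for `{ra, s0-s10}`.

```python
import re

# Registers that may end an "s0-<end>" range: s1..s9 and s11.
# s10 is excluded because {ra, s0-s10} has no Zcmp encoding.
LEGAL_RANGE_ENDS = {f"s{i}" for i in range(1, 12)} - {"s10"}


def is_legal_reg_list(text: str) -> bool:
    """Return True if text looks like a legal Zcmp register list.

    Hypothetical checker: ABI names only, no x-register spellings.
    """
    body = text.strip()
    if not (body.startswith("{") and body.endswith("}")):
        return False
    items = [part.strip() for part in body[1:-1].split(",")]
    if items == ["ra"]:
        return True
    if len(items) != 2 or items[0] != "ra":
        return False
    if items[1] == "s0":
        return True
    match = re.fullmatch(r"s0-(\w+)", items[1])
    # Being a register name is not enough; it must be a legal range end.
    return bool(match) and match.group(1) in LEGAL_RANGE_ENDS


for candidate in ["{ra, s0-s11}", "{ra, s0-s10}", "{ra, s0-t0}"]:
    print(candidate, is_legal_reg_list(candidate))
```

Here `t0` is a real register but not a legal range end, which is the kind of input the commit says previously slipped through to the encoder.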

532960 Commits