llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-23 22:26:04 +00:00

Author	SHA1	Message	Date
Alex MacLean	3c7a0e6c82	[NVPTX] Cleanup and refactor atomic lowering (#133781 ) Cleanup lowering of atomic instructions and intrinsics. The TableGen changes are primarily a refactor, though sub variants are now lowered via operation legalization, potentially allowing for more DAG optimization.	2025-04-01 13:08:57 -07:00
Andy Kaylor	9f3d8e8fb8	[CIR] Upstream support for while and do..while loops (#133157 ) This adds basic support for while and do..while loops. Support for break and continue is left for a subsequent patch.	2025-04-01 13:03:24 -07:00
Aiden Grossman	ce296f1eba	[CI] Exclude gn changes from running premerge (#133623 ) These changes are mostly pushed by the gnsyncbot directly to main and thus don't go through a PR, but we still test on main to see if main is broken. Given these touch llvm/, they end up burning a decent amount of testing time for no real benefit, so I think it makes sense to exclude them from premerge testing explicitly.	2025-04-01 12:58:16 -07:00
David Peixotto	782e0cef76	[lldb] Fix intel trace plugin tests (#133826 ) The tests for the [intel-pt](`3483740289/lldb/docs/use/intel_pt.rst`) trace plugin were failing for multiple reasons. On machines where tracing is supported many of the tests were crashing because of a nullptr dereference. It looks like the `core_file` parameter in `ProcessTrace::CreateInstance` was once ignored, but was changed to always being dereferenced. This caused the tests to fail even when tracing was supported. On machines where tracing is not supported we would still run tests that attempt to take a trace. These would obviously fail because the required hardware is not present. Note that some of the tests simply read serialized json as trace files which does not require any special hardware. This PR fixes these two issues by guarding the pointer dereference and then skipping unsupported tests on machines. With these changes the trace tests pass on both types of machines. We also add a new unit test to validate that a process can be created with a nullptr core_file through the generic process trace plugin path.	2025-04-01 12:55:41 -07:00
Qiongsi Wu	4a73c99329	[clang][Modules] Fix the Size of `RecordDecl`'s `BitCodeAbbrevOp` (#133500 ) https://github.com/llvm/llvm-project/pull/102040/files#diff-125f472e690aa3d973bc42aa3c5d580226c5c47661551aca2889f960681aa64dR2477 added 1 bit to `RecordDecl`'s serialization format, but did not increment its abbreviation size. This can lead to rare cases where a record may overflow if the `RecordDecl`'s `getArgPassingRestrictions()` returns something bigger than 1 (see [here](`b3f01a6aa4/clang/lib/Serialization/ASTWriterDecl.cpp (L688)`)). rdar://143763558	2025-04-01 12:55:17 -07:00
Aiden Grossman	23fb048ce3	[CI] Fix Monolithic Linux Build in Ubuntu 24.04 (#133628 ) This patch fixes the monolithic linux build in Ubuntu 24.04. Newer versions of debian/ubuntu pass a warning when installing packages at the system level using pip as it interferes with system package manager installed python packages. We do not use any system package manager installed python packages, so we just ignore the warning (that is an error without passing the flag) by passing the --break-system-packages flag.	2025-04-01 12:55:07 -07:00
Mark Danial	ac0649a75a	[OpenMP] [AIX] Add missing } in openmp/runtime/src/z_Linux_util.cpp (#133973 ) Changes from https://github.com/llvm/llvm-project/pull/133034 removed a `}`, presumably accidentally, which is causing failures in the AIX flang bot.	2025-04-01 15:47:19 -04:00
Max191	1407f5bee9	[mlir] Canonicalize extract_slice(unpack) (#133777 ) Canonicalizes a chain of `linalg.unpack -> tensor.extract_slice` into a `linalg.unpack` with reduced dest sizes. This will only happen when the unpack op's only user is a non rank-reducing slice with zero offset and unit strides. --------- Signed-off-by: Max Dawkins <max.dawkins@gmail.com> Signed-off-by: Max Dawkins <maxdawkins19@gmail.com> Co-authored-by: Max Dawkins <maxdawkins19@gmail.com>	2025-04-01 14:51:58 -04:00
Alexey Bataev	0e3049c562	[SLP]Support revectorization of the previously vectorized scalars If a scalar instruction is marked for vectorization in the tree, it cannot, in general, be vectorized as part of another node in the same tree. This may prevent some potentially profitable vectorization opportunities, since some nodes end up being buildvector/gather nodes, which add to the total cost. Patch allows revectorization of the previously vectorized scalars. Reviewers: hiraditya, RKSimon Reviewed By: RKSimon, hiraditya Pull Request: https://github.com/llvm/llvm-project/pull/133091	2025-04-01 14:30:06 -04:00
Arvind Sudarsanam	2b064108ed	Fix a build error (#133957 ) This fixes error reported in post-commit testing of https://github.com/llvm/llvm-project/pull/133797 LOG: https://lab.llvm.org/buildbot/#/builders/140/builds/20266 Thanks Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>	2025-04-01 18:24:54 +00:00
Virginia Cangelosi	79487757b7	[Clang][LLVM] Implement multi-multi vectors MOP4{A/S} (#129230 ) Implement all multi-multi {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the ACLE in https://github.com/ARM-software/acle/pull/381/files	2025-04-01 19:20:27 +01:00
Heejin Ahn	4d1c827423	[WebAssembly] Support parsing .lto_set_conditional (#126546 ) In the split-LTO-unit mode in ThinLTO, a compilation module is split into two, and global variables that meet specific criteria are moved to the split module. `d21fc58aee/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp (L315-L366)` And if there is an originally local-linkage global value defined in the original module and referenced in the split module, or vice versa, that value is _promoted_ by attaching a module ID to its name in order to prevent name clashes, because it can now be referenced from other modules. `d21fc58aee/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp (L46-L100)` And when that promoted global value is a function, a `.lto_set_conditional` entry is written to the original module to avoid breaking references from inline assembly: `d21fc58aee/llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp (L84-L91)` The syntax of this is, if the original function name is `symbolA` and the module ID is `123`, ```ll module asm ".lto_set_conditional symbolA,symbolA.123" ``` These symbols are parsed here: `648981f913/llvm/lib/MC/MCParser/AsmParser.cpp (L6467)` The first function symbol in this `.lto_set_conditional` does not exist as a function in the bitcode anymore because it was renamed to the second. So these symbols are not assigned as function symbols, but they are not really data either, so the object writer crashes here: `5b9e6c7993/llvm/lib/MC/WasmObjectWriter.cpp (L1820)` This PR makes the object writer just skip those symbols. --- This problem was discovered when I was testing with `-fwhole-program-vtables`. The reason we didn't have this problem before with ThinLTO was because `-fsplit-lto-unit`, which splits LTO units when possible, defaults to false, but it defaults to true when `-fwhole-program-vtables` is used.	2025-04-02 03:15:29 +09:00
Craig Topper	bd7585bea3	[RISCV] Improve error for using x18-x27 in a register list with RVE. (#133936 ) matchRegisterNameHelper returns MCRegister() for RVE so the first RVE check was dead. For the second check, I've moved the RVE check from the comma parsing to the identifier parsing so the diagnostic points at the register. Note we're using matchRegisterName instead of matchRegisterNameHelper to avoid allowing ABI names so we don't get the RVE check that lives inside matchRegisterNameHelper. The errors for RVE in general should probably say something other than "invalid register", but that's a problem throughout the assembler.	2025-04-01 11:14:25 -07:00
Valentin Clement (バレンタインクレメン)	afa32d3e0e	[flang][cuda] Fix char argument This would fail with `error: argument of type "char" is incompatible with parameter of type "const char *"`	2025-04-01 11:00:50 -07:00
Sam Clegg	a30caa6a73	[WebAssembly] Add missing tests from #133289 (#133938 )	2025-04-01 10:47:35 -07:00
David Green	7d91c4f3eb	[ARM] Use tablegen HasOneUse. NFC	2025-04-01 18:41:21 +01:00
David Green	d8bf0398e5	[AArch64] Use tablegen HasOneUse. NFC	2025-04-01 18:37:10 +01:00
Kazu Hirata	9586117c3a	[clang-sycl-linker] Fix a warning This patch fixes: clang/tools/clang-sycl-linker/ClangSYCLLinker.cpp:127:13: error: function 'getMainExecutable' is not needed and will not be emitted [-Werror,-Wunneeded-internal-declaration]	2025-04-01 10:35:50 -07:00
lntue	44b87e4206	[libc] Reduce the range of hypotf exhaustive test to be run automatically. (#133944 ) The current setup of `hypotf` exhaustive tests might take days to finish.	2025-04-01 13:29:28 -04:00
Fraser Cormack	f14ff59da7	[libclc] Move exp, exp2 and expm1 to the CLC library (#133932 ) These all share the use of a common helper function so are handled in one go. These builtins are also now vectorized.	2025-04-01 18:15:37 +01:00
Matt Arsenault	602d05fbe8	llvm-reduce: Make myself maintainer (#133919 )	2025-04-02 00:11:46 +07:00
Arvind Sudarsanam	7003f7d23a	[clang-sycl-linker] Replace llvm-link with API calls (#133797 ) This PR has the following changes: Replace llvm-link with calls to linkInModule to link device files Add -print-linked-module option to dump linked module for testing Added a test to verify that linking is working as expected. We will eventually move to using thin LTO for linking device inputs. Thanks --------- Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>	2025-04-01 17:09:45 +00:00
Valentin Clement (バレンタインクレメン)	01889de8e9	[flang][device] Enable Stop functions on device build (#133803 ) Update `StopStatement` and `StopStatementText` to be built for the device.	2025-04-01 10:06:45 -07:00
Paul Bowen-Huggett	75242a8a1d	[RISCV] Fix the c_slli disassembler test (NFC) (#133921 ) This change fixes the exhaustive test for the c.slli instruction so that each hex pattern now appears only once. The problem was spotted [here](https://github.com/llvm/llvm-project/pull/133713#discussion_r2021577609) by @topperc (for which, thank you).	2025-04-01 10:05:30 -07:00
Matt Arsenault	f60eed9344	llvm-reduce: Add target-features-attr reduction (#133887 ) Try to reduce individual subtarget features in the "target-features" attribute. This attempts a textual removal of the fields in the string, not a semantic removal. Typically there's a lot of redundant feature spam in the feature list implied by the target-cpu (which I really wish clang would stop emitting). If we could parse these out, we could easily drop the fields without testing anything.	2025-04-02 00:03:43 +07:00
Krzysztof Drewniak	25622aa745	[mlir][AMDGPU] Add gfx950 MFMAs to the amdgpu.mfma op (#133553 ) This commit extends the lowering of amdgpu.mfma to handle the new double-rate MFMAs in gfx950 and adds tests for these operations. It also adds support for MFMAs on small floats (f6 and f4), which are implemented using the "scaled" MFMA intrinsic with a scale value of 0 in order to have an unscaled MFMA. This commit does not add an `amdgpu.scaled_mfma` operation, as that is future work. --------- Co-authored-by: Jakub Kuderski <kubakuderski@gmail.com>	2025-04-01 11:59:09 -05:00
Nirvedh Meshram	69c5049826	[NFC][mlir] Update generate script for conv_3d_ncdhw_fcdhw (#133927 ) https://github.com/llvm/llvm-project/pull/129547 changed the IR directly without updating the auto generate script. Signed-off-by: Nirvedh <nirvedh@gmail.com>	2025-04-01 11:55:40 -05:00
Matt Arsenault	5c4302442b	llvm-reduce: Reduce global variable code model (#133865 ) The current API doesn't have a way to unset it. The query returns an optional, but the set doesn't. Alternatively I could switch the set to also use optional.	2025-04-01 23:54:10 +07:00
jimingham	347c5a7af5	Add a new affordance that the Python module in a dSYM (#133290 ) So the dSYM can be told what target it has been loaded into. When lldb is loading modules, while creating a target, it will run "command script import" on any Python modules in Resources/Python in the dSYM. However, this happens WHILE the target is being created, so it is not yet in the target list. That means that these scripts can't act on the target that they are a part of when they get loaded. This patch adds a new python API that lldb will call: __lldb_module_added_to_target if it is defined in the module, passing in the Target the module was being added to, so that code in these dSYMs doesn't have to guess.	2025-04-01 09:54:06 -07:00
Matt Arsenault	ec290a43f6	llvm-reduce: Reduce externally_initialized (#133859 ) Not sure this is the right place to put it. This is a property of GlobalVariable, not GlobalValue. But the ReduceGlobalVars reduction tries to delete the value entirely.	2025-04-01 23:51:45 +07:00
Brox Chen	dd1d41f833	[AMDGPU][True16][CodeGen] fix moveToVALU with proper subreg access in true16 (#132089 ) There are V2S copies between vgpr16 and sgpr32 in true16 mode. This is caused by vgpr16 and sgpr32 both being selectable by a 16-bit src in ISel. When a V2S copy and its useMI are lowered to VALU, this patch: 1. Checks whether the generated new VALU is used by a true16 inst, and adds subreg access if necessary. 2. Legalizes the V2S copy by replacing it with subreg_to_reg. An example MIR looks like: ``` %2:sgpr_32 = COPY %1:vgpr_16 %3:sgpr_32 = S_OR_B32 %2:sgpr_32, ... %4:vgpr_16 = V_ADD_F16_t16 %3:sgpr_32, ... ``` currently lowered to ``` %2:vgpr_32 = COPY %1:vgpr_16 %3:vgpr_32 = V_OR_B32 %2:vgpr_32, ... %4:vgpr_16 = V_ADD_F16_t16 %3:vgpr_32, ... ``` after this patch ``` %2:vgpr_32 = SUBREG_TO_REG 0, %1:vgpr_16, lo16 %3:vgpr_32 = V_OR_B32 %2:vgpr_32, ... %4:vgpr_16 = V_ADD_F16_t16 %3.lo16:vgpr_32, ... ```	2025-04-01 12:40:18 -04:00
Petr Hosek	4b19db6db9	Revert "AsmPrinter: Remove ELF's special lowerRelativeReference for unnamed_addr function" (#133935 ) Reverts llvm/llvm-project#132684	2025-04-01 09:39:07 -07:00
Fraser Cormack	00e6d4fe06	[libclc][NFC] Delete three unused .inc files	2025-04-01 17:36:01 +01:00
Ivan Butygin	1f194ff34e	[mlir] Expose `simplifyAffineExpr` through python api (#133926 )	2025-04-01 19:28:53 +03:00
Matt Arsenault	7e25b24073	IRNormalizer: Replace cl::opts with pass parameters (#133874 ) Not sure why the "fold-all" option naming didn't match the variable "FoldPreOutputs", but I've preserved the difference. More annoyingly, the pass name "normalize" does not match the pass name IRNormalizer and should probably be fixed one way or the other. Also the existing test coverage for the flags is lacking. I've added a test that shows they parse, but we should have tests that they do something.	2025-04-01 23:27:20 +07:00
lorenzo chelini	105c8c38dc	[MLIR][NFC] Retire let constructor for EmitC (#133732 ) `let constructor` is legacy (do not use in tree!) since the tableGen backend emits most of the glue logic to build a pass.	2025-04-01 18:22:40 +02:00
Jonathan Thackray	558ce50ebc	[Clang][LLVM] Implement multi-single vectors MOP4{A/S} (#129226 ) Implement all multi-single {BF/F/S/U/SU/US}MOP4{A/S} instructions in clang and llvm following the ACLE in https://github.com/ARM-software/acle/pull/381/files	2025-04-01 17:04:59 +01:00
Arthur Eubanks	e8711436b3	[SCEV] Remove EqCacheSCEV (#133186 ) This was added in https://reviews.llvm.org/D26389 to help with extremely deep SCEV expressions. However, this is wrong since we may cache sub-SCEVs as equivalent when CompareValueComplexity returned 0 only because it hit the max comparison depth. This also improves compile time in some compiles: https://llvm-compile-time-tracker.com/compare.php?from=34fa037c4fd7f38faada5beedc63ad234e904247&to=e241ecf999f4dd42d4b951d4a5d4f8eabeafcff0&stat=instructions:u Similar to #100721. Fixes #130688.	2025-04-01 09:02:33 -07:00
Jeremy Kun	179062b2dc	[mlir][bazel] add alwayslink=True to mlir-runner utils (#133787 ) MacOS platforms using mlir-runner in lit tests consistently hit the following error: ``` # .---command stderr------------ # \| JIT session error: Symbols not found: [ __mlir_ciface_printMemrefI32 ] # \| Error: Failed to materialize symbols: { (main, { __mlir_printMemrefI32, ... }) } # `----------------------------- ``` https://github.com/google/heir/issues/1521#issuecomment-2751303404 confirms the issue is fixed by using `alwayslink` on these two targets, and I confirmed on a separate Apple M1 (OSX version Sequoia 15.3.2.). I'm not an expert on the mlir runner internals, but given the mlir-runner is purely for testing, and alwayslink at worst adds some overhead by not removing symbols, it seems low risk.	2025-04-01 08:58:32 -07:00
Doeke Wartena	a03fce4e20	Update README.md - fixed invalid json in example (#133890 ) A comma (`,`) is required or you get an error.	2025-04-01 08:49:36 -07:00
Aaron Ballman	66b540d861	[C11] Claim conformance to WG14 N1518 (#133749 ) This paper introduced ranges of valid start and continuation characters for identifiers. C23 made further changes to these sets.	2025-04-01 11:45:56 -04:00
Slava Zakharin	58551faaf1	[flang] Inline fir.is_contiguous_box in some cases. (#133812 ) Added inlining for `rank == 1` and `innermost` cases.	2025-04-01 08:41:11 -07:00
lntue	65ad6267e8	[libc] Fix atan2f128 test for aarch64. (#133924 )	2025-04-01 11:37:57 -04:00
Rahul Joshi	a8a33bab69	[NFC][SPIRV] Misc code cleanup in SPIRV Target (#133764 ) - Use static instead of anonymous namespace for file local functions. - Enclose file-local classes in anonymous namespace. - Eliminate `llvm::` qualifier when file has `using namespace llvm`. - Eliminate namespace surrounding entire code in SPIRVConvergenceRegionAnalysis.cpp file. - Eliminate call to `initializeSPIRVStructurizerPass` from the pass constructor (https://github.com/llvm/llvm-project/issues/111767)	2025-04-01 08:35:06 -07:00
David Green	4cb41d136c	[AArch64] Prefer zip over ushll for anyext. (#133433 ) Many CPUs have a higher throughput of ZIP instructions vs USHLL. This adds some tablegen patterns for preferring zip in anyext patterns.	2025-04-01 16:24:54 +01:00
Matt Arsenault	ac55688482	llvm-reduce: Add test for token handling in operands-skip (#133857 ) Seems to work correctly but wasn't tested.	2025-04-01 22:17:44 +07:00
Matt Arsenault	664e847916	llvm-reduce: Fix invalid reduction on tokens in operands-to-args (#133855 )	2025-04-01 22:14:47 +07:00
Zahira Ammarguellat	aa73124e51	Fix complex long double division with -mno-x87. (#133152 ) The combination of `-fcomplex-arithmetic=promoted` and `-mno-x87` for `double` complex division is leading to a crash. See https://godbolt.org/z/189G957oY This patch fixes that.	2025-04-01 11:10:51 -04:00
Slava Zakharin	1ab3a4f234	[flang-rt][NFC] Work around CTK12.8 compilation failure. (#133833 ) It happened in https://lab.llvm.org/buildbot/#/builders/152/builds/1131 when the buildbot was switched from CTK12.3 to CTK12.8. The logs are gone by now, so the above link is useless. The error was: error: ‘auto’ not permitted in template argument This workaround helps, but I also reported the issue to NVCC devs.	2025-04-01 08:04:45 -07:00
Craig Topper	4e6c48f1e7	[RISCV] Merge RegStart with RegEnd in parseRegListCommon. NFC (#133867 ) We only need to keep track of the last register seen. We never need the first register once we've parsed. Currently if s0/x8 is used RegStart will point to that and not ra/s1 so it already isn't the start.	2025-04-01 08:04:32 -07:00


532828 Commits