llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 18:56:36 +00:00

Author	SHA1	Message	Date
Simon Pilgrim	52f3cad9ff	[X86] getFauxShuffleMask - move INSERT_SUBVECTOR(SRC0, EXTRACT_SUBVECTOR(SRC1)) matching behind common one use bitcast checks (#134227 ) No need to ignore one use checks for the INSERT_SUBVECTOR(SRC0, EXTRACT_SUBVECTOR(SRC1)) fold Noticed while working on the #133947 regressions	2025-04-03 13:17:14 +01:00
Paul Walker	41a6bb4c05	[LLVM][CodeGen][SVE] Prefer NEON instructions when zeroing Z registers. (#133929 ) Several implementations have zero-latency instructions to zero registers. To-date no implementation has a dedicated SVE instruction but we can use the NEON equivalent because it is defined to zero bits 128..VL regardless of the immediate used. NOTE: The relevant instruction is not available in streaming mode, where the original SVE DUP instruction remains in use.	2025-04-03 13:15:05 +01:00
Ilya Biryukov	722346c7bc	[Tooling] Handle AttributedType in getFullyQualifiedType (#134228 ) Before this change the code used to add extra qualifiers, e.g. `std::unique_ptr<int> _Nonnull` became `::std::std::unique_ptr<int> _Nonnull` when adding a global namespace qualifier was requested.	2025-04-03 14:14:34 +02:00
Michael Buch	739fe98080	[lldb][test] TestExprFromNonZeroFrame.py: fix windows build On Windows this test was failing to link with following error: ``` make: Entering directory 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/commands/expression/expr-from-non-zero-frame/TestExprFromNonZeroFrame.test' C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe -gdwarf -O0 -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/../../../../..//include -IC:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb/include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\commands\expression\expr-from-non-zero-frame -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make -include C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/test_common.h -fno-limit-debug-info -MT main.o -MD -MP -MF main.d -c -o main.o C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\commands\expression\expr-from-non-zero-frame/main.c C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\bin\clang.exe main.o -gdwarf -O0 -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/../../../../..//include -IC:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/tools/lldb/include -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\test\API\commands\expression\expr-from-non-zero-frame -IC:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make -include C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\llvm-project\lldb\packages\Python\lldbsuite\test\make/test_common.h -fno-limit-debug-info -fuse-ld=lld --driver-mode=g++ -o "a.out" lld-link: error: undefined symbol: printf >>> referenced by main.o:(func) clang: error: linker command failed with exit code 1 (use -v to see 
invocation) make: *** [Makefile.rules:530: a.out] Error 1 make: Leaving directory 'C:/Users/tcwg/llvm-worker/lldb-aarch64-windows/build/lldb-test-build.noindex/commands/expression/expr-from-non-zero-frame/TestExprFromNonZeroFrame.test' ```	2025-04-03 12:21:46 +01:00
Aaron Ballman	7febd78f1e	No longer diagnose __auto_type as the auto extension (#134129 ) Given: __auto_type x = 12; decltype(auto) y = 12; -Wc++98-compat would diagnose both x and y with: 'auto' type specifier is incompatible with C++98 This patch silences the diagnostic in those cases. decltype(auto) is still diagnosed with: 'decltype(auto)' type specifier is incompatible with C++ standards before C++14 as expected but no longer produces the extraneous diagnostic about use of 'auto'. Fixes #47900	2025-04-03 07:13:30 -04:00
Paul Walker	ee4e8197fa	[LLVM][AArch64][SVE] Mark DUP immediate instructions with isAsCheapAsAMove. (#133945 ) Doing this means we'll regenerate an immediate rather than copy the result of an existing one, reducing instruction dependency chains.	2025-04-03 11:42:07 +01:00
Simi Pallipurath	cb0d1305d1	[Clang][ARM] Ensure both -mno-unaligned-access and -munaligned-access are passed to multilib selection logic (#134099 ) Previously, the alignment option was passed to the multilib selection logic only when -mno-unaligned-access was explicitly specified on the command line. This change ensures that both -mno-unaligned-access and -munaligned-access are passed to the multilib selection logic, which now also considers the target architecture when determining the alignment access policy.	2025-04-03 11:16:05 +01:00
David Green	6c27817294	[SelectionDAG] Use SimplifyDemandedBits from SimplifyDemandedVectorElts Bitcast. (#133717 ) This adds a call to SimplifyDemandedBits from bitcasts with scalar input types in SimplifyDemandedVectorElts, which can help simplify the input scalar.	2025-04-03 11:14:08 +01:00
Michael Buch	554f4d1a57	[lldb][Target] RunThreadPlan to save/restore the ExecutionContext's frame if one exists (#134097 ) When using `SBFrame::EvaluateExpression` on a frame that's not the currently selected frame, we would sometimes run into errors such as: ``` error: error: The context has changed before we could JIT the expression! error: errored out in DoExecute, couldn't PrepareToExecuteJITExpression ``` During expression parsing, we call `RunStaticInitializers`. On our internal fork this happens quite frequently because any usage of, e.g., function pointers, will inject ptrauth fixup code into the expression. The static initializers are run using `RunThreadPlan`. The `ExecutionContext::m_frame_sp` going into the `RunThreadPlan` is the `SBFrame` that we called `EvaluateExpression` on. LLDB then tries to save this frame to restore it after the thread-plan ran (the restore occurs by unconditionally overwriting whatever is in `ExecutionContext::m_frame_sp`). However, if the `selected_frame_sp` is not the same as the `SBFrame`, then `RunThreadPlan` would set the `ExecutionContext`'s frame to a different frame than what we started with. When we `PrepareToExecuteJITExpression`, LLDB checks whether the `ExecutionContext` frame changed from when we initially called `EvaluateExpression`, and if it did, bails out with the error above. One such test-case is attached. This currently passes regardless of the fix because our ptrauth static initializers code isn't upstream yet. But the plan is to upstream it soon. This patch addresses the issue by saving/restoring the frame of the incoming `ExecutionContext`, if such a frame exists. Otherwise, fall back to using the selected frame. rdar://147456589	2025-04-03 11:10:16 +01:00
Yingwei Zheng	61907ebd76	[Clang][CodeGen] Do not use the GEP result to infer offset and result type (#134221 ) If `CreateConstInBoundsGEP2_32` returns a constant null/gep, the cast to GetElementPtrInst will fail. This patch uses two static helpers `GEPOperator::accumulateConstantOffset/GetElementPtrInst::getIndexedType` to infer offset and result type instead of depending on the GEP result. This patch is extracted from https://github.com/llvm/llvm-project/pull/130734.	2025-04-03 18:03:42 +08:00
Camsyn	ecc35456d7	[Utils] Fix incorrect LCSSA PHI nodes when splitting critical edges with MergeIdenticalEdges (#131744 ) This PR fixes incorrect LCSSA PHI node generation when splitting critical edges with both `PreserveLCSSA` and `MergeIdenticalEdges` enabled. The bug caused PHI nodes in the split block to miss predecessors when multiple identical edges were merged.	2025-04-03 12:02:03 +02:00
Simon Pilgrim	bf516098fb	[X86] SimplifyDemandedVectorEltsForTargetNode - reduce the size of VPERMV/VPERMV3 nodes if the upper elements are not demanded (#133923 ) With AVX512VL targets, use 128/256-bit VPERMV/VPERMV3 nodes when we only need the lower elements.	2025-04-03 11:01:08 +01:00
Hsiangkai Wang	2e7ed78cff	[mlir][spirv] Add instruction OpGroupNonUniformRotateKHR (#133428 ) Add an instruction under the extension SPV_KHR_subgroup_rotate. The specification for the extension is here: https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_subgroup_rotate.html	2025-04-03 11:00:29 +01:00
Pavel Labath	662d385c7b	[lldb/telemetry] Report exit status only once (#134078 ) SetExitStatus can be called the second time when we reap the debug server process. This shouldn't be interesting as at that point, we've already told everyone that the process has exited. I believe/hope this will also help with sporadic shutdown crashes that have cropped up recently. They happen because the debug server is monitored from a detached thread, so this code can be called after main returns (and starts destroying everything). This isn't a real fix for that though, as the situation can still happen (it's just that it usually happens after the exit status has already been set). I think the real fix for that is to make sure these threads terminate before we start shutting everything down.	2025-04-03 11:59:02 +02:00
Vladislav Dzhidzhoev	094904303d	Revert "[lldb] Return const UnwindPlan pointers from FuncUnwinders (#133247 )" This reverts commit d7afafdbc464e65c56a0a1d77bad426aa7538306. Caused remote Linux to Linux buildbot failure https://lab.llvm.org/buildbot/#/builders/195/builds/7046.	2025-04-03 11:33:11 +02:00
Jack Frankland	6f324bd39b	[mlir][tosa] Remove Convolution Type Verifiers (#134077 ) Remove the test in the convolution verifier that checks that the input and output element types of convolution operations conform to the constraints imposed by the TOSA 1.0 specification. These checks are too strict for users of the TOSA dialect who wish to allow more types than those allowed by the spec, and they cause compatibility issues with earlier TOSA implementations, which allowed more type combinations. Users who do wish to constrain the convolution type combinations to only those allowed by the TOSA 1.0 spec should run the TOSA validation pass, which already performs these checks. Signed-off-by: Jack Frankland <jack.frankland@arm.com>	2025-04-03 10:30:10 +01:00
Simon Pilgrim	6ec66a2292	[X86] Move VPERMV3(X,M,Y) -> VPERMV(M,CONCAT(X,Y)) fold after general VPERMV3 canonicalization Pulled out of #133923 - this prevents regressions with SimplifyDemandedVectorEltsForTargetNode exposing VPERMV3(X,M,X) repeated operand patterns which were getting concatenated to wider VPERMV nodes before simpler canonicalizations could clean them up.	2025-04-03 10:24:02 +01:00
Romaric Jodin	7baa7edc00	[libclc]: clspv: add a dummy implementation for mul_hi (#134094 ) clspv uses a better implementation that does not use a bigger size when one is not available. Add a dummy implementation for mul_hi to avoid overriding the clspv implementation with the one in libclc.	2025-04-03 10:18:39 +01:00
Simon Pilgrim	edc22c64e5	[X86] getFauxShuffleMask - only handle VTRUNC nodes with matching src/dst sizes (#134161 ) Cleanup work for #133947 - we need to handle VTRUNC nodes with large source vectors directly to allow us to widen the size of the shuffle combine We currently discard these results in combineX86ShufflesRecursively anyhow as we don't allow inputs from getTargetShuffleInputs to be larger than the shuffle value type	2025-04-03 09:42:27 +01:00
Carlos Galvez	6333fa5160	[clang-tidy] Fix broken HeaderFilterRegex when read from config file (#133582 ) PR https://github.com/llvm/llvm-project/pull/91400 broke the usage of HeaderFilterRegex via config file, because it is now created at a different point in the execution and leads to a different value. The result of that is that using HeaderFilterRegex only in the config file does NOT work, in other words clang-tidy stops triggering warnings on header files, thereby losing a lot of coverage. This patch reverts the logic so that the header filter is created upon calling the getHeaderFilter() function. Additionally, this patch adds 2 unit tests to prevent regressions in the future: - One of them, "simple", tests the most basic use case with a single top-level .clang-tidy file. - The second one, "inheritance", demonstrates that the subfolder only gets warnings from headers within it, and not from parent headers. Fixes #118009 Fixes #121969 Fixes #133453 Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>	2025-04-03 09:28:34 +02:00
Dmitry Polukhin	e1aaee7ea2	[modules] Handle friend function that was a definition but became only a declaration during AST deserialization (#132214 ) Fix for regression #130917, changes in #111992 were too broad. This change reduces scope of previous fix. Added `ExternalASTSource::wasThisDeclarationADefinition` to detect cases when FunctionDecl lost body due to declaration merges.	2025-04-03 08:27:13 +01:00
Yingwei Zheng	73e1710a4d	[SimplifyCFG] Remove unused variable. NFC. (#134211 )	2025-04-03 15:22:51 +08:00
Juan Manuel Martinez Caamaño	041e84261a	[Clang][AMDGPU] Expose buffer load lds as a clang builtin (#132048 ) CK is using either inline assembly or inline LLVM-IR builtins to generate buffer_load_dword lds instructions. This patch exposes this instruction as a Clang builtin available on gfx9 and gfx10. Related to SWDEV-519702 and SWDEV-518861	2025-04-03 09:22:38 +02:00
Ryotaro Kasuga	91f3965be4	[LoopInterchange] Fix the vectorizable check for a loop (#133667 ) In the profitability check for vectorization, the dependency matrix was not handled correctly. This can lead to a wrong decision: it may say "this loop can be vectorized" when in fact it cannot. The root cause is that the check returns early when it finds '=' or 'I' in the dependency matrix. To make sure that we can actually vectorize the loop, we need to check all the rows of the matrix. This patch fixes the process of checking whether we can vectorize the loop or not. Now it won't make a wrong decision for a loop that cannot be vectorized. Related: #131130	2025-04-03 16:21:19 +09:00
Yingwei Zheng	b6c0ce0bb6	[IR][NFC] Use `SwitchInst::defaultDestUnreachable` (#134199 )	2025-04-03 14:47:47 +08:00
Iris	3295970d84	[ConstantFolding] Add support for `sinh` and `cosh` intrinsics in constant folding (#132671 ) Closes #132503.	2025-04-03 08:34:09 +02:00
Hua Tian	7e65944292	[llvm][CodeGen] avoid repeated interval calculation in window scheduler (#132352 ) Some new registers are reused when replacing some old ones in certain use cases of ModuloScheduleExpander. It is necessary to avoid repeated interval calculations for these registers.	2025-04-03 14:25:55 +08:00
Nikita Popov	b384d6d6cc	[CodeGen] Don't include CGDebugInfo.h in CodeGenFunction.h (NFC) (#134100 ) This is an expensive header, only include it where needed. Move some functions out of line to achieve that. This reduces time to build clang by ~0.5% in terms of instructions retired.	2025-04-03 08:04:19 +02:00
Jason Molenda	a19c018379	Revert "[lldb][debugserver] Save and restore the SVE/SME register state (#134184 )" This reverts commit 4e40c7c4bd66d98f529a807dbf410dc46444f4ca. arm64 CI is getting a failure in lldb-api.tools/lldb-server.TestGdbRemoteRegisterState.py with this commit, need to investigate and re-land.	2025-04-02 23:01:51 -07:00
Snehasish Kumar	7f2abe8fd1	Revert "[Metadata] Preserve MD_prof when merging instructions when one is missing." (#134200 ) Reverts llvm/llvm-project#132433 I suspect this change caused a failure in the bolt build bot. https://lab.llvm.org/buildbot/#/builders/113/builds/6621 ``` !9185 = !{!"branch_weights", i32 3912, i32 802} Wrong number of operands !9185 = !{!"branch_weights", i32 3912, i32 802} fatal error: error in backend: Broken module found, compilation aborted! ```	2025-04-02 22:11:17 -07:00
Craig Topper	f404826842	[RISCV] Don't allow '-' after 'ra' in Zcmp/Xqccmp register list. (#134182 ) Move the parsing of '-' under the check that we parsed a comma. Unfortunately, this leads to a poor error, but I still have more known issues in this code and may end up with an overall restructuring and want to think about wording.	2025-04-02 21:51:31 -07:00
Craig Topper	3ea7902494	[RISCV] Check S0 register list check for qc.cm.pushfp to after we parsed the whole register list. (#134180 ) This is more of a semantic check. The diagnostic location has been changed to point at the register list start instead of the closing brace, or whatever character might be there instead of a brace if it's malformed.	2025-04-02 21:48:48 -07:00
Sam Elliott	4998273189	Reland [RISCV] Add Xqci Insn Formats (#134134 ) This adds the following instruction formats from the Xqci Spec: - QC.EAI - QC.EI - QC.EB - QC.EJ - QC.ES The update to the THead test is because the largest number of operands for a valid instruction has been bumped by this change. This reverts commit 68fb7a5a1d203dde7badf67031bdd9eb650eef5d. This relands commit 0cfabd37df9940346f3bf8a4d74c19e1f48a00e9.	2025-04-02 21:37:44 -07:00
Jacob Lalonde	b8d8405238	[LLDB] Expose checking if the symbol file exists/is loaded via SBModule (#134163 ) The motivation for this patch is that in Statistics.cpp we [check to see if the module symfile is loaded](`990a086d9d/lldb/source/Target/Statistics.cpp (L353C60-L353C75)`) to calculate how much debug info has been loaded. I have an external utility that only wants to look at the loaded debug info, which isn't exposed by the SBAPI.	2025-04-02 21:27:44 -07:00
Reid Kleckner	e3c0565b74	Reapply "[cmake] Refactor clang unittest cmake" (#134195 ) This reapplies 5ffd9bdb50b57 (#133545) with fixes. The BUILD_SHARED_LIBS=ON build was fixed by adding missing LLVM dependencies to the InterpTests binary in unittests/AST/ByteCode/CMakeLists.txt .	2025-04-02 21:07:30 -07:00
Matt Arsenault	3140d51cf3	llvm-reduce: Remove unsupported from bitcode uselistorder test (#134185 ) This was disabled due to flakiness but I'm currently unable to reproduce. I'm nervous the original issue still exists. However, I downgraded the tripped assert in 8c18c25b1b22ea710edb40a4f167a6a8bfe6ff9d to a warning since the same assert can trigger for illegitimate reasons. Fixes #64157	2025-04-03 11:04:02 +07:00
Jason Molenda	4e40c7c4bd	[lldb][debugserver] Save and restore the SVE/SME register state (#134184 ) debugserver isn't saving and restoring the SVE/SME register state around inferior function calls. Making arbitrary function calls while in Streaming SVE mode is generally a poor idea because a NEON instruction can be hit and crash the expression execution, which is how I missed this, but they should be handled correctly if the user knows it is safe to do. rdar://146886210	2025-04-02 20:37:07 -07:00
LU-JOHN	6a46c6c865	Ensure KnownBits passed when calculating from range md has right size (#132985 ) KnownBits passed to computeKnownBitsFromRangeMetadata must have the same bit width as the range metadata bit width. Otherwise the calculated results will be incorrect. --------- Signed-off-by: John Lu <John.Lu@amd.com>	2025-04-03 10:17:14 +07:00
Younan Zhang	dcc2182bce	[Clang] Fix a lambda pattern comparison mismatch after ecc7e6ce4 (#133863 ) In ecc7e6ce4, we tried to inspect the `LambdaScopeInfo` on stack to recover the instantiating lambda captures. However, there was a mismatch in how we compared the pattern declarations of lambdas: the constraint instantiation used a tailored `getPatternFunctionDecl()` which is localized in SemaLambda that finds the very primal template declaration of a lambda, while `FunctionDecl::getTemplateInstantiationPattern` finds the latest template pattern of a lambda. This difference causes issues when lambdas are nested, as we always want the primary template declaration. This corrects that by moving `Sema::addInstantiatedCapturesToScope` from SemaConcept to SemaLambda, allowing it to use the localized version of `getPatternFunctionDecl`. It is also worth exploring to coalesce the implementation of `getPatternFunctionDecl` with `FunctionDecl::getTemplateInstantiationPattern`. But I’m leaving that for the future, as I’d like to backport this fix (ecc7e6ce4 made the issue more visible in clang 20, sorry!), and changing Sema’s ABI would not be suitable in that regards. Hence, no release note. Fixes https://github.com/llvm/llvm-project/issues/133719	2025-04-03 11:15:42 +08:00
Pengcheng Wang	4986a79648	[TableGen] Emit `llvm::is_contained` for `CheckOpcode` predicate (#134057 ) When the list is large, using `llvm::is_contained` is of higher performance than a sequence of comparisons. When the list is small, the `llvm::is_contained` can be inlined and unrolled, which has the same effect as using a sequence of comparisons. And the generated code is more readable.	2025-04-03 11:11:36 +08:00
Owen Pan	4fe0d74275	[clang-format] Fix a bug in annotating braces (#134039 ) Fix #133873	2025-04-02 20:08:56 -07:00
Jerry-Ge	94dbe5e405	[mlir][tosa] Remove extra whitespace in the PadOp example (#134113 ) Trivial cleanup change. Signed-off-by: Jerry Ge <jerry.ge@arm.com>	2025-04-02 19:40:54 -07:00
Jonas Devlieghere	18c43d01fc	[lldb-dap] Add a -v/--version command line argument (#134114 ) Add a -v/--version command line argument to print the version of both the lldb-dap binary and the liblldb it's linked against. This is motivated by me trying to figure out which lldb-dap I had in my PATH.	2025-04-02 18:40:37 -07:00
tangaac	ff0c2fbd8e	[LoongArch] Pre-commit tests for vector absolute difference (#132898 )	2025-04-03 09:19:59 +08:00
Mircea Trofin	d59b2c4def	[ctxprof][nfc] Make `computeImportForFunction` a member of `ModuleImportsManager` (#134011 )	2025-04-02 18:18:17 -07:00
Mircea Trofin	02467f9e21	[ctxprof] Option to move a whole tree to its own module (#133992 ) Modules may contain a mix of functions that participate or don't participate in callgraphs covered by a contextual profile. We currently have been importing all the functions under a context root in the module defining that root, but if the other functions there are covered by flat profiles, the result is difficult to reason about. This patch allows moving everything under a context root (and that root) in its own module. For now, we expect a module with a filename matching the GUID of the function be present in the set of modules known by the linker. This mechanism can be improved in a later patch. Subsequent patches will handle implementing "move" instead of "import" semantics for the root function (because we want to make sure only one version of the root exists - so the optimizations we perform are actually the ones being observed at runtime).	2025-04-02 18:15:48 -07:00
Ankur Ahir	fb7135ec52	[Clang] fixed clang frontend crash with friend class declaration and overload == (#133878 )	2025-04-03 09:11:27 +08:00
Jon Roelofs	749c20b3e0	[LIT] Add a test for lit.Test.toMetricValue. NFC	2025-04-02 17:35:14 -07:00
Joseph Huber	e5809f0172	[LLVM] Only build the GPU loader utility if it has LLVM-libc (#134141 ) Summary: There were some discussions about this being included by default. I need to fix this up and codify the use of LLVM libc inside of LLVM. For now, just turn it off unless the user requested the `libc` GPU stuff. This matches the old behavior.	2025-04-02 19:26:19 -05:00
David Peixotto	b55bab2292	[lldb] Fix plugin manager test failure on windows (#134173 ) This is an attempt to fix a test failure from #133794 when running on windows builds. I suspect we are running into a case where the [ICF](https://learn.microsoft.com/en-us/cpp/build/reference/opt-optimizations?view=msvc-170) optimization kicks in and combines the CreateSystemRuntimePlugin* functions into a single address. This means that we cannot uniquely unregister the plugin based on its create function address. The fix is to have each create function return a different (bogus) value.	2025-04-02 17:22:46 -07:00
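
The LoopInterchange entry above (91f3965be4) describes a check that stopped at the first '=' or 'I' entry instead of inspecting every row of the dependency matrix. The following is a minimal Python sketch of that difference, not the LLVM code itself. The function names, the string-per-row matrix layout, and the shape of the early-return variant are assumptions reconstructed from the commit message.

```python
from typing import List

# Hypothetical model: each row of the dependency matrix is a string of
# direction characters, one per loop level (e.g. "=<" for a 2-deep nest).
SAFE_DIRECTIONS = ("=", "I")


def can_vectorize_early_return(dep_matrix: List[str], loop_id: int) -> bool:
    """Assumed shape of the faulty check: answer 'yes' on the first safe entry."""
    for row in dep_matrix:
        if row[loop_id] in SAFE_DIRECTIONS:
            return True  # returns before the remaining rows are inspected
    return False


def can_vectorize_all_rows(dep_matrix: List[str], loop_id: int) -> bool:
    """Check described by the fix: every row must be safe for this loop."""
    return all(row[loop_id] in SAFE_DIRECTIONS for row in dep_matrix)


if __name__ == "__main__":
    matrix = ["=<", "<="]  # second row carries a '<' dependence on loop 0
    print(can_vectorize_early_return(matrix, 0))
    print(can_vectorize_all_rows(matrix, 0))
```

With the matrix in the usage lines, the early-return variant accepts loop 0 because the first row is '=', while the all-rows variant rejects it because the second row is '<'.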

532995 Commits
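
The KnownBits entry in the list (6a46c6c865) says the bit width passed in must match the bit width of the range metadata. As an illustration only, the Python sketch below derives known-zero and known-one masks from an unsigned half-open range by taking the common leading bits of its minimum and maximum. The function name and the restriction to non-wrapping ranges are assumptions, and this is not the LLVM implementation. It shows that the same range produces different masks at different widths, which is why a mismatched width is unsafe.

```python
from typing import Tuple


def known_bits_from_range(lo: int, hi: int, width: int) -> Tuple[int, int]:
    """Return (known_zero, known_one) masks for values in [lo, hi).

    Hypothetical helper: only non-wrapping unsigned ranges are handled.
    """
    if not (0 <= lo < hi <= (1 << width)):
        raise ValueError("range does not fit the given bit width")
    umin, umax = lo, hi - 1
    full = (1 << width) - 1
    # Bits above the highest differing bit are shared by every value
    # in the range, so they are known.
    differing = (umin ^ umax).bit_length()
    prefix_mask = full & ~((1 << differing) - 1)
    known_one = umax & prefix_mask
    known_zero = ~umax & prefix_mask & full
    return known_zero, known_one


if __name__ == "__main__":
    # The same range [0, 16) at two widths: the known-zero masks differ.
    print([hex(m) for m in known_bits_from_range(0, 16, 8)])
    print([hex(m) for m in known_bits_from_range(0, 16, 32)])
```

At width 8 the range [0, 16) yields a known-zero mask of 0xf0, while at width 32 it yields 0xfffffff0, so masks computed at one width cannot simply be reused at another.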