llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-16 13:06:48 +00:00

Author	SHA1	Message	Date
Michał Górny	3676a86a43	[cmake] Add missing CMakePushCheckState include to FindLibEdit.cmake Add the missing include to fix an error when `cmake_push_check_state()` is called and incidentally the CMakePushCheckState module is not loaded by any other check running prior to `FindLibEdit.cmake`: CMake Error at /var/no-tmpfs/portage/dev-util/lldb-15.0.4/work/cmake/Modules/FindLibEdit.cmake:24 (cmake_push_check_state): Unknown CMake command "cmake_push_check_state". Call Stack (most recent call first): cmake/modules/LLDBConfig.cmake:52 (find_package) cmake/modules/LLDBConfig.cmake:59 (add_optional_dependency) CMakeLists.txt:28 (include) Gentoo Bug: https://bugs.gentoo.org/880065 Differential Revision: https://reviews.llvm.org/D137555	2022-11-07 18:20:19 +01:00
Miguel Saldivar	de36d39e24	[InstCombine] Avoid passing pow attributes to sqrt As described in issue #58475, we could pass the attributes of pow to sqrt and crash. Differential Revision: https://reviews.llvm.org/D137454	2022-11-07 12:07:37 -05:00
Sanjay Patel	b62c81b836	[VectorCombine] add test with non-canonical shuffle mask; NFC D137341	2022-11-07 12:07:37 -05:00
Slava Zakharin	ddb68f36ae	[flang] Initial support for FastMathAttr setup in FirOpBuilder. Provide FirOpBuilder::setFastMathFlags() to configure FastMathFlags for the builder. Set FastMathAttr for operations based on FirOpBuilder configuration via mlir::OpBuilder::Listener. This is a little bit hacky solution, because we lose the ability to hook other listeners to FirOpBuilder. There are also potential issues with OpBuilder::clone() - the hook will be invoked for cloned operations and will effectively overwrite FastMathAttr with the ones configured in FirOpBuilder, which should not be happening. We should teach mlir::OpBuilder about FastMathAttr setup in future. Reviewed By: jeanPerier, kiranchandramohan Differential Revision: https://reviews.llvm.org/D137390	2022-11-07 09:03:46 -08:00
Fangrui Song	90ad3e3c02	[IR] Allow available_externally GlobalAlias GlobalVariable and Function can be available_externally. GlobalAlias is used similarly. Allowing available_externally is a natural extension and helps ThinLTO discard GlobalAlias in a non-prevailing COMDAT (see D135427). For now, available_externally GlobalAlias must point to an available_externally GlobalValue (not ConstantExpr). Differential Revision: https://reviews.llvm.org/D137441	2022-11-07 09:03:23 -08:00
Stella Stamenova	ec224e3b68	Revert "[mlir][sparse] fix sparse tensor rewriting patterns that do not propagate sparse tensor SSA properly." This reverts commit 70508b614e6478ba2c3fc79e935e2c68e2d79b71. This change depends on a reverted change that broke the windows mlir buildbot; reverting to bring remaining mlir bots to green	2022-11-07 09:00:08 -08:00
Matt Arsenault	058f727a98	InstCombine: Add baseline checks for fdiv	2022-11-07 08:57:10 -08:00
Stella Stamenova	a2c4ca50ca	Revert "[mlir][sparse] support Parallel for/reduction." This reverts commit 838389780e56f1a198a94f66ea436359466bf5ed. This broke the windows mlir buildbot: https://lab.llvm.org/buildbot/#/builders/13/builds/27934	2022-11-07 08:48:52 -08:00
Matt Arsenault	7dd27a75a2	InstSimplify: Fold fdiv nnan ninf x, 0 -> poison https://alive2.llvm.org/ce/z/JxX5in	2022-11-07 08:43:22 -08:00
Matt Arsenault	1ce5f93d03	InstSimplify: Add new baseline tests for fdiv	2022-11-07 08:43:22 -08:00
Christopher Bate	708185f03f	[mlir][NVGPU] Add support for structured sparsity MMA variants This change adds a new NVGPU operation that targets the PTX `mma.sp.sync` instruction variants. A lowering to NVVM is provided using inline assembly. Reviewed By: ThomasRaoux, manishucsd Differential Revision: https://reviews.llvm.org/D137202	2022-11-07 09:43:03 -07:00
Nikita Popov	6ebca03021	[Clang] Update test after wasm intrinsics attribute change (NFC) I missed this test in d35fcf0e97e7bb02381506a71e61ec282b292c50.	2022-11-07 17:42:35 +01:00
Alexey Bataev	ecd0b5a532	Revert "[SLP]Redesign vectorization of the gather nodes." This reverts commit 8ddd1ccdf89317be1c40fa9183e214878a56151e to fix buildbots failures reported in https://lab.llvm.org/buildbot#builders/74/builds/14839	2022-11-07 08:35:21 -08:00
bixia1	cf24d49dc8	[mlir][sparse] Add sparse_tensor.sort_coo operator. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D137442	2022-11-07 08:23:51 -08:00
Nikita Popov	d35fcf0e97	[WebAssembly] Use default attributes for intrinsics This switches wasm intrinsics to use default attributes, i.e. nofree, nosync, nocallback and willreturn. Especially willreturn will be required to avoid optimization regressions in the future. The attributes are omitted from the trapping fptoi intrinsics (where I assume trapping is considered well-defined, and as such these aren't willreturn), the throw/rethrow intrinsics (which will unwind) and the atomic intrinsics (which aren't nosync). Differential Revision: https://reviews.llvm.org/D137551	2022-11-07 17:05:36 +01:00
Nikita Popov	a50c269c73	[InstCombine] Handle load smaller than one byte in memset forward APInt::getSplat() requires that the new size is >= the original one. If we're loading less than 8 bits, truncate instead. Fixes https://github.com/llvm/llvm-project/issues/58845.	2022-11-07 17:04:27 +01:00
Mingming Liu	36e8e19337	[NFC][BlockPlacement]Add an option to renumber blocks based on function layout order. Use case: - When block layout is visualized after MBP pass, the basic blocks are labeled in layout order; meanwhile blocks could be numbered in a different order. - As a result, it's hard to map between the graph and pass output. With this option on, the basic blocks are renumbered in function layout order. This option is only useful when a function is to be visualized (i.e., when view options are on) to make it debugging only. Use https://godbolt.org/z/5WTW36bMr as an example: - As MBP pass output (shown in godbolt output window), `func2` is in a basic block numbered `2` (`bb.2`), and `func1` is in a basic block numbered `3` (`bb.3`); `bb.3` is a block with higher block frequency than `bb.2`, and `bb.3` is placed before `bb.2` in the function layout. - Use [1] to get the dot graph (graph uploaded in [2]), the blocks are re-numbered. - `func1` is in 'if.end' block, and labeled `1` in visualized dot; `func2` is in 'if.then' blocks, and labeled `3` --> the labeled number and bb number won't map. - [[ `b5626ae975/llvm/lib/CodeGen/MachineBlockFrequencyInfo.cpp (L127)` \| DOTGraphTraits<MachineBlockFrequencyInfo *>::getNodeLabel ]] is where labeled numbers are based on function layout number, and [[ `a8d93783f3/llvm/include/llvm/Support/GraphWriter.h (L209)` \| called by graph writer ]]. So calling 'MachineFunction::RenumberBlocks' would make labeled number (in dot graph) and block number (in pass output) consistent with each other. [1] `./bin/clang++ -O3 -S -mllvm -view-block-layout-with-bfi=count -mllvm -view-bfi-func-name=_Z9func_loopv -mllvm -print-after=block-placement -mllvm -filter-print-funcs=_Z9func_loopv test.c` [2] {F25201785} Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D137467	2022-11-07 07:52:45 -08:00
David Sherwood	a9d7b18b4a	[AArch64][SVE2] Add the SVE2.1 quadword variants of ld1w/ld1d/st1w/st1d This patch adds the assembly/disassembly for the following instructions: st1w: Contiguous store words from vector (128-bit vector elements) st1d: Contiguous store doublewords from vector (128-bit vector elements) ld1w: Contiguous load unsigned words to vector (128-bit vector elements) ld1d: Contiguous load unsigned doublewords to vector (128-bit vector elements) The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D137245	2022-11-07 15:51:09 +00:00
Matt Devereau	a8c24d57b8	[InstCombine] Remove redundant splats in InstCombineVectorOps Splatting the first vector element of the result of a BinOp, where any of the BinOp's operands are the result of a first vector element splat can be simplified to splatting the first vector element of the result of the BinOp Differential Revision: https://reviews.llvm.org/D135876	2022-11-07 15:39:05 +00:00
bixia1	9b800bf79d	[mlir][sparse] Improve the non-stable sort implementation. Replace the quick sort partition method with one that is more similar to the method used by C++ std quick sort. This improves the runtime for sorting sk_2005.mtx by more than 10x. Reviewed By: aartbik Differential Revision: https://reviews.llvm.org/D137290	2022-11-07 07:38:42 -08:00
David Sherwood	cf69895ab3	[AArch64][SVE2] Add the SVE2.1 BF16 instructions This patch adds the new FEAT_B16B16 feature as well as the assembly/disassembly for all of the B16B16 instructions: bfadd: BFloat16 floating-point add vectors bfsub: BFloat16 floating-point subtract vectors bfmul: BFloat16 floating-point multiply vectors bfclamp: BFloat16 floating-point clamp to minimum/maximum number bfmax: BFloat16 floating-point maximum bfmaxnm: BFloat16 floating-point maximum number bfmin: BFloat16 floating-point minimum bfminnm: BFloat16 floating-point minimum number bfmla: BFloat16 floating-point fused multiply-add vectors bfmls: BFloat16 floating-point fused multiply-subtract vectors The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D137321	2022-11-07 15:29:40 +00:00
Simon Pilgrim	5c0cb75787	[X86] Folded MOVDDUPrm has the same sched behaviour as MOVSHDUPrm/MOVSLDUPrm on Haswell/IceLake There can be a difference for MOVDDUPrr but not the load folded broadcast that is purely on Port23 Fixes an old TODO (inherited from SkylakeServer which was fixed at c7662dc3e52801ec824d8473278fb976107d3e57) Confirmed on Agner + uops.info	2022-11-07 15:17:32 +00:00
Matt Arsenault	0f68ffe1e2	InstCombine: Fold compare with smallest normal if input denormals are flushed Try to simplify comparisons with the smallest normalized value. If denormals will be treated as 0, we can simplify by using an equality comparison with 0. fcmp olt fabs(x), smallest_normalized_number -> fcmp oeq x, 0.0 fcmp ult fabs(x), smallest_normalized_number -> fcmp ueq x, 0.0 fcmp oge fabs(x), smallest_normalized_number -> fcmp one x, 0.0 fcmp uge fabs(x), smallest_normalized_number -> fcmp une x, 0.0 The device libraries have a few range checks that look like this for denormal handling paths.	2022-11-07 07:16:47 -08:00
Matt Arsenault	b9b74fc6e9	InstCombine: Add baseline tests for fcmp and select on denormal range A future change will try to fold (if input denormals are treated as 0) fcmp olt fabs(x), smallest_normalized_number -> fcmp oeq x, 0.0 fcmp ult fabs(x), smallest_normalized_number -> fcmp ueq x, 0.0 fcmp oge fabs(x), smallest_normalized_number -> fcmp one x, 0.0 fcmp uge fabs(x), smallest_normalized_number -> fcmp une x, 0.0	2022-11-07 07:16:47 -08:00
OCHyams	80378a4ca7	[NFC] Move getDebugValueLoc from static in Local.cpp to DebugInfo.h Move getDebugValueLoc so that it can be accessed from DebugInfo.h for the Assignment Tracking patch stack and remove redundant parameter Src. Reviewed By: jryans Differential Revision: https://reviews.llvm.org/D132357	2022-11-07 15:14:43 +00:00
Alexey Bataev	8ddd1ccdf8	[SLP]Redesign vectorization of the gather nodes. Gather nodes are vectorized as simply vector of the scalars instead of relying on the actual node. It leads to the fact that in some cases we may miss incorrect transformation (non-matching set of scalars is just ended as a gather node instead of possible vector/gather node). Better to rely on the actual nodes, it allows to improve stability and better detect missed cases. Differential Revision: https://reviews.llvm.org/D135174	2022-11-07 07:04:38 -08:00
OCHyams	4c44fa1c38	[Assignment Tracking][5.1/*] Add deleteAssignmentMarkers function deleteAssignmentMarkers(const Instruction Inst) does exactly as you'd expect - it deletes any dbg.assign intrinsics linked to Inst. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D133576	2022-11-07 14:56:36 +00:00
David Sherwood	12a6572d41	[AArch64] Add SME2.1 target feature for Armv9-A 2022 Architecture Extension First patch in a series adding MC layer support for SME2.1. This patch adds the following feature: sme2p1 The reference can be found here: https://developer.arm.com/documentation/ddi0602/2022-09 Differential Revision: https://reviews.llvm.org/D137410	2022-11-07 14:38:28 +00:00
Nikita Popov	9a45e4beed	[MemCpyOpt] Move lifetime marker before call to enable call slot optimization Currently call slot optimization may be prevented because the lifetime markers for the destination only start after the call. In this case, rather than aborting the transform, we should move the lifetime.start before the call to enable the transform. Differential Revision: https://reviews.llvm.org/D135886	2022-11-07 15:26:00 +01:00
Oleg Shyshkov	bada35390a	[mlir][NFC] Remove unnecessary attr name getters from StructuredOpsUtils.h. Those methods were added long time ago. Now we get the same methods generated by tablegen, so there is no need for duplicates. Differential Revision: https://reviews.llvm.org/D137544	2022-11-07 14:40:56 +01:00
Daniel Grumberg	39dbfa72aa	Revert "Only add targetFallback if target is not in defined in current product" This was an accidental addition of a non-reviewed change. This reverts commit f63db9159bbbb0db98e13cb4440fdaa5c40e219b.	2022-11-07 13:33:59 +00:00
Daniel Grumberg	f63db9159b	Only add targetFallback if target is not in defined in current product	2022-11-07 13:12:34 +00:00
Daniel Grumberg	671709f0e7	[clang][ExtractAPI] Add targetFallback to relationships in symbol graph Adds a 'targetFallback' field to relationships in symbol graph that contains the plain name of the relationship target. This is useful for clients when the relationship target symbol is not available. Differential Revision: https://reviews.llvm.org/D136455	2022-11-07 13:12:34 +00:00
Dmitry Preobrazhensky	8f68952183	[AMDGPU][MC][GFX11][NFC] Correct VINTERP src operands Differential Revision: https://reviews.llvm.org/D137238	2022-11-07 15:52:55 +03:00
Dmitry Preobrazhensky	6e279f5bb6	[AMDGPU][MC][GFX10+] Enable literal operands with permlane16/permlanex16 Differential Revision: https://reviews.llvm.org/D137332	2022-11-07 15:49:21 +03:00
OCHyams	028df7fab1	Fix warning: comparison of integers of different signs Buildbot failure: https://lab.llvm.org/buildbot/#/builders/36/builds/26925 Review & commit: https://reviews.llvm.org/D132224 https://reviews.llvm.org/rG171f7024cc82e8702abebdedb699d37b50574be7	2022-11-07 12:35:51 +00:00
Simon Pilgrim	4aabbc0c85	[X86] Flatten WriteShift/Rotate SchedRW defs Some "inner" defs were overriding "outer" SchedRW defs, making it very tricky to track what schedule was being used. Noticed as I'm trying to remove a lot of unnecessary shift/rotate RMW overrides from the scheduler models	2022-11-07 12:29:43 +00:00
OCHyams	171f7024cc	[Assignment Tracking][5/*] Add core infrastructure for instruction reference The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Overview It's possible to find intrinsics linked to an instruction by looking at the MetadataAsValue uses of the attached DIAssignID. That covers instruction -> intrinsic(s) lookup. Add a global DIAssignID -> instruction(s) map which gives us the ability to perform intrinsic -> instruction(s) lookup. Add plumbing to keep the map up to date through optimisations and add utility functions including two that perform those lookups. Finally, add a unittest. Details In llvm/lib/IR/LLVMContextImpl.h add AssignmentIDToInstrs which maps DIAssignID attachments to Instructions. Because the DIAssignID is the key we can't use a TrackingMDNodeRef for it, and therefore cannot easily update the mapping when a temporary DIAssignID is replaced. Temporary DIAssignID's are only used in IR parsing to deal with metadata forward references. Update llvm/lib/AsmParser/LLParser.cpp to avoid using temporary DIAssignID's for attachments. In llvm/lib/IR/Metadata.cpp add Instruction::updateDIAssignIDMapping which is called to remove or add an entry (or both) to AssignmentIDToInstrs. Call this from Instruction::setMetadata and add a call to setMetadata in Instruction's dtor that explicitly unsets the DIAssignID so that the mapping gets updated. In llvm/lib/IR/DebugInfo.cpp and DebugInfo.h add utility functions: getAssignmentInsts(const DbgAssignIntrinsic DAI) getAssignmentMarkers(const Instruction Inst) RAUW(DIAssignID Old, DIAssignID New) deleteAll(Function *F) These core utils are tested in llvm/unittests/IR/DebugInfoTest.cpp. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D132224	2022-11-07 12:03:02 +00:00
Christian Kandeler	2bf960aef0	[clangd] Add "usedAsMutablePointer" highlighting modifier Counterpart to "usedAsMutableReference". Just as for references, there are const and non-const pointer parameters, and it's valuable to be able to have different highlighting for the two cases at the call site. We could have re-used the existing modifier, but having a dedicated one maximizes client flexibility. Reviewed By: nridge Differential Revision: https://reviews.llvm.org/D130015	2022-11-07 11:58:33 +01:00
OCHyams	c37f29c49e	[Assignment Tracking][4/*] Add llvm.dbg.assign intrinsic boilerplate The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Add the llvm.dbg.assign intrinsic boilerplate. This updates the textual-bitcode roundtrip test to also check that round-tripping with the intrinsic works. The intrinsic marks the position of a source level assignment. The llvm.dbg.assign interface looks like this (each parameter is wrapped in MetadataAsValue, and Value type parameters are first wrapped in ValueAsMetadata): void @llvm.dbg.assign(Value Value, DIExpression ValueExpression, DILocalVariable Variable, DIAssignID ID, Value Address, DIExpression AddressExpression) The first three parameters look and behave like an llvm.dbg.value. ID is a reference to a store. The intrinsic is "linked to" instructions in the same function that use the same ID as an attachment. That is mostly conceptual at this point; the two-way link infrastructure will come in another patch. Address is the destination address of the store and it is modified by AddressExpression. LLVM currently encodes variable fragment information in DIExpressions, so as an implementation quirk the FragmentInfo for Variable is contained within ValueExpression only. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D132223	2022-11-07 10:09:22 +00:00
David Green	b46427b9a2	[InstSimplify] (~A & B) \| ~(A \| B) --> ~A with logical and According to https://alive2.llvm.org/ce/z/opsdrb, it is valid to convert (~A & B) \| ~(A \| B) --> ~A even if the And is a Logical And. This came up from the vector masking of predicated blocks. Differential Revision: https://reviews.llvm.org/D137435	2022-11-07 10:03:18 +00:00
Thomas Preud'homme	c8be35293c	[SWP] Recognize mem carried dep with different base The loop-carried dependency detection logic in isLoopCarriedDep relies on the load and store using the same definition for the base register. This misses the case of post-increment loads and stores whose base register are different PHI initialized from the same initial value. This commit extends the logic to accept the load and store having different PHI base address provided that they had the same initial value when entering the loop and are incremented by the same amount in each loop. Reviewed By: bcahoon Differential Revision: https://reviews.llvm.org/D136463	2022-11-07 09:53:41 +00:00
Chen Zheng	eb421c0c0e	[PowerPC][NFC] fix the LIT regressions This is to fix the wrong checking introduced in D64195. `std {{[0-9]+}}, 16(1)` is the store for the lr register. It breaks previous testing point before D64195.	2022-11-07 04:17:14 -05:00
chenglin.bi	83255c4a62	Recommit [AArch64] Improve codegen for shifted mask op The original change compares `APInt` to check whether the constant is the same or not. But shift amount may have different constant types. So, this patch changes to use `getZExtValue` to compare constant value. Original comment: The special case for bit extraction pattern is `((x >> C) & mask) << C`. It can be combined to `x & (mask << C)` by return true in isDesirableToCommuteWithShift. Fix: #56427 Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D136014	2022-11-07 17:16:35 +08:00
OCHyams	a2620e00ff	[Assignment Tracking][3/*] Add DIAssignID metadata boilerplate The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir Add the DIAssignID metadata attachment boilerplate. Includes a textual-bitcode roundtrip test and tests that the verifier and parser catch badly formed IR. This piece of metadata links together stores (used as an attachment) and the yet-to-be-added llvm.dbg.assign debug intrinsic (used as an operand). Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D132222	2022-11-07 09:05:56 +00:00
Phoebe Wang	d9176563dc	[X86] Add missing `IntrArgMemOnly` for intrinsics Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D137406	2022-11-07 17:04:37 +08:00
David Green	9e885d9aab	[InstSimplify] Add tests for (~A & B) \| ~(A \| B) --> ~A with logical And. NFC	2022-11-07 09:04:06 +00:00
Timm Bäder	5dfacb1245	[clang][Interp][NFC] Replace dyn_cast_or_null with _if_present ... in Descriptor.h	2022-11-07 09:42:41 +01:00
Timm Bäder	5bd6bd1227	[clang][Interp][NFC] Simplify visitReturnStmt()	2022-11-07 09:42:41 +01:00
Timm Bäder	6b3e5c595b	[clang][Interp][NFC] Remove unused function	2022-11-07 09:42:41 +01:00

441129 Commits