llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-01 10:46:07 +00:00

Author	SHA1	Message	Date
Florian Hahn	2e6430666c	[LV] Update recipe builder functions to pass VPlan directly (NFC). Passing VPlanPtr requires a dereference of std::unique_ptr on each access, which is unnecessary. Just pass the plan by reference.	2023-02-12 22:35:14 +00:00
Kazu Hirata	15cb5ebed7	[Support] Use llvm::popcount (NFC) This should fix builds on Windows.	2023-02-12 13:39:18 -08:00
Lang Hames	be2fc577c3	[ORC] Add MachOPlatform::Create overload -- Pass ORC runtime as def generator. The existing Create method took a path to the ORC runtime and created a StaticLibraryDefinitionGenerator for it. The new overload takes a std::unique_ptr<DefinitionGenerator> directly instead. This provides more flexibility when constructing MachOPlatforms. E.g. The runtime archive can be embedded in a special section in the ORC controller executable or library, rather than being on-disk.	2023-02-12 13:30:37 -08:00
Simon Pilgrim	19c1682b6a	[X86] combineConcatVectorOps - concatenate 512-bit VPERMILPS nodes.	2023-02-12 18:26:28 +00:00
Simon Pilgrim	faf5616e11	BlockFrequencyInfoImpl.cpp - add missing closing namespace comment. NFC Fixes clang-tidy llvm-namespace-comment warning	2023-02-12 16:42:28 +00:00
Simon Pilgrim	1bb95a3a99	[X86] combinePredicateReduction - attempt to fold subvector all_of(icmp_eq()) / any_of(icmp_ne()) to integers Noticed while working on Issue #59867 and Issue #53419 - there's still more to do here, but for "all vector" comparisons, we should try to cast to a scalar integer for sub-128bit types	2023-02-12 15:23:47 +00:00
Simon Pilgrim	738370ae0e	DemandedBits.cpp - use auto* when initializing from cast<>. NFC. Silence clang-tidy warnings	2023-02-12 14:57:11 +00:00
Simon Pilgrim	1300a4fdae	Revert rG23cb32c6d5bda0919cc1ef129917ceb2dbf1b1b8 "[X86] combineX86ShufflesRecursively - treat ISD::TRUNCATE as faux shuffle" This is causing a miscompile - waiting on a regression test from @bkramer	2023-02-12 14:46:08 +00:00
Martin Storsjö	7717e1114a	Revert "[AArch64] Reassociate sub(x, add(m1, m2)) to sub(sub(x, m1), m2)" This reverts commit c52255d26a23df6ecf09f60ca3e3615467f16bbe. That commit caused certain files (in ffmpeg, libvpx and libaom) to hang while compiling, see https://reviews.llvm.org/D143143 for repro.	2023-02-12 16:00:32 +02:00
Sanjay Patel	f48f178717	[InstCombine] canonicalize cmp+select as smin/smax (V == SMIN) ? SMIN+1 : V --> smax(V, SMIN+1) (V == SMAX) ? SMAX-1 : V --> smin(V, SMAX-1) https://alive2.llvm.org/ce/z/d5bqjy Follow-up for the unsigned variants added with: 86b4d8645fc1b866 issue #60374	2023-02-12 07:54:43 -05:00
NAKAMURA Takumi	0e18b5feaa	LLVMFuzzerCLI: [CMake] Prune the last PARTIAL_SOURCES_INTENDED to cover all sources.	2023-02-12 20:12:37 +09:00
Hsiangkai Wang	c9a7b92a23	[AArch64] Consider tiny code model in emitLoadFromConstantPool. We should be able to use load(literal) to access constant pool under the tiny code model. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D132536	2023-02-12 06:02:47 +00:00
Kazu Hirata	df3b703a4c	[AArch64] Use llvm::countr_{zero,one} (NFC)	2023-02-11 17:53:01 -08:00
Craig Topper	c8ad1de4f0	[RISCV] Remove dead code from RISCVDAGToDAGISel::selectVSETVLI. NFC vsetvli no longer has side effects so we don't need code for handling INTRINSIC_W_CHAIN.	2023-02-11 16:51:35 -08:00
Craig Topper	7e772e12d1	[RISCV] Fix mistake in comment. NFC	2023-02-11 12:32:54 -08:00
Lang Hames	10b5fec256	[JITLink][ORC] Add LinkGraph::allocateCString method. Renames the existing allocateString method to allocateContent and adds a pair of allocateCString methods. The previous allocateString method did not include a null-terminator. It behaved the same as allocateContent except with a Twine input, rather than an ArrayRef<char>. Renaming allocateString to allocateContent (overloading the existing method) makes this clearer. The new allocateCString methods allocate the given content plus a null-terminator character, and return a buffer covering both the string and null-terminator. This makes them suitable for creating C-string content for jitlink::Blocks. Existing users of the old allocateString method have been updated to use the new allocateContent overload.	2023-02-11 12:05:28 -08:00
Simon Pilgrim	23cb32c6d5	[X86] combineX86ShufflesRecursively - treat ISD::TRUNCATE as faux shuffle getFauxShuffleMask can't handle ISD::TRUNCATE itself as it can't handle inputs that are larger than the output Another step towards removing combineX86ShuffleChainWithExtract	2023-02-11 19:16:08 +00:00
Lang Hames	9eccc6cce0	[JITLink] Add a predicate to test for C-string blocks.	2023-02-11 10:51:50 -08:00
Lang Hames	3d4e9d5eb0	[ORC] Move ORC-specific object format details into OrcShared. This allows these details to be shared with JITLink, which is allowed to depend on the OrcShared library (but not on OrcJIT).	2023-02-11 10:51:38 -08:00
Simon Pilgrim	a55b35dbee	[X86] combineVectorInsert - pull out Vec/Scl/Idx operands. NFC. These will be reused in a future patch	2023-02-11 14:02:00 +00:00
Simon Pilgrim	0b0a38a7a2	[X86] combineX86ShufflesRecursively - don't widen shuffle subvector inputs combineX86ShuffleChain and combineX86ShuffleChainWithExtract no longer require the shuffle inputs to be the same width as the root vector, so we can stop generating widening nodes on the fly (combineX86ShuffleChain should handle all of this). This requires a couple of additional folds to avoid a couple of notable regressions: getFauxShuffleMask - recognise INSERT_SUBVECTOR(X,Y,C) as a shuffle pattern as long as it's not just widening the subvector. combineConcatVectorOps - folds CONCAT_VECTORS(AssertSext(X,Ty),AssertSext(Y,Ty)) -> AssertSext(CONCAT_VECTORS(X,Y),Ty) One of the final stages towards fixing Issue #45319 and addressing the regressions in the interleaved tests in D127115	2023-02-11 13:23:04 +00:00
Darshan Bhat	19c42f672f	[DFAPacketizer] Move DefaultVLIWScheduler class declaration to header file This change moves "DefaultVLIWScheduler" class declaration from DFAPacketizer.cpp to DFAPacketizer.h. This is needed because there is a protected class member of type "DefaultVLIWScheduler*" in "VLIWPacketizerList" class. The derived classes cannot use this member unless declaration is available to it. More specifically : // Without this change ``` class HexagonPacketizerList : public VLIWPacketizerList { public : HexagonPacketizerList() { // Below line will cause incomplete class error since // declaration was not available through header. VLIWScheduler->schedule(); } } ``` Reviewed By: kparzysz Differential Revision: https://reviews.llvm.org/D139767	2023-02-11 14:31:58 +05:30
Jay Foad	811d11b064	[AMDGPU] Add GFX11 HW_REG_PERF_SNAPSHOT_* These are similar to hardware registers already added for GFX940, but with different numbers and slightly different names. Differential Revision: https://reviews.llvm.org/D143740	2023-02-10 20:28:14 +00:00
Alex Brachet	3e57aa304f	[llvm-driver] Reinvoke clang as described by llvm driver extra args Differential Revision: https://reviews.llvm.org/D137800	2023-02-10 19:42:32 +00:00
Changpeng Fang	7ca3444fba	AMDGPU: Use module flag to get code object version at IR level follow-up Summary: This is part of the leftover work for https://reviews.llvm.org/D143138. In this work, we pass code object version as an argument to initialize target ID and use it for targetID dump. Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D143293	2023-02-10 11:16:38 -08:00
Arthur Eubanks	c8b8d6badd	[Passes] Remove some legacy passes Namely CrossDSOCFI and GlobalSplit. These are part of the optimization pipeline, of which the legacy pass manager version is deprecated.	2023-02-10 10:46:45 -08:00
OCHyams	295f5fafcb	[Assignment Tracking] Fix migrateDebugInfo in SROA Without this patch, migrateDebugInfo doesn't understand how to handle existing fragments that are smaller than the to-be-split store. This can occur if, e.g., a vector store (1 dbg.assign) is split (many dbg.assigns - 1 fragment for each scalar) and later those stores are re-vectorized (many dbg.assigns), and then SROA runs on that. The approach taken in this patch is to drop intrinsics with fragments outside of the slice. For example, starting with: store <2 x float> %v, ptr %dest !DIAssignID !1 call void @llvm.dbg.assign(..., DIExpression(DW_OP_LLVM_fragment, 0, 32), !1, ...) call void @llvm.dbg.assign(..., DIExpression(DW_OP_LLVM_fragment, 32, 32), !1, ...) When visiting the slice of bits 0 to 31 we get: store float %v.extract.0, ptr %dest !DIAssignID !2 call void @llvm.dbg.assign(..., DIExpression(DW_OP_LLVM_fragment, 0, 32), !2, ...) The other dbg.assign associated with the currently-split store is dropped for this split part. And visiting bits 32 to 63 we get the following: store float %v.extract.1, ptr %adjusted.dest !DIAssignID !3 call void @llvm.dbg.assign(..., DIExpression(DW_OP_LLVM_fragment, 32, 32), !3, ...) I've added two tests that cover this case. Implementing this meant re-writing the fragment-calculation part of migrateDebugInfo to work with the absolute offset of the new slice in terms of the base alloca (instead of the offset of the slice into the new alloca), the fragment (if any) of the variable associated with the base alloca, and the fragment associated with the split store. Because we need the offset into the base alloca for the variables being split, some careful wiring is required for memory intrinsics due to the fact that memory intrinsics can be split when either the source or dest allocas are split. In the case where the source alloca drives the splitting, we need to be careful to pass migrateDebugInfo the information in relation to the dest alloca. Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D143146	2023-02-10 18:10:11 +00:00
David Green	c52255d26a	[AArch64] Reassociate sub(x, add(m1, m2)) to sub(sub(x, m1), m2) The mid end will reassociate sub(sub(x, m1), m2) to sub(x, add(m1, m2)). This reassociates it back to allow the creation of more mls instructions. Differential Revision: https://reviews.llvm.org/D143143	2023-02-10 18:09:11 +00:00
Craig Topper	d37a31cf23	[X86] Attempt to fix ubsan failure. operator~ promotes the single bit input to int. The ~ will cause the upper 31 bits to become 1s making it a negative value. This is undefined for shift. Mask it back down to a single bit. The extra 1s were being shifted to bit 8 and above and they aren't used by the emitByte call so this shouldn't be a functional change.	2023-02-10 10:02:51 -08:00
Johannes Doerfert	1763c63254	[Attributor][NFCI] Use a set to track dependences	2023-02-10 11:56:09 -06:00
Johannes Doerfert	86cce90e21	[Attributor][NFCI] Avoid AAIntraFnReachability updates if possible Even if liveness changed, we only care about certain dead edges in AAIntraFnReachability. If those are still dead, we can avoid an update.	2023-02-10 11:56:09 -06:00
Johannes Doerfert	a9557aacd1	[Attributor][NFCI] Use queries without exclusion set whenever possible If a query uses an exclusion set but we haven't used it to determine the result, we can cache the query without exclusion set too. When we lookup a cached result we can check for the non-exclusion set version first.	2023-02-10 11:56:09 -06:00
Johannes Doerfert	76a1919026	[Attributor][NFC] Avoid unnecessary string operations This caused multiple string operations which we don't need if we do not create a profile.	2023-02-10 11:56:09 -06:00
Johannes Doerfert	bf9964fb13	[Attributor][NFCI] Create a AAIsDead for the function eagerly	2023-02-10 11:56:09 -06:00
Johannes Doerfert	8bc0bee2f8	[Attributor][NFCI] Avoid a temporary vector and exit early This change simply avoids the temporary vector and processes the elements right away.	2023-02-10 11:56:09 -06:00
Michael Buch	b8ef007fca	Reland "[llvm][dsymutil] Add DW_TAG_imported_declaration to accelerator table" This relands the commit previously reverted in `8570bee53a8ce0c5d04bc11f288e19a457474c4c` due to failures on linux. The problem was that the test executable was built with absolute OSO prefix paths. This re-commit adds a modified version of the executable that strips the absolute OSO prefix paths and makes sure the test appends the OSO prefix appropriately (via the appropriate dsymutil flags). Differential Revision: https://reviews.llvm.org/D143458	2023-02-10 17:19:07 +00:00
Sanjay Patel	af39acda88	[VectorCombine] fix insertion point of shuffles As shown in issue #60649, the new shuffles were being inserted before a phi, and that is invalid. It seems like most test coverage for this fold (foldSelectShuffle) lives in the AArch64 dir, but this doesn't repro there for a base target.	2023-02-10 10:57:11 -05:00
Sanjay Patel	78056e2f2d	[InstCombine] propagate FMF in exp2->ldexp fold	2023-02-10 10:02:25 -05:00
Sanjay Patel	3abea2b544	[InstCombine] copy tail markings in exp2->ldexp fold	2023-02-10 10:02:25 -05:00
David Green	86bfeb906e	Revert "Inlining: Run the legacy AlwaysInliner before the regular inliner." This seems to cause large regressions in existing code, as much as 75% slower (4x the time taken). Small always inline functions seem to be used a lot in the cmsis-dsp library. I would add a phase ordering test to show the problems, but one already exists! The llvm/test/Transforms/PhaseOrdering/ARM/arm_mult_q15.ll was just changed by removing alwaysinline to hide the problems that existed. This reverts commit cae033dcf227aeecf58fca5af6fc7fde1fd2fb4f. This reverts commit 8e33c41e72ad42e4c27f8cbc3ad2e02b169637a1.	2023-02-10 15:01:49 +00:00
Juan Manuel MARTINEZ CAAMAÑO	c4a250ecea	[AMDGPU][MC] Generate relative relocations for allocatable (more particularly, eh_frame) sections Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D142453	2023-02-10 15:54:43 +01:00
Benjamin Maxwell	f1837c7074	[DebugInfo] Handle missed DW_FORM_addrx3 and DW_FORM_strx3 cases This fixes a few places where the addrx3 and strx3 forms were missed. Previously this meant if one of these forms appeared somewhere various errors could occur. This now also adds an extra test case for the addrx3 form (which previously failed). Differential Revision: https://reviews.llvm.org/D143488	2023-02-10 14:44:18 +00:00
Simon Pilgrim	a3060f0f37	[X86] combineConcatVectorOps - concatenate AVX512 vselect nodes. NFC. This also requires us to constant fold vXi1 concat_vector nodes	2023-02-10 14:05:35 +00:00
OCHyams	25d0f3c4d0	[Assignment Tracking] Fix fragment index error in getDerefOffsetInBytes Without this patch `getDerefOffsetInBytes` incorrectly always returns `std::nullopt` for expressions with fragments due to an off-by-one error with fragment element indices. Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D143567	2023-02-10 13:49:05 +00:00
Sanjay Patel	9dcd7195a2	[InstCombine] avoid crashing in pow->ldexp Similar to 62a0a1b9eea7788c1f9dbae - We have pow math intrinsics in IR, but no ldexp intrinsics to handle vector types. A patch for that was proposed in D14327, but it was not completed. Issue #60605	2023-02-10 08:03:13 -05:00
Sanjay Patel	62a0a1b9ee	[InstCombine] avoid crashing in exp2->ldexp We have exp2 math intrinsics in IR, but no ldexp intrinsics to handle vector types. A patch for that was proposed in D14327, but it was not completed. Issue #60605	2023-02-10 07:35:39 -05:00
Tim Northover	c4ce967e34	ARM: skip debug instructions when matching jump-table patterns. When working out whether we can see a compressible jump-table pattern during ConstantIslands, we were stopping when we saw a debug instruction. Instead it's better to keep iterating backwards to the first real instruction. https://reviews.llvm.org/D142019	2023-02-10 12:27:59 +00:00
Ivan Kosarev	f0f8ae7596	[AMDGPU][AsmParser] Fix matching immediate literals. Prevents potential matching of literal offsets to non-literal operands. Reviewed By: dp Differential Revision: https://reviews.llvm.org/D142194	2023-02-10 11:36:07 +00:00
Dominik Adamski	baca3c1507	Move SIMD alignment calculation to LLVM Frontend Currently default simd alignment is defined by Clang specific TargetInfo class. This class cannot be reused for LLVM Flang. That's why default simd alignment calculation has been moved to OMPIRBuilder which is common for Flang and Clang. Previous attempt: https://reviews.llvm.org/D138496 was wrong because the default alignment depended on the number of built LLVM targets. If we wanted to calculate the default alignment for PPC and we hadn't specified PPC LLVM target to build, then we would get 0 as the alignment because OMPIRBuilder couldn't create PPCTargetMachine object and it returned 0 as the default value. If PPC LLVM target had been built earlier, then OMPIRBuilder could have created PPCTargetMachine object and it would have returned 128. Differential Revision: https://reviews.llvm.org/D141910 Reviewed By: jdoerfert	2023-02-10 04:11:54 -06:00
Dmitry Makogon	c77c186a64	[LVI] Don't traverse uses when calculating range at use This effectively reverts 5c38c6a and 4f772b0. A recently introduced LazyValueInfo::getConstantRangeAtUse returns incorrect ranges for values in certain cases. One such example is described in PR60629. The issue has something to do with traversing PHI uses of a value transitively. As nikic pointed out, we're effectively reasoning about values from different loop iterations. In the faulting test case, CVP made a miscompilation because the calculated range for a shift argument was incorrect. It returned empty-set, although the code is clearly not dead. CVP then erased the shift instruction because of the empty range.	2023-02-10 17:06:36 +07:00
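
The InstCombine entry f48f178717 above states two rewrites: (V == SMIN) ? SMIN+1 : V --> smax(V, SMIN+1) and (V == SMAX) ? SMAX-1 : V --> smin(V, SMAX-1). As a rough illustration only, the sketch below checks both identities exhaustively over 8-bit signed integers in plain C++. It is not LLVM code: the function names are hypothetical, and std::max/std::min stand in for the smax/smin operations named in the commit message.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration; none of these names exist in LLVM.
constexpr int8_t SMin = std::numeric_limits<int8_t>::min(); // -128
constexpr int8_t SMax = std::numeric_limits<int8_t>::max(); //  127

// cmp+select form from the commit message: (V == SMIN) ? SMIN+1 : V
constexpr int8_t selectFormSMax(int8_t v) {
  return v == SMin ? static_cast<int8_t>(SMin + 1) : v;
}

// Canonical form from the commit message: smax(V, SMIN+1)
constexpr int8_t canonicalSMax(int8_t v) {
  return std::max<int8_t>(v, static_cast<int8_t>(SMin + 1));
}

// cmp+select form from the commit message: (V == SMAX) ? SMAX-1 : V
constexpr int8_t selectFormSMin(int8_t v) {
  return v == SMax ? static_cast<int8_t>(SMax - 1) : v;
}

// Canonical form from the commit message: smin(V, SMAX-1)
constexpr int8_t canonicalSMin(int8_t v) {
  return std::min<int8_t>(v, static_cast<int8_t>(SMax - 1));
}

// Compare both pairs of forms for every one of the 256 possible i8 values.
constexpr bool foldsAgree() {
  for (int i = SMin; i <= SMax; ++i) {
    const int8_t v = static_cast<int8_t>(i);
    if (selectFormSMax(v) != canonicalSMax(v))
      return false;
    if (selectFormSMin(v) != canonicalSMin(v))
      return false;
  }
  return true;
}
```

Calling foldsAgree() covers every i8 input. The only inputs either form changes are V == SMIN and V == SMAX respectively, which is why a single equality compare plus select can be expressed as one min/max operation. The Alive2 link in the commit message is the authoritative proof for the general bit width.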

166738 Commits