llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-10 16:06:05 +00:00

Author	SHA1	Message	Date
Luo, Yuanke	40222ddcf8	[X86] Fix the vnni machine combine issue. The previous patch (D148980) didn't set the InstrIdxForVirtReg correctly in genAlternativeDpCodeSequence(). It causes vnni lit test failure when LLVM_ENABLE_EXPENSIVE_CHECKS is on.	2023-04-29 13:51:08 +08:00
Luo, Yuanke	8f7f9d86a7	[X86] Machine combine vnni instruction. "vpmaddwd + vpaddd" can be combined to vpdpwssd and the latency is reduced after combination. However, when vpdpwssd is on a critical path the combination yields less ILP. This happens when vpdpwssd is in a loop: the vpmaddwd can be executed in parallel across multiple iterations, while vpdpwssd has a data dependency in each iteration. If vpaddd is on a critical path while vpmaddwd is not, it is profitable to split vpdpwssd into "vpmaddwd + vpaddd". This patch uses the machine combiner framework to decide on the "vpmaddwd + vpaddd" combination. The typical example code is as below. ``` __m256i foo(int cnt, __m256i c, __m256i b, __m256i *p) { for (int i = 0; i < cnt; ++i) { __m256i a = p[i]; __m256i m = _mm256_madd_epi16 (b, a); c = _mm256_add_epi32(m, c); } return c; } ``` Differential Revision: https://reviews.llvm.org/D148980	2023-04-27 16:42:04 +08:00
Akshay Khadse	43b38696aa	Fix uninitialized class members Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148692	2023-04-20 11:18:34 +08:00
Akshay Khadse	8bf7f86d79	Fix uninitialized pointer members in CodeGen This change initializes the members TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303	2023-04-17 16:32:46 +08:00
Anton Sidorenko	2693efa8a5	[MachineCombiner] Support local strategy for traces For in-order cores MachineCombiner makes better decisions when the critical path is calculated only for the current basic block and does not take into account other blocks from the trace. This patch adds a virtual method to TargetInstrInfo to allow each target decide which strategy to use. Depends on D140541 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D140542	2023-02-17 13:17:22 +03:00
Anton Sidorenko	5bdd0beeee	[MachineCombiner][NFC] Rename `MinInstr` to `TraceEnsemble` We are about to allow different trace strategies for MachineCombiner. Make the name of the ensemble strategy-neutral. Depends on D140540 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D140541	2023-02-16 15:09:02 +03:00
Anton Sidorenko	77bd15ae2f	[MachineTraceMetrics][NFC] Move Strategy enum out of the class Make forward declaration possible to reduce amount of dependencies and reduce re-compilation burden caused by further patches. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D140539	2023-02-14 16:38:47 +03:00
Philip Reames	86eff6be68	[MachineCombiner] Use default latency model when no detailed model available This change adjusts the cost modeling used when the target does not have a schedule model with individual instruction latencies. After this change, we use the default latency information available from TargetSchedule. The default latency information essentially ends up treating most instructions as latency 1, with a few "expensive" ones getting a higher cost. Previously, we unconditionally applied the first legal pattern - without any consideration of profitability. As a result, this change both prevents some patterns being applied, and changes which patterns are exercised. (i.e. previously the first pattern was applied, afterwards, maybe the second one is because the first wasn't profitable.) The motivation here is twofold. First, this brings the default behavior in line with the behavior when -mcpu or -mtune is specified. This improves test coverage, and generally makes it less likely we will have bad surprises when providing more information to the compiler. Second, this enables some reassociation for ILP by default. Despite being unconditionally enabled, the prior code tended to "reassociate" repeatedly through an entire chain, simply moving the first operand to the end. The result was still a serial chain, just a different one. With this change, one of the intermediate transforms is unprofitable and we end up with a partially flattened tree. Note that the resulting code diffs show significant room for improvement in the basic algorithm. I am intentionally excluding those from this patch. For the test diffs, I don't see any concerning regressions. I took a fairly close look at the RISCV ones, but only skimmed the x86 (particularly vector x86) changes. Differential Revision: https://reviews.llvm.org/D141017	2023-01-20 09:28:20 -08:00
Craig Topper	e72ca520bb	[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC Use isPhysical/isVirtual methods. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D141715	2023-01-13 14:38:08 -08:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
Philip Reames	9560ac3a25	[MachineCombine] Reorganize code for readability and tracing [nfc]	2023-01-04 10:47:39 -08:00
Anton Sidorenko	b6c790736e	[MachineCombiner][RISCV] Add fmadd/fmsub/fnmsub instructions patterns This patch adds transformation of fmul+fadd/fsub chains to fused multiply instructions: * fmul+fadd->fmadd * fmul+fsub->fmsub/fnmsub We also will try to combine these instructions if the fmul has more than one use and cannot be deleted. However, removing the dependence between fmul and fadd can still be profitable, and we rely on machine combiner approximations of scheduling. Differential Revision: https://reviews.llvm.org/D136764	2022-11-17 13:24:04 +03:00
Anton Sidorenko	4431e705cc	[NFC] Use forward decl of MachineCombinerPattern enum to reduce dependencies Differential Revision: https://reviews.llvm.org/D135776	2022-10-13 14:56:14 +01:00
Kazu Hirata	9e6d1f4b5d	[CodeGen] Qualify auto variables in for loops (NFC)	2022-07-17 01:33:28 -07:00
Guozhi Wei	2f11b3a6d7	[MachineCombiner] Don't compute the latency of transient instructions If an MI will not generate a target instruction, we should not compute its latency. Then we can compute more precise instruction sequence cost, and get better result. Differential Revision: https://reviews.llvm.org/D129615	2022-07-14 17:08:14 +00:00
Guozhi Wei	ddc9e8861c	[MachineCombiner, AArch64] Add a new pattern A-(B+C) => (A-B)-C to reduce latency Add a new pattern A - (B + C) ==> (A - B) - C to give machine combiner a chance to evaluate which instruction sequence has lower latency. Differential Revision: https://reviews.llvm.org/D124564	2022-06-28 21:42:51 +00:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Mircea Trofin	b012742405	[NFC] Rename MachineFunction::deleteMachineInstr (coding style)	2021-12-08 20:36:13 -08:00
Jack Andersen	f108c7f59d	[GlobalISel] Allow DBG_VALUE to use undefined vregs before LiveDebugValues. Expanding on D109750. Since `DBG_VALUE` instructions have final register validity determined in `LDVImpl::handleDebugValue`, there is no apparent reason to immediately prune unused register operands as their defs are erased. Consequently, this renders `MachineInstr::eraseFromParentAndMarkDBGValuesForRemoval` moot; gaining a substantial performance improvement. The only necessary changes involve making relevant passes consider invalid DBG_VALUE vregs uses as valid. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D112852	2021-12-05 15:55:59 -05:00
Chen Zheng	0ed4cf4bf3	[PowerPC] support register pressure reduction in machine combiner. Reassociating some patterns to generate more fma instructions to reduce register pressure. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92071	2021-01-24 21:28:21 -05:00
Tres Popp	3bd24574c7	Revert "[PowerPC] support register pressure reduction in machine combiner." This reverts commit 26a396c4ef481cb159bba631982841736a125a9c. See https://reviews.llvm.org/D92071 for a description of the issue.	2021-01-18 12:01:57 +01:00
Chen Zheng	26a396c4ef	[PowerPC] support register pressure reduction in machine combiner. Reassociating some patterns to generate more fma instructions to reduce register pressure. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92071	2021-01-17 23:56:13 -05:00
Chen Zheng	4830d458dd	[MachineCombiner][NFC] Add MustReduceRegisterPressure goal add a new goal MustReduceRegisterPressure for machine combiner pass. PowerPC will use this new goal to do some register pressure related optimization. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D92068	2020-12-14 00:02:42 -05:00
Chen Zheng	bd7096b977	[PowerPC] fma chain break to expose more ILP This patch tries to reassociate two patterns related to FMA to expose more ILP on PowerPC. // Pattern 1: // A = FADD X, Y (Leaf) // B = FMA A, M21, M22 (Prev) // C = FMA B, M31, M32 (Root) // --> // A = FMA X, M21, M22 // B = FMA Y, M31, M32 // C = FADD A, B // Pattern 2: // A = FMA X, M11, M12 (Leaf) // B = FMA A, M21, M22 (Prev) // C = FMA B, M31, M32 (Root) // --> // A = FMUL M11, M12 // B = FMA X, M21, M22 // D = FMA A, M31, M32 // C = FADD B, D Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D80175	2020-06-15 00:00:04 -04:00
Chen Zheng	2a24d350db	[MachineCombine] add a hook for resource length limit	2020-05-31 23:21:04 -04:00
Hiroshi Yamauchi	d9ae493937	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149	2019-12-09 12:42:59 -08:00
Hiroshi Yamauchi	2eb30fafa5	Revert "[PGO][PGSO] Instrument the code gen / target passes." This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac. This seems to break buildbots.	2019-12-06 12:17:32 -08:00
Hiroshi Yamauchi	9a0b5e1407	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072	2019-12-06 10:43:39 -08:00
Reid Kleckner	05da2fe521	Sink all InitializePasses.h includes This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211	2019-11-13 16:34:37 -08:00
Daniel Sanders	2bea69bf65	Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC llvm-svn: 367633	2019-08-01 23:27:28 +00:00
Craig Topper	78c794a70b	[X86] Fix several places that weren't passing what they thought they were to MachineInstr::print Over a year ago, MachineInstr gained a fourth boolean parameter that occurs before the TII pointer. When this happened, several places started accidentally passing TII into this boolean parameter instead of the TII parameter. llvm-svn: 362312	2019-06-02 01:36:48 +00:00
Evandro Menezes	85bd3978ae	[IR] Refactor attribute methods in Function class (NFC) Rename the functions that query the optimization kind attributes. Differential revision: https://reviews.llvm.org/D60287 llvm-svn: 357731	2019-04-04 22:40:06 +00:00
Andrea Di Biagio	edbf06a767	[AsmPrinter] Remove hidden flag -print-schedule. This patch removes hidden codegen flag -print-schedule effectively reverting the logic originally committed as r300311 (https://llvm.org/viewvc/llvm-project?view=revision&revision=300311). Flag -print-schedule was originally introduced by r300311 to address PR32216 (https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better testing of schedule model instruction latencies/throughputs". These days, we can use llvm-mca to test scheduling models. So there is no longer a need for flag -print-schedule in LLVM. The main use case for PR32216 is now addressed by llvm-mca. Flag -print-schedule is mainly used for debugging purposes, and it is only actually used by x86 specific tests. We already have extensive (latency and throughput) tests under "test/tools/llvm-mca" for X86 processor models. That means, most (if not all) existing -print-schedule tests for X86 are redundant. When flag -print-schedule was first added to LLVM, several files had to be modified; a few APIs gained new arguments (see for example method MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained a couple of getSchedInfoStr() methods. Method getSchedInfoStr() had to originally work for both MCInst and MachineInstr. The original implementation of getSchedInfoStr() introduced a subtle layering violation (reported as PR37160 and then fixed/worked-around by r330615). In retrospect, that new API could have been designed more optimally. We can always query MCSchedModel to get the latency and throughput. More importantly, the "sched-info" string should not have been generated by the subtarget. Note, r317782 fixed an issue where "print-schedule" didn't work very well in the presence of inline assembly. That commit is also reverted by this change. Differential Revision: https://reviews.llvm.org/D57244 llvm-svn: 353043	2019-02-04 12:51:26 +00:00
Chandler Carruth	2946cd7010	Update the file headers across all of the LLVM projects in the monorepo to reflect the new license. We understand that people may be surprised that we're moving the header entirely to discuss the new license. We checked this carefully with the Foundation's lawyer and we believe this is the correct approach. Essentially, all code in the project is now made available by the LLVM project under our new license, so you will see that the license headers include that license only. Some of our contributors have contributed code under our old license, and accordingly, we have retained a copy of our old license notice in the top-level files in each project and repository. llvm-svn: 351636	2019-01-19 08:50:56 +00:00
Gerolf Hoflehner	cb7d968f73	[MachineCombiner][NFC] Prevent dereferencing past-the-end object in an MRI container llvm-svn: 350896	2019-01-10 21:53:13 +00:00
Nicola Zaghen	d34e60ca85	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually change DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 llvm-svn: 332240	2018-05-14 12:53:11 +00:00
Sanjay Patel	0d7df36c66	[TargetSchedule] shrink interface for init(); NFCI The TargetSchedModel is always initialized using the TargetSubtargetInfo's MCSchedModel and TargetInstrInfo, so we don't need to extract those and pass 3 parameters to init(). Differential Revision: https://reviews.llvm.org/D44789 llvm-svn: 329540	2018-04-08 19:56:04 +00:00
Reid Kleckner	2aeb930a9f	Revert r327721 "This patch fixes the invalid usage of OptSize in Machine Combiner." It causes asserts when compiling Chromium on Win32 with optimizations. We compile many things with -Os. llvm-svn: 327733	2018-03-16 20:11:55 +00:00
Andrew V. Tischenko	a0cd09d4a2	This patch fixes the invalid usage of OptSize in Machine Combiner. Differential Revision: https://reviews.llvm.org/D43813 llvm-svn: 327721	2018-03-16 16:06:24 +00:00
Andrew V. Tischenko	083891925b	The final step to close D41278 [MachineCombiner] Improve debug output (NFC). Differential Revision: https://reviews.llvm.org/D41278 llvm-svn: 326074	2018-02-26 09:43:21 +00:00
Andrew V. Tischenko	b65b078d4d	(NFC)[MachineCombiner] Improve debug output. llvm-svn: 325217	2018-02-15 07:55:02 +00:00
Alexander Ivchenko	6805004cb1	Fix unused variable warning in release mode. NFC. llvm-svn: 324330	2018-02-06 09:53:02 +00:00
Florian Hahn	c68428b5dc	[MachineCombiner] Add check for optimal pattern order. In D41587, @mssimpso discovered that the order of some patterns for AArch64 was sub-optimal. I thought a bit about how we could avoid that case in the future. I do not think there is a need for evaluating all patterns for now. But this patch adds an extra (expensive) check, that evaluates the latencies of all patterns, and ensures that the latency saved decreases for subsequent patterns. This catches the sub-optimal order fixed in D41587, but I am not entirely happy with the check, as it only applies to sub-optimal patterns seen while building with EXPENSIVE_CHECKS on. It did not discover any other sub-optimal pattern ordering. Reviewers: Gerolf, spatel, mssimpso Reviewed By: Gerolf, mssimpso Differential Revision: https://reviews.llvm.org/D41766 llvm-svn: 323873	2018-01-31 13:54:30 +00:00
Matthias Braun	f1caa2833f	MachineFunction: Return reference from getFunction(); NFC The Function can never be nullptr so we can return a reference. llvm-svn: 320884	2017-12-15 22:22:58 +00:00
Michael Zolotukhin	c468b648fd	Remove redundant includes from lib/CodeGen. llvm-svn: 320619	2017-12-13 21:30:47 +00:00
Florian Hahn	001c3dd202	[MachineCombiner] Add up latencies of all instructions in new pattern. Summary: When calculating the RootLatency, we add up all the latencies of the deleted instructions. But for NewRootLatency we only add the latency of the new root instructions, ignoring the latencies of the other instructions inserted. This leads the combiner to underestimate the cost of patterns which add multiple instructions. This patch fixes that by summing up the latencies of all new instructions. For NewRootNode, the more complex getLatency function is used. Note that we may be slightly more precise than just summing up all latencies. For example, consider a pattern like r1 = INS1 .. r2 = INS2 .. r3 = INS3 r1, r2 I think in some other places, the total latency of the pattern would be estimated as lat(INS3) + max(lat(INS1), lat(INS2)). If you consider that worth changing, I think it would be best to do in a follow-up patch. Reviewers: Gerolf, sebpop, spop, fhahn Reviewed By: fhahn Subscribers: evandro, llvm-commits Differential Revision: https://reviews.llvm.org/D40307 llvm-svn: 319951	2017-12-06 20:27:33 +00:00
David Blaikie	b3bde2ea50	Fix a bunch more layering of CodeGen headers that are in Target All these headers already depend on CodeGen headers so moving them into CodeGen fixes the layering (since CodeGen depends on Target, not the other way around). llvm-svn: 318490	2017-11-17 01:07:10 +00:00
David Blaikie	3f833edc7c	Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering This header includes CodeGen headers, and is not, itself, included by any Target headers, so move it into CodeGen to match the layering of its implementation. llvm-svn: 317647	2017-11-08 01:01:31 +00:00
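
Several entries in this log (for example the `A - (B + C) ==> (A - B) - C` pattern) rely on the machine combiner comparing the latency of two equivalent instruction sequences. The following is a minimal sketch of that comparison, not LLVM's implementation: it assumes every operation has the same fixed latency and that each operand becomes available at a known cycle. All names here (`finish`, `depthOriginal`, `depthReassociated`, `checkExamples`) are hypothetical and do not exist in LLVM.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: cycle at which a two-input operation completes,
// given the cycles at which its operands become ready.
constexpr unsigned finish(unsigned lhsReady, unsigned rhsReady,
                          unsigned latency) {
  return std::max(lhsReady, rhsReady) + latency;
}

// Completion cycle of A - (B + C): the add must finish before the sub starts.
constexpr unsigned depthOriginal(unsigned a, unsigned b, unsigned c,
                                 unsigned latency) {
  return finish(a, finish(b, c, latency), latency);
}

// Completion cycle of (A - B) - C: C is only needed by the second sub.
constexpr unsigned depthReassociated(unsigned a, unsigned b, unsigned c,
                                     unsigned latency) {
  return finish(finish(a, b, latency), c, latency);
}

// Small runtime check of the two cases discussed below.
inline void checkExamples() {
  // C arrives late: the reassociated form hides one operation behind C.
  assert(depthReassociated(0, 0, 5, 1) < depthOriginal(0, 0, 5, 1));
  // A arrives late: the original form is the shorter one.
  assert(depthOriginal(5, 0, 0, 1) < depthReassociated(5, 0, 0, 1));
}
```

Under these assumptions neither form always wins: the rewrite helps when `C` is the late operand and hurts when `A` is, which is why the pattern is offered to the combiner for evaluation rather than applied unconditionally.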

100 Commits