llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-05 10:06:08 +00:00

Author	SHA1	Message	Date
HaohaiWen	536b043219	[RegAllocFast] Lazily initialize InstrPosIndexes for each MBB (#76275) Most basic blocks never need a dominance query. Defer initialization of InstrPosIndexes to the first query for each MBB.	2023-12-25 09:42:31 +08:00
Nikita Popov	d82eccc752	[RegAllocFast] Avoid duplicate hash lookup (NFC)	2023-12-22 16:52:20 +01:00
HaohaiWen	40ec791b15	[RegAllocFast] Refactor dominates algorithm for large basic block (#72250) The original brute-force dominates algorithm has O(n) complexity, so it is very slow for very large machine basic blocks, which are common at O0. This patch adds InstrPosIndexes to assign an index to each instruction and uses it to determine dominance. The complexity is now O(1).	2023-12-22 23:06:16 +08:00
Nick Desaulniers	935c6a2d8d	[RegAllocFast] NFC cleanups (#74860) - use more range-based for loops - avoid capturing lambda - prefer Register type to unsigned - remove braces around single statement if	2023-12-12 08:58:58 -08:00
HaohaiWen	a908920201	[NFC][CodeGen] clang-format RegAllocFast.cpp (#72199)	2023-11-14 12:57:02 +08:00
Elliot Goodrich	4d0f1e3282	[llvm] Remove SmallSet from MachineInstr.h `MachineInstr.h` is a commonly included file and this includes `llvm/ADT/SmallSet.h` for one function `getUsedDebugRegs()`, which is used only in one place. According to `ClangBuildAnalyzer` (run solely on building LLVM, no other projects) the second most expensive template to instantiate is the `SmallSet::insert` method used in the `inline` implementation in `getUsedDebugRegs()`: ``` **** Templates that took longest to instantiate: 554239 ms: std::unordered_map<int, int> (2826 times, avg 196 ms) 521187 ms: llvm::SmallSet<llvm::Register, 4>::insert (930 times, avg 560 ms) ... ``` By removing this method and putting its implementation in the one call site we greatly reduce the template instantiation time and reduce the number of includes. When copying the implementation, I removed a check on `MO.getReg()` as this is checked within `MO.isVirtual()`. Differential Revision: https://reviews.llvm.org/D157720	2023-08-12 18:15:27 +01:00
Qi Hu	ddd7d35c6c	[RegAlloc] Fix assertion failure caused by inline assembly When inline assembly code requests more registers than available, the MachineInstr::emitError function in the RegAllocFast pass emits an error but doesn't stop the pass, and then the compiler crashes later with an assertion failure. This commit, mimicking the RegAllocGreedy pass, assigns a random physical register, and therefore avoids the crash after producing the diagnostic. This problem has been observed for both rustc and clang, while it doesn't occur in gcc.	2023-07-25 19:21:03 -04:00
Jay Foad	da7892f729	[MC] Use regunits instead of MCRegUnitIterator. NFC. Differential Revision: https://reviews.llvm.org/D153122	2023-06-16 12:21:32 +01:00
Sergei Barannikov	aa2d0fbc30	[MC] Add MCRegisterInfo::regunits for iteration over register units Reviewed By: foad Differential Revision: https://reviews.llvm.org/D152098	2023-06-16 05:39:50 +03:00
Jay Foad	5022fc2ad3	[CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI. Differential Revision: https://reviews.llvm.org/D151424	2023-06-01 19:17:34 +01:00
Gaëtan Bossu	c4a872badb	FastRegAlloc: Fix implicit operands not rewritten This patch fixes a potential crash due to RegAllocFast not rewriting virtual registers. This essentially happens because of a call to MachineInstr::addRegisterKilled() in the process of allocating a "killed" vreg. The former can eventually delete implicit operands without RegAllocFast noticing, leading to some operands being "skipped" and not rewritten to use physical registers. Note that I noticed this crash when working on a solution for tying a register with one/multiple of its sub-registers within an instruction. (See problem description here: https://discourse.llvm.org/t/pass-to-tie-an-output-operand-to-a-subregister-of-an-input-operand/67184). Aside from this fix, I believe there could be further improvements to RegAllocFast when it comes to instructions with multiple uses of the same virtual register. You can see it in the added test where the implicit uses have been re-written in a somewhat surprising way because of phase ordering. Ultimately, when allocating vregs for an instruction, I believe we should iterate on the vregs it uses (and then process all the operands that use these vregs), instead of directly iterating on operands and somewhat assuming each operand uses a different vreg. This would in the end be quite close to what greedy+virtregrewriter does. If that makes sense, I would probably spin off another patch (after I get more familiar with RegAllocFast). Differential Revision: https://reviews.llvm.org/D145169	2023-05-16 09:49:20 +02:00
Alexis Engelke	1e743732e7	[RegAllocFast] Use uint16_t SparseT for LiveRegMap For functions with very large numbers of live variables, lookups into LiveRegMap previously deteriorated to linear searches. This slightly increases memory usage, but that is barely measurable. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D149330	2023-04-27 18:58:49 +02:00
Akshay Khadse	8bf7f86d79	Fix uninitialized pointer members in CodeGen This change initializes the TSI, LI, DT, PSI, and ORE pointer fields of the SelectOptimize class to nullptr. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D148303	2023-04-17 16:32:46 +08:00
Matt Arsenault	7907fd4961	RegAllocFast: Fix dropping subreg indexes on unassigned subreg defs This was assuming all register operands were assigned to physical registers. This should ignore the operands which weren't assigned in this run. Fixes #61134	2023-04-05 18:25:51 -04:00
Nick Desaulniers	9cec2b246e	[RegAllocFast] insert additional spills along indirect edges of INLINEASM_BR When generating spills (stores) for values produced by INLINEASM_BR instructions, make sure to insert one spill per indirect target. Otherwise the reload generated may load from a stack slot that has not yet been stored to (resulting in a load of an uninitialized stack slot). Link: https://github.com/llvm/llvm-project/issues/53562 Fixes: https://github.com/llvm/llvm-project/issues/60855 Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D144907	2023-03-01 15:21:11 -08:00
Craig Topper	e72ca520bb	[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC Use isPhysical/isVirtual methods. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D141715	2023-01-13 14:38:08 -08:00
Josh Stone	87f57f459e	[RegAllocFast] Handle new debug values for spills These new debug values get inserted after the place where the spill happens, which means they won't be reached by the reverse traversal of basic block instructions. This would crash or fail assertions if they contained any virtual registers to be replaced. We can manually handle the new debug values right away to resolve this. Fixes https://github.com/llvm/llvm-project/issues/59172 Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D139590	2023-01-05 20:41:11 -08:00
Christudasan Devadasan	b5efec4b27	[CodeGen] Additional Register argument to storeRegToStackSlot/loadRegFromStackSlot With D134950, targets get notified when a virtual register is created and/or cloned. Targets can do the needful with the delegate callback. AMDGPU propagates the virtual register flags maintained in the target file itself. They are useful to identify a certain type of machine operands while inserting spill stores and reloads. Since RegAllocFast spills the physical register itself, there is no way its virtual register can be mapped back to retrieve the flags. It can be solved by passing the virtual register as an additional argument. This argument has no use when the spill interfaces are called during the greedy allocator or even the PrologEpilogInserter and can pass a null register in such cases. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138656	2022-12-17 11:55:34 +05:30
Serguei Katkov	d330731f94	[RegAllocFast] Clean-up. Remove redundant operations. NFC. Reviewed By: MatzeB, arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D109213	2022-10-05 11:38:54 +07:00
Luo, Yuanke	5159be3c9b	(Reland) [fastalloc] Support allocating specific register class in fastalloc This reverts commit 853bb192c407f5d9e75a5fd55cc089151530cbd3.	2022-08-20 13:25:34 +08:00
Luo, Yuanke	853bb192c4	Revert "(Reland) [fastalloc] Support allocating specific register class in fastalloc" This reverts commit 30f9e6ebd30b79d13f99eaca4d829e0da07186b3.	2022-08-15 20:33:15 +08:00
Luo, Yuanke	30f9e6ebd3	(Reland) [fastalloc] Support allocating specific register class in fastalloc Reland commit 719658d078c4 The base RA provides infrastructure that allows only a specific register class to be allocated in an RA pass. Since the greedy RA and basic RA derive from the base RA, they both allow allocating a specific register class. Fast RA doesn't support allocating registers for a specific register class. This patch enables ShouldAllocateClass in fast RA, so that it can support allocating registers for a specific register class. Differential Revision: https://reviews.llvm.org/D131825	2022-08-13 13:57:34 +08:00
Kazu Hirata	9e6d1f4b5d	[CodeGen] Qualify auto variables in for loops (NFC)	2022-07-17 01:33:28 -07:00
Kazu Hirata	4d9d07c5fb	[CodeGen] Use RegClassFilterFunc where appropriate (NFC)	2022-07-16 15:43:33 -07:00
Nico Weber	851a5efe45	Revert "[fastalloc] Support allocating specific register class in fastalloc" This reverts commit 719658d078c4093d1ee716fb65ae94673df7b22b. Breaks a few things, see comments on https://reviews.llvm.org/D128437 There's disagreement about the best fix. So let's keep HEAD green while discussions are happening.	2022-06-23 10:44:24 -04:00
Luo, Yuanke	719658d078	[fastalloc] Support allocating specific register class in fastalloc The base RA provides infrastructure that allows only a specific register class to be allocated in an RA pass. Since the greedy RA and basic RA derive from the base RA, they both allow allocating a specific register class. Fast RA doesn't support allocating registers for a specific register class. This patch enables ShouldAllocateClass in fast RA, so that it can support allocating registers for a specific register class. Differential Revision: https://reviews.llvm.org/D126771	2022-06-23 14:42:04 +08:00
Luo, Yuanke	44e8a205f4	[fastregalloc] Enhance the heuristics for liveout in self loop. In the case below, a virtual register is defined twice in the self loop. We don't need to spill %0 after the third instruction `%0 = def (tied %0)`, because it is defined in the second instruction `%0 = def`. 1 bb.1 2 %0 = def 3 %0 = def (tied %0) 4 ... 5 jmp bb.1 Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D125079	2022-06-21 09:18:49 +08:00
Luo, Yuanke	764676b737	[fastregalloc] Fix bug when undef value is tied to def. If the tied use is undef value, fastregalloc should free the def register. There is no reload needed for the undef value. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D124834	2022-05-04 12:12:55 +08:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Kazu Hirata	c73fc74ce0	[llvm] Use range-based for loops (NFC)	2021-11-28 10:04:54 -08:00
Ilya Yanok	3c47c5ca13	[RegAllocFast] Fix nondeterminism in debuginfo generation Changes from commit 1db137b1859692ae33228c530d4df9f2431b2151 added iteration over hash map that can result in non-deterministic order. Fix that by using a SmallMapVector to preserve the order. Differential Revision: https://reviews.llvm.org/D113468	2021-11-09 21:42:50 +01:00
Matt Arsenault	eebe841a47	RegAlloc: Allow targets to split register allocation AMDGPU normally spills SGPRs to VGPRs. Previously, since all register classes are handled at the same time, this was problematic. We don't know ahead of time how many registers will be needed to be reserved to handle the spilling. If no VGPRs were left for spilling, we would have to try to spill to memory. If the spilled SGPRs were required for exec mask manipulation, it is highly problematic because the lanes active at the point of spill are not necessarily the same as at the restore point. Avoid this problem by fully allocating SGPRs in a separate regalloc run from VGPRs. This way we know the exact number of VGPRs needed, and can reserve them for a second run. This fixes the most serious issues, but it is still possible using inline asm to make all VGPRs unavailable. Start erroring in the case where we ever would require memory for an SGPR spill. This is implemented by giving each regalloc pass a callback which reports if a register class should be handled or not. A few passes need some small changes to deal with leftover virtual registers. In the AMDGPU implementation, a new pass is introduced to take the place of PrologEpilogInserter for SGPR spills emitted during the first run. One disadvantage of this is currently StackSlotColoring is no longer used for SGPR spills. It would need to be run again, which will require more work. Error if the standard -regalloc option is used. Introduce new separate -sgpr-regalloc and -vgpr-regalloc flags, so the two runs can be controlled individually. PBQP is not currently supported, so this also prevents using the unhandled allocator.	2021-07-13 18:49:29 -04:00
Tim Northover	c1dc267258	MachineBasicBlock: add liveout iterator aware of which liveins are defined by the runtime. Using this in RegAlloc fast reduces register pressure, and in some cases allows x86 code to compile that wouldn't before.	2021-05-19 11:00:24 +01:00
Denis Antrushin	df47368d40	[RegAllocFast] properly handle STATEPOINT instruction. STATEPOINT is a fancy and complex pseudo instruction which has both tied defs and regmask operand. The basic FastRA algorithm is as follows: 1. Mark registers used by defs as free. 2. If the instruction has a regmask operand, displace clobbered registers according to the regmask. 3. Assign registers for use operands. In case of tied defs, step 1 is replaced with allocation of registers for them. But the regmask is still processed, which may displace already allocated registers. As a result, tied use and def will get assigned to different registers. This patch makes FastRA process the instruction's RegMask (if any) when checking for physical register interference. That way tied operands won't get registers clobbered by the regmask. Reviewed By: arsenm, skatkov Differential Revision: https://reviews.llvm.org/D99284	2021-05-11 17:27:00 +07:00
Tim Northover	c1b7460b5b	Revert "RegAlloc: do not consider liveins to EH-pad successors as liveout." Some liveins can come from this block (e.g. any SSA value except the call), it's only the ones that produce `landingpad` values that can't and I didn't think it through properly.	2021-04-29 20:00:07 +01:00
Tim Northover	438a63e13b	RegAlloc: do not consider liveins to EH-pad successors as liveout. These registers get defined by the runtime, not the block being allocated, and treating them as preassigned in RegAllocFast adds extra pressure, sometimes enough to make the function unallocatable.	2021-04-29 19:34:49 +01:00
Stephen Tozer	1db137b185	[DebugInfo] Handle DBG_VALUES with multiple variable location operands in MIR This patch adds handling for DBG_VALUE_LIST in the MIR-passes (after finalize-isel), excluding the debug liveness passes and DWARF emission. This most significantly affects MachineSink, which now needs to consider all used registers of a debug value when sinking, but for most passes this change is simply replacing getDebugOperand(0) with an iteration over all debug operands. Differential Revision: https://reviews.llvm.org/D92578	2021-03-10 17:15:24 +00:00
Stephen Tozer	f677413071	Reapply "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values" Rewrites test to use correct architecture triple; fixes incorrect reference in SourceLevelDebugging doc; simplifies `spillReg` behaviour so as to not be dependent on changes elsewhere in the patch stack. This reverts commit d2000b45d033c06dc7973f59909a0ad12887ff51.	2021-03-05 12:32:05 +00:00
Stephen Tozer	d2000b45d0	Revert "[DebugInfo] Add new instruction and DIExpression operator for variadic debug values" This reverts commit d07f106f4a48b6e941266525b6f7177834d7b74e.	2021-03-04 11:59:21 +00:00
gbtozers	d07f106f4a	[DebugInfo] Add new instruction and DIExpression operator for variadic debug values This patch adds a new instruction that can represent variadic debug values, DBG_VALUE_VAR. This patch alone covers the addition of the instruction and a set of basic code changes in MachineInstr and a few adjacent areas, but does not correctly handle variadic debug values outside of these areas, nor does it generate them at any point. The new instruction is similar to the existing DBG_VALUE instruction, with the following differences: the operands are in a different order, any number of values may be used in the instruction following the Variable and Expression operands (these are referred to in code as “debug operands”) and are indexed from 0 so that getDebugOperand(X) == getOperand(X+2), and the Expression in a DBG_VALUE_VAR must use the DW_OP_LLVM_arg operator to pass arguments into the expression. The new DW_OP_LLVM_arg operator is only valid in expressions appearing in a DBG_VALUE_VAR; it takes a single argument and pushes the debug operand at the index given by the argument onto the Expression stack. For example the sub-expression `DW_OP_LLVM_arg, 0` has the meaning “Push the debug operand at index 0 onto the expression stack.” Differential Revision: https://reviews.llvm.org/D82363	2021-03-04 11:45:35 +00:00
Kazu Hirata	61efa3d93f	[CodeGen] Use range-based for loops (NFC)	2021-02-17 23:58:46 -08:00
Kazu Hirata	9bcc0d1040	[CodeGen, Transforms] Use llvm::sort (NFC)	2021-01-14 20:30:31 -08:00
Pushpinder Singh	e2303a448e	[FastRA] Fix handling of bundled MIs The fast register allocator skips bundled MIs, as the main assignment loop uses MachineBasicBlock::iterator (= MachineInstrBundleIterator). This was causing SIInsertWaitcnts to crash, as it expects all instructions to have registers assigned. This patch makes sure to set everything inside the bundle to the same assignments done on the BUNDLE header. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D90369	2020-12-21 02:10:55 -05:00
Mircea Trofin	bab72dd5d5	[NFC][MC] TargetRegisterInfo::getSubReg is a MCRegister. Typing the API appropriately. Differential Revision: https://reviews.llvm.org/D92341	2020-12-02 15:46:38 -08:00
Mircea Trofin	61e8a44655	[NFC][regalloc] Use MCRegister appropriately Differential Revision: https://reviews.llvm.org/D90506	2020-11-02 11:48:49 -08:00
Matt Arsenault	b9c21d43bb	RegAlloc: Clear isSSA The MIR parser may infer SSA, so -run-pass=regallocgreedy would hit a verifier error after multiple vreg defs are added.	2020-10-28 12:02:16 -04:00
Mehdi Amini	8f492f6467	Remove unused verifyRegStateMapping() function in RegAllocFast (NFC) This fixes compiler warning when building with assertions.	2020-10-24 00:36:51 +00:00
Matt Arsenault	a66fca44ac	RegAllocFast: Add extra DBG_VALUE for live out spills This allows LiveDebugValues to insert the proper DBG_VALUEs in live out blocks if a spill is inserted before the use of a register. Previously, this would see the register use as the last DBG_VALUE, even though the stack slot should be treated as the live out value. This avoids an lldb test regression when D52010 is re-applied.	2020-09-30 10:35:25 -04:00
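
The idea described in commits 40ec791b15 and 536b043219 (index-based dominance inside one block, built lazily on first query) can be illustrated with a small standalone sketch. This is not the code from RegAllocFast.cpp: `Instr`, `Block`, `dominatesLinear`, and `PosIndexes` are hypothetical names, a `std::list` stands in for a machine basic block, and the sketch assumes no instructions are inserted after the indexes are built (the real pass inserts spills and reloads, which this ignores).

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <unordered_map>

// Hypothetical stand-in for a machine instruction (not llvm::MachineInstr).
struct Instr {
  int Opcode;
};

// Hypothetical stand-in for a machine basic block: an ordered instruction list.
using Block = std::list<Instr>;

// Brute force: walk the block from the top until A or B is reached.
// Returns true only if A appears strictly before B. O(n) per query.
bool dominatesLinear(const Block &BB, const Instr &A, const Instr &B) {
  for (const Instr &I : BB) {
    if (&I == &B)
      return false;
    if (&I == &A)
      return true;
  }
  return false;
}

// Index-based variant: number every instruction once, then compare numbers.
// The numbering is deferred until the first query, so blocks that never ask
// a dominance question pay nothing.
class PosIndexes {
public:
  explicit PosIndexes(const Block &BB) : BB(BB) {}

  // True if A appears strictly before B in the block. O(1) after the first
  // call (expected, since a hash map is used for the lookup).
  bool dominates(const Instr &A, const Instr &B) {
    if (!Initialized)
      init();
    assert(Index.count(&A) && Index.count(&B) && "instruction not in block");
    return Index.at(&A) < Index.at(&B);
  }

  bool initialized() const { return Initialized; }

private:
  // One linear pass assigning increasing position numbers.
  void init() {
    std::uint64_t Pos = 0;
    for (const Instr &I : BB)
      Index[&I] = Pos++;
    Initialized = true;
  }

  const Block &BB;
  std::unordered_map<const Instr *, std::uint64_t> Index;
  bool Initialized = false;
};
```

Constructing a `PosIndexes` for a block costs nothing up front; the single O(n) numbering pass runs only when `dominates` is first called, and later queries reduce to comparing two integers.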

268 Commits