213 Commits

Author SHA1 Message Date
Kazu Hirata
735ab61ac8
[CodeGen] Remove unused includes (NFC) (#115996)
Identified with misc-include-cleaner.
2024-11-12 23:15:06 -08:00
Kazu Hirata
312c1cfbd1
[CodeGen] Avoid repeated hash lookups (NFC) (#110203) 2024-09-28 10:04:44 -07:00
Craig Topper
80f6b42a26
[MachinePipeliner] Fix incorrect use of getPressureSets. (#109179)
The code was passing a physical register directly to getPressureSets
which expects a register unit.

Fix this by looping over the register units and calling getPressureSets
for each of them.

Found while trying to add a RegisterUnit class to stop storing register
units in `Register`. 0 is a valid register unit but not a valid
Register.
2024-09-18 21:34:05 -07:00
Michael Marjieh
00c198b2ca
[MachinePipeliner] Make Recurrence MII More Accurate (#105475)
Current RecMII calculation is bigger than it needs to be. The
calculation was refined in this patch.
2024-09-03 16:15:17 +09:00
Ryotaro KASUGA
1745c8e08d
[MachinePipeliner] Fix instruction order with physical register (#99264)
dependencies in same cycle

Dependency checks were insufficient when reordering instructions with
physical register dependencies (i.e. Anti/Output dependencies). This
could result in generating incorrect code.
2024-08-06 13:46:10 +09:00
Kai Yan
cd1a2ede2f
[llvm][CodeGen] Added a new restriction for II by pragma in window scheduler (#99448)
Added a new restriction for window scheduling.
Window scheduling is disabled when llvm.loop.pipeline.initiationinterval
is set.
2024-07-24 12:11:58 +08:00
Kazu Hirata
66cd2e0f9a
[CodeGen] Use range-based for loops (NFC) (#98706) 2024-07-13 13:29:47 -07:00
paperchalice
abde52aa66
[CodeGen][NewPM] Port LiveIntervals to new pass manager (#98118)
- Add `LiveIntervalsAnalysis`.
- Add `LiveIntervalsPrinterPass`.
- Use `LiveIntervalsWrapperPass` in legacy pass manager.
- Use `std::unique_ptr` instead of raw pointer for `LICalc`, so
destructor and default move constructor can handle it correctly.

This would be the last analysis required by `PHIElimination`.
2024-07-10 19:34:48 +08:00
paperchalice
79d0de2ac3
[CodeGen][NewPM] Port machine-loops to new pass manager (#97793)
- Add `MachineLoopAnalysis`.
- Add `MachineLoopPrinterPass`.
- Convert to `MachineLoopInfoWrapperPass` in legacy pass manager.
2024-07-09 09:11:18 +08:00
Ryotaro KASUGA
0a369b06e3
Reapply "[MachinePipeliner] Fix constraints aren't considered in cert… (#97259)
…ain cases" (#97246)

This reverts commit e6a961dbef773b16bda2cebc4bf9f3d1e0da42fc.

There is no difference from the original change. I re-ran the failed
test and it passed. So the failure wasn't caused by this change.
test result: https://lab.llvm.org/buildbot/#/builders/176/builds/585
2024-07-03 09:15:41 +09:00
Ryotaro KASUGA
e6a961dbef
Revert "[MachinePipeliner] Fix constraints aren't considered in certain cases" (#97246)
Reverts llvm/llvm-project#95356

Due to ppc64le test failures caught by the LLVM Buildbot.
https://lab.llvm.org/buildbot/#/builders/176/builds/576
2024-07-01 10:32:58 +09:00
Ryotaro KASUGA
e19ac0dcfd
[MachinePipeliner] Fix constraints aren't considered in certain cases (#95356)
when scheduling

When scheduling an instruction, if both any predecessors and any
successors of the instruction are already scheduled, `SchedStart` isn't
taken into account. It may result generating incorrect code. This patch
fixes the problem. Also, this patch merges `SchedStart` into
`EarlyStart` (same for `SchedEnd`).

Fixes https://github.com/llvm/llvm-project/issues/93936
2024-07-01 09:07:32 +09:00
Hua Tian
b6bf4024a0
[llvm][CodeGen] Add a new software pipeliner 'Window Scheduler' (#84443)
This commit implements the Window Scheduler as described in the RFC:

https://discourse.llvm.org/t/rfc-window-scheduling-algorithm-for-machinepipeliner-in-llvm/74718

This Window Scheduler implements the window algorithm designed by
Steven Muchnick in the book "Advanced Compiler Design And
Implementation",
with some improvements:

1. Copy 3 times of the loop kernel and construct the corresponding DAG
   to identify dependencies between MIs;
2. Use heuristic algorithm to obtain a set of window offsets.

The window algorithm is equivalent to modulo scheduling algorithm with a
stage of 2. It is mainly applied in targets where hardware resource
conflicts are severe, and the SMS algorithm often fails in such cases.
On our own DSA, this window algorithm typically can achieve a
performance
improvement of over 10%.

Co-authored-by: Kai Yan <aklkaiyan@tencent.com>
Co-authored-by: Ran Xiao <lennyxiao@tencent.com>

---------

Co-authored-by: Kai Yan <aklkaiyan@tencent.com>
Co-authored-by: Ran Xiao <lennyxiao@tencent.com>
2024-06-13 17:51:09 +08:00
Yuta Mukai
0c5319e546
[ModuloSchedule][AArch64] Implement modulo variable expansion for pipelining (#65609)
Modulo variable expansion is a technique that resolves overlap of
variable lifetimes by unrolling. The existing implementation solves it
by making a copy by move instruction for processors with ordinary
registers such as Arm and x86. This method may result in a very large
number of move instructions, which can cause performance problems.

Modulo variable expansion is enabled by specifying -pipeliner-mve-cg. A
backend must implement some newly defined interfaces in
PipelinerLoopInfo. They were implemented for AArch64.

Discourse thread:
https://discourse.llvm.org/t/implementing-modulo-variable-expansion-for-machinepipeliner
2024-06-12 10:27:35 +09:00
paperchalice
837dc542b1
[CodeGen][NewPM] Split MachineDominatorTree into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
2024-06-11 21:27:14 +08:00
David Green
b24af43fdf
[AArch64] Improve scheduling latency into Bundles (#86310)
By default the scheduling info of instructions into a BUNDLE are given a
latency of 0 as they operate on the implicit register of the bundle.
This modifies that for AArch64 so that the latency is adjusted to use
the latency from the instruction in the bundle instead. This essentially
assumes that the bundled instructions are executed in a single cycle,
which for AArch64 is probably OK considering they are mostly used for
MOVPFX bundles, where this can help create slightly better scheduling
especially for in-order cores.
2024-04-12 10:57:01 +01:00
Malay Sanghi
38f996bb2b
Replace copy with a reference. (#87975) 2024-04-08 20:31:51 +08:00
Ryotaro KASUGA
ea4a11926b
Reapply "[CodeGen] Fix register pressure computation in MachinePipeli… (#87312)
…ner (#87030)"

Fix broken test.

This reverts commit b8ead2198f27924f91b90b6c104c1234ccc8972e.
2024-04-03 09:28:09 +09:00
Gulfem Savrun Yeniceri
b8ead2198f Revert "[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)"
This reverts commit a4dec9d6bc67c4d8fbd4a4f54ffaa0399def9627
because the test failed in the following builder:
https://luci-milo.appspot.com/ui/p/fuchsia/builders/prod/clang-linux-x64/b8751864477467126481/overview
2024-04-01 18:27:41 +00:00
Ryotaro KASUGA
a4dec9d6bc
[CodeGen] Fix register pressure computation in MachinePipeliner (#87030)
`RegisterClassInfo::getRegPressureSetLimit` has been changed to return a
smaller value than before so the limit may become negative in later
calculations. As a workaround, change to use
`TargetRegisterInfo::getRegPressureSetLimit`.
Also improve tests.
2024-04-01 17:04:44 +09:00
David Green
601e102bdb
[CodeGen] Use LocationSize for MMO getSize (#84751)
This is part of #70452 that changes the type used for the external
interface of MMO to LocationSize as opposed to uint64_t. This means the
constructors take LocationSize, and convert ~UINT64_C(0) to
LocationSize::beforeOrAfter(). The getSize methods return a
LocationSize.

This allows us to be more precise with unknown sizes, not accidentally
treating them as unsigned values, and in the future should allow us to
add proper scalable vector support but none of that is included in this
patch. It should mostly be an NFC.

Global ISel is still expected to use the underlying LLT as it needs, and
are not expected to see unknown sizes for generic operations. Most of
the changes are hopefully fairly mechanical, adding a lot of getValue()
calls and protecting them with hasValue() where needed.
2024-03-17 18:15:56 +00:00
MalaySanghiIntel
0e337c67c8
Replace copy with a reference. (#82485)
These are relatively larger structures and we don't update them so ref
should be fine
2024-03-05 15:51:43 +08:00
Yuta Mukai
9ea9e93f4a
[MachinePipeliner] Fix elements being added while the list is iterated (#80805)
There is no need to add the elements of Objs twice, so the addition is
removed.
2024-02-22 09:17:10 +09:00
Kazu Hirata
7d269a4841 [CodeGen] Use range-based for loops (NFC) 2024-02-03 21:43:01 -08:00
Jie Fu
9f29050942 [CodeGen][MachinePipeliner] Fix -Wpessimizing-move in MachinePipeliner.cpp (NFC)
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1044:19: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
 1044 |     CycleInstrs = std::move(Schedule.reorderInstructions(SSD, CycleInstrs));
      |                   ^
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1044:19: note: remove std::move call here
 1044 |     CycleInstrs = std::move(Schedule.reorderInstructions(SSD, CycleInstrs));
      |                   ^~~~~~~~~~                                              ~
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1395:21: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
 1395 |     auto LastUses = std::move(computeLastUses(OrderedInsts, Stages));
      |                     ^
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1395:21: note: remove std::move call here
 1395 |     auto LastUses = std::move(computeLastUses(OrderedInsts, Stages));
      |                     ^~~~~~~~~~                                     ~
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1502:9: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
 1502 |         std::move(computeMaxSetPressure(OrderedInsts, Stages, MaxStage + 1));
      |         ^
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:1502:9: note: remove std::move call here
 1502 |         std::move(computeMaxSetPressure(OrderedInsts, Stages, MaxStage + 1));
      |         ^~~~~~~~~~                                                         ~
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:3381:19: error: moving a temporary object prevents copy elision [-Werror,-Wpessimizing-move]
 3381 |     cycleInstrs = std::move(reorderInstructions(SSD, cycleInstrs));
      |                   ^
/Users/jiefu/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:3381:19: note: remove std::move call here
 3381 |     cycleInstrs = std::move(reorderInstructions(SSD, cycleInstrs));
      |                   ^~~~~~~~~~                                     ~
4 errors generated.
2024-01-22 16:17:18 +08:00
Ryotaro KASUGA
7556626dcf
[CodeGen][MachinePipeliner] Limit register pressure when scheduling (#74807)
In software pipelining, when searching for the Initiation Interval (II),
`MachinePipeliner` tries to reduce register pressure, but doesn't check
how many variables can actually be alive at the same time. As a result,
a lot of register spills/fills can be generated after register
allocation, which might cause performance degradation. To prevent such
cases, this patch adds a check phase that calculates the maximum
register pressure of the scheduled loop and reject it if the pressure is
too high. This can be enabled this by specifying
`pipeliner-register-pressure`. Additionally, an II search range is
currently fixed at 10, which is too small to find a schedule when the
above algorithm is applied. Therefore this patch also adds a new option
`pipeliner-ii-search-range` to specify the length of the range to
search. There is one more new option
`pipeliner-register-pressure-margin`, which can be used to estimate a
register pressure limit less than actual for conservative analysis.

Discourse thread:
https://discourse.llvm.org/t/considering-register-pressure-when-deciding-initiation-interval-in-machinepipeliner/74725
2024-01-22 17:06:37 +09:00
bcahoon
a19c7c403f
[MachinePipeliner] Fix store-store dependences (#72575)
The pipeliner needs to mark store-store order dependences as
loop carried dependences. Otherwise, the stores may be scheduled
further apart than the MII. The order dependences implies that
the first instance of the dependent store is scheduled before the
second instance of the source store instruction.
2023-12-11 21:10:34 -06:00
Kazu Hirata
f9306f6de3
[ADT] Rename llvm::erase_value to llvm::erase (NFC) (#70156)
C++20 comes with std::erase to erase a value from std::vector.  This
patch renames llvm::erase_value to llvm::erase for consistency with
C++20.

We could make llvm::erase more similar to std::erase by having it
return the number of elements removed, but I'm not doing that for now
because nobody seems to care about that in our code base.

Since there are only 50 occurrences of erase_value in our code base,
this patch replaces all of them with llvm::erase and deprecates
llvm::erase_value.
2023-10-24 23:03:13 -07:00
Fangrui Song
111fcb0df0 [llvm] Fix duplicate word typos. NFC
Those fixes were taken from https://reviews.llvm.org/D137338
2023-09-01 18:25:16 -07:00
Michael Maitland
85e3875ad7 [TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics
D150312 added a TODO:

TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.

This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.

This commit as previously reverted since it missed renaming that came
down after rebasing. This version of the commit fixes those problems.

Differential Revision: https://reviews.llvm.org/D158568
2023-08-24 19:21:36 -07:00
Michael Maitland
71bfec762b Revert "[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics"
This reverts commit 5b854f2c23ea1b000cb4cac4c0fea77326c03d43.

Build still failing.
2023-08-24 15:37:27 -07:00
Michael Maitland
5b854f2c23 [TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics
D150312 added a TODO:

TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.

This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.

This commit as previously reverted since it missed renaming that came
down after rebasing. This version of the commit fixes those problems.

Differential Revision: https://reviews.llvm.org/D158568
2023-08-24 15:25:42 -07:00
Michael Maitland
4d27dffb43 Revert "[TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics"
This reverts commit 030d33409568b2f0ea61116e83fd40ca27ba33ac.

This commit is causing build failures
2023-08-24 11:58:53 -07:00
Michael Maitland
030d334095 [TableGen] Rename ResourceCycles and StartAtCycle to clarify semantics
D150312 added a TODO:

TODO: consider renaming the field `StartAtCycle` and `Cycles` to
`AcquireAtCycle` and `ReleaseAtCycle` respectively, to stress the
fact that resource allocation is now represented as an interval,
relatively to the issue cycle of the instruction.

This patch implements that TODO. This naming clarifies how to use these
fields in the scheduler. In addition it was confusing that `StartAtCycle` was
singular but `Cycles` was plural. This renaming fixes this inconsistency.

Differential Revision: https://reviews.llvm.org/D158568
2023-08-24 11:20:37 -07:00
Sergei Barannikov
aa2d0fbc30 [MC] Add MCRegisterInfo::regunits for iteration over register units
Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D152098
2023-06-16 05:39:50 +03:00
Jay Foad
5022fc2ad3 [CodeGen] Make use of MachineInstr::all_defs and all_uses. NFCI.
Differential Revision: https://reviews.llvm.org/D151424
2023-06-01 19:17:34 +01:00
Kazu Hirata
c121f3a9fb [CodeGen] Use range-based for loops (NFC) 2023-04-08 16:22:39 -07:00
jacquesguan
50f2ce49e7 [MachineScheduler] Rename postprocessDAG to postProcessDAG. NFC
Rename postprocessDAG to camel case.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D146795
2023-03-30 10:47:20 +08:00
Jay Foad
d170a254a5 [CodeGen] Define and use MachineOperand::getOperandNo
This is a helper function to very slightly simplify many calls to
MachineInstruction::getOperandNo.

Differential Revision: https://reviews.llvm.org/D143250
2023-02-07 11:50:57 +00:00
Kazu Hirata
caa99a01f5 Use llvm::popcount instead of llvm::countPopulation(NFC) 2023-01-22 12:48:51 -08:00
Craig Topper
e72ca520bb [CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC
Use isPhysical/isVirtual methods.

Reviewed By: foad

Differential Revision: https://reviews.llvm.org/D141715
2023-01-13 14:38:08 -08:00
Thomas Preud'homme
c8be35293c [SWP] Recognize mem carried dep with different base
The loop-carried dependency detection logic in isLoopCarriedDep relies
on the load and store using the same definition for the base register.
This misses the case of post-increment loads and stores whose base
register are different PHI initialized from the same initial value.

This commit extends the logic to accept the load and store having
different PHI base address provided that they had the same initial value
when entering the loop and are incremented by the same amount in each
loop.

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D136463
2022-11-07 09:53:41 +00:00
Yuta Mukai
116838b151 [MachinePipeliner] Fix the interpretation of the scheduling model
The method of counting resource consumption is modified to be based on
"Cycles" value when DFA is not used.

The calculation of ResMII is modified to total "Cycles" and divide it
by the number of units for each resource. Previously, ResMII was
excessive because it was assumed that resources were consumed for
the cycles of "Latency" value.

The method of resource reservation is modified similarly. When a
value of "Cycles" is larger than 1, the resource is considered to be
consumed by 1 for cycles of its length from the scheduled cycle.
To realize this, ResourceManager maintains a resource table for all
slots. Previously, resource consumption was always 1 for 1 cycle
regardless of the value of "Cycles" or "Latency".

In addition, the number of micro operations per cycle is modified to
be constrained by "IssueWidth". To disable the constraint,
--pipeliner-force-issue-width=100 can be used.

For the case of using DFA, the scheduling results are unchanged.

Reviewed By: dpenry

Differential Revision: https://reviews.llvm.org/D133572
2022-09-16 09:51:48 +09:00
David Penry
9aca7b0217 [ModuloScheduler] Fix missing LLVM_DEBUG
Guard a debug message with LLVM_DEBUG

Differential Revision: https://reviews.llvm.org/D132895
2022-08-30 09:20:37 -07:00
David Penry
ced705c440 [ModuloSchedule] Add interface call to accept/reject SMS schedules
This interface allows a target to reject a proposed
SMS schedule.  For Hexagon/PowerPC, all schedules
are accepted, leaving behavior unchanged.  For ARM,
schedules which exceed register pressure limits are
rejected.

Also, two RegisterPressureTracker methods now need to be public so
that register pressure can be computed by more callers.

Reapplication of D128941/(reversion:D132037) with small fix.

Differential Revision: https://reviews.llvm.org/D132170
2022-08-22 12:10:13 -07:00
David Penry
1c9f0408bc Revert "[ModuloSchedule] Add interface call to accept/reject SMS schedules"
This reverts commit 8c4aea438c310816bb4e4f9a32d783381ef3182e.

Needed because buildbot failures (warnings) gave a clue that there was
a functional bug in the ARM rejection logic.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D132037
2022-08-17 09:32:43 -07:00
David Penry
8c4aea438c [ModuloSchedule] Add interface call to accept/reject SMS schedules
This interface allows a target to reject a proposed
SMS schedule.  For Hexagon/PowerPC, all schedules
are accepted, leaving behavior unchanged.  For ARM,
schedules which exceed register pressure limits are
rejected.

Also, two RegisterPressureTracker methods now need to be public so
that register pressure can be computed by more callers.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D128941
2022-08-17 08:13:26 -07:00
Matt Arsenault
8d0383eb69 CodeGen: Remove AliasAnalysis from regalloc
This was stored in LiveIntervals, but not actually used for anything
related to LiveIntervals. It was only used in one check for if a load
instruction is rematerializable. I also don't think this was entirely
correct, since it was implicitly assuming constant loads are also
dereferenceable.

Remove this and rely only on the invariant+dereferenceable flags in
the memory operand. Set the flag based on the AA query upfront. This
should have the same net benefit, but has the possible disadvantage of
making this AA query nonlazy.

Preserve the behavior of assuming pointsToConstantMemory implying
dereferenceable for now, but maybe this should be changed.
2022-07-18 17:23:41 -04:00
Kazu Hirata
9e6d1f4b5d [CodeGen] Qualify auto variables in for loops (NFC) 2022-07-17 01:33:28 -07:00
Kazu Hirata
4271a1ff33 [llvm] Call *set::insert without checking membership first (NFC) 2022-06-18 10:17:22 -07:00