210 Commits

Author SHA1 Message Date
Nikita Popov
f137c3d592
[TargetRegistry] Accept Triple in createTargetMachine() (NFC) (#130940)
This avoids doing a Triple -> std::string -> Triple round trip in lots
of places, now that the Module stores a Triple.
2025-03-12 17:35:09 +01:00
Nikita Popov
979c275097
[IR] Store Triple in Module (NFC) (#129868)
The module currently stores the target triple as a string. This means
that any code that wants to actually use the triple first has to
instantiate a Triple, which is somewhat expensive. The change in #121652
caused a moderate compile-time regression due to this. While it would be
easy enough to work around, I think that architecturally, it makes more
sense to store the parsed Triple in the module, so that it can always be
directly queried.

For this change, I've opted not to add any magic conversions between
std::string and Triple for backwards-compatibility purposes, and instead
write out needed Triple()s or str()s explicitly. This is because I think
a decent number of them should be changed to work on Triple as well, to
avoid unnecessary conversions back and forth.

The only interesting part in this patch is that the default triple is
Triple("") instead of Triple() to preserve existing behavior. The former
defaults to using the ELF object format instead of unknown object
format. We should fix that as well.
2025-03-06 10:27:47 +01:00
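As an illustration of the cost argument in 979c275097 above, here is a minimal sketch of storing the parsed form once instead of re-parsing a string at every query. `MiniTriple` and `MiniModule` are hypothetical stand-ins invented for this example; they are not `llvm::Triple` or `llvm::Module`, and the real classes parse far more than the architecture field.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for a parsed triple; this is not llvm::Triple.
class MiniTriple {
  std::string Data; // original spelling, kept so str() needs no work
  std::string Arch; // first '-'-separated component, parsed once
public:
  explicit MiniTriple(const std::string &Str)
      : Data(Str), Arch(Str.substr(0, Str.find('-'))) {}
  const std::string &str() const { return Data; }
  const std::string &getArch() const { return Arch; }
};

// Hypothetical module that stores the parsed triple rather than a string,
// so callers can query it directly without constructing a new object.
class MiniModule {
  MiniTriple TT;
public:
  MiniModule() : TT("") {}
  void setTargetTriple(MiniTriple T) { TT = std::move(T); }
  const MiniTriple &getTargetTriple() const { return TT; }
};
```

Because the constructor is `explicit`, callers have to write the conversion out, which mirrors the commit's choice not to add implicit conversions between the string and parsed forms.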
Alex Bradbury
dd662d8028
[RISCV] Handle ADD in RISCVInstrInfo::isCopyInstrImpl (#81123)
Split out from #77610 and features a test, as a buggy version of this
caused a regression when landing that patch (the previous version had a
typo picking the wrong register as the source).

This is also motivated by future changes to MachineCopyPropagation which will use this information to determine if we have been left with a nop mv.
2025-03-05 12:29:04 +00:00
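To make the "wrong register as the source" hazard in dd662d8028 concrete, here is a small sketch of recognizing an ADD with the zero register as a copy. The `Inst` and `CopyPair` types and the `isCopy` helper are hypothetical and only model the idea; they are not the `RISCVInstrInfo` code.

```cpp
// Hypothetical three-register instruction; register 0 plays the role of x0.
enum class Opcode { ADD, SUB };
struct Inst {
  Opcode Op;
  unsigned Rd, Rs1, Rs2;
};
struct CopyPair {
  unsigned Dst, Src;
};
const unsigned ZeroReg = 0;

// Returns true and fills Out when I is "add rd, x0, rs" or "add rd, rs, x0".
// The source is the operand that is NOT the zero register; picking the other
// operand is the kind of typo the commit message describes.
bool isCopy(const Inst &I, CopyPair &Out) {
  if (I.Op != Opcode::ADD)
    return false;
  if (I.Rs1 == ZeroReg) {
    Out = CopyPair{I.Rd, I.Rs2};
    return true;
  }
  if (I.Rs2 == ZeroReg) {
    Out = CopyPair{I.Rd, I.Rs1};
    return true;
  }
  return false;
}
```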
Farzon Lotfi
dc764f5c68
[DirectX] initialize registers properties by calling addRegisterClass and computeRegisterProperties (#128818)
This fixes #126784 for the DirectX backend.
This bug was marked critical for DX so it is going to go in first. At
least one register class needs to be added via `addRegisterClass` for
`RegClassForVT` to be valid.
Further for costing information used by loop unroll and other
optimizations to be valid we need to call `computeRegisterProperties`.
This change does both of these.

The test cases confirm that we can fetch costing information off of
`getRegisterInfo` and that `DirectXTargetLowering` maps `i32` typed
registers to `DXILClassRegClass`.
2025-02-27 10:35:14 -05:00
Ashley Coleman
02c9dae814
[HLSL] Add support to lookup a ResourceBindingInfo from its use (#126556)
Adds `findByUse` which takes a `llvm::Value` from a use and resolves it
(as best as possible) back to the creation of that resource.

It may return multiple ResourceBindingInfo if the use comes from
branched control flow.

Fixes #125746
2025-02-18 17:29:23 -07:00
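The note in 02c9dae814 that a lookup may return several results under branched control flow can be pictured with a toy value graph. The `Node` type and `findCreators` function below are hypothetical and are not the `findByUse` implementation; they only show why walking back through a merge point yields more than one creator.

```cpp
#include <set>
#include <vector>

// Hypothetical value graph: a node either creates a resource or forwards
// its operands, as a phi or select would.
struct Node {
  bool CreatesResource;
  std::vector<const Node *> Operands;
};

void collectCreators(const Node *N, std::set<const Node *> &Seen,
                     std::vector<const Node *> &Out) {
  if (!Seen.insert(N).second)
    return; // already visited; this also guards against cycles
  if (N->CreatesResource) {
    Out.push_back(N);
    return;
  }
  for (const Node *Op : N->Operands)
    collectCreators(Op, Seen, Out);
}

// Walks from a use back to every creating node that can reach it.
std::vector<const Node *> findCreators(const Node *Use) {
  std::set<const Node *> Seen;
  std::vector<const Node *> Out;
  collectCreators(Use, Seen, Out);
  return Out;
}
```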
Matt Arsenault
ab2d330fea
TableGen: Generate reverseComposeSubRegIndices (#127050)
This is necessary to enable composing subregisters in peephole-opt.
For now use a brute force table to find the return value. The worst
case target is AMDGPU with a 399 x 399 entry table.
2025-02-17 22:11:26 +07:00
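The brute-force table mentioned in ab2d330fea can be sketched as follows. This assumes that "reverse compose" means: given indices A and B, find some X such that composing A with X gives B. The `Table` alias, the function name, and the convention that index 0 means "no subregister index" are assumptions for illustration, not the TableGen-generated code.

```cpp
#include <cstddef>
#include <vector>

// Compose[A][X] holds the index obtained by composing A with X,
// or 0 when the composition is not defined (assumed convention).
using Table = std::vector<std::vector<unsigned>>;

// Builds Rev so that Rev[A][B] is some X with Compose[A][X] == B,
// or 0 if no such X exists. Compose must be square with entries < N.
Table buildReverseCompose(const Table &Compose) {
  const std::size_t N = Compose.size();
  Table Rev(N, std::vector<unsigned>(N, 0));
  for (std::size_t A = 1; A < N; ++A) {
    for (std::size_t X = 1; X < N; ++X) {
      const unsigned B = Compose[A][X];
      if (B != 0 && Rev[A][B] == 0)
        Rev[A][B] = static_cast<unsigned>(X); // keep the first match
    }
  }
  return Rev;
}
```

The result has one entry per pair of indices, which is where a size such as 399 x 399 comes from when a target has that many subregister indices.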
Vyacheslav Levytskyy
df122fc734
[SPIR-V] Change a way SPIR-V Backend API works with user facing options (#124745)
This PR fixes https://github.com/llvm/llvm-project/issues/124703:
* added a new API call `SPIRVTranslate` that is to entirely replace the old
`SPIRVTranslateModule` once existing clients switch to the new
function;
* the new `SPIRVTranslate` doesn't require option parsing, replacing the
`Opts` argument with explicit `CodeGenOptLevel` and `Triple` arguments;
* the old `SPIRVTranslateModule` call is a wrapper for `SPIRVTranslate`,
it doesn't require option parsing either and doesn't hold any logic
inside except for converting string options into `CodeGenOptLevel` and
`Triple` arguments;
* usage of the extensions list is reworked to avoid writes to the global
cl::opt variable `lib/Target/SPIRV/SPIRVSubtarget.cpp::Extensions` --
instead a new class member in SPIRVSubtarget.cpp is implemented that
allows replacing the supported extensions after SPIRVSubtarget.cpp is
created;
* both API calls don't require option parsing and don't write to global
cl::opt variables.

Other related/required changes:
* SPIRV::Capability::Shader is marked as a capability of lesser
priority for the OpenCL environment (to remediate the absence of the
"avoid-spirv-capabilities" command line option in API calls);
* unit tests are updated and extended to cover testing of a newer API
call;
* the old API call is marked with a TODO to remove it after existing clients
switch to the new function.
2025-01-28 17:33:11 +01:00
Vyacheslav Levytskyy
ac94fade60
[SPIR-V] Rename internal command line flags for optimization level and mtriple used when passing options into the translate API call (#123975)
Rename internal command line flags for optimization level and mtriple
used when passing options into the translate API call.
2025-01-22 23:16:49 +01:00
Vyacheslav Levytskyy
3ff9368e58
[SPIR-V] Ensure that Module resource is managed locally wrt. a unit test case and fix a memory leak (#123725)
Adding SPIRV to LLVM_ALL_TARGETS
(https://github.com/llvm/llvm-project/pull/119653) revealed a series of
minor compilation problems and sanitizer complaints. This PR moves
unit test resources (a Module ptr) from class scope to the local
scope of the class member function, so that the ptr is released before
the test env is torn down.
2025-01-21 12:37:02 +01:00
Emma Pilkington
dc0e258fe4
[AMDGPU] Remove Dwarf encodings for subregisters (#117891)
Previously, registers and subregisters mapped to the same Dwarf
encoding. We don't really have any way to refer to subregisters directly
from Dwarf; the expression emitter should instead use DW_OPs to stencil
out the subregister from the whole register. This was also confusing
tools that need to map back to the llvm reg (e.g. dwarfdump), since
getLLVMRegNum() would arbitrarily return the _LO16 register.
2025-01-06 14:51:16 -05:00
paperchalice
1562b70eaf
Reapply "[DomTreeUpdater] Move critical edge splitting code to updater" (#119547)
This relands commit #115111.
Use the traditional way to update the post dominator tree, i.e. break critical
edge splitting into an insert, insert, delete sequence.
When splitting critical edges, the post dominator tree may change its
root node, and `setNewRoot` only works on a normal dominator tree...
See

6c7e5827ed/llvm/include/llvm/Support/GenericDomTree.h (L684-L687)
2024-12-13 11:43:09 +08:00
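The insert, insert, delete sequence described in 1562b70eaf can be written out as a list of CFG updates. The `Update` type and `splitCriticalEdgeUpdates` helper are hypothetical names for this sketch, and the relative order of the two inserts is an assumption; the actual updater API is not reproduced here.

```cpp
#include <string>
#include <vector>

enum class UpdateKind { Insert, Delete };

// Hypothetical CFG edge update, identified by block names.
struct Update {
  UpdateKind Kind;
  std::string From, To;
};

// Splitting the edge Pred -> Succ with a new block NewBB is expressed as
// two edge insertions followed by one edge deletion.
std::vector<Update> splitCriticalEdgeUpdates(const std::string &Pred,
                                             const std::string &Succ,
                                             const std::string &NewBB) {
  std::vector<Update> Updates;
  Updates.push_back({UpdateKind::Insert, Pred, NewBB});
  Updates.push_back({UpdateKind::Insert, NewBB, Succ});
  Updates.push_back({UpdateKind::Delete, Pred, Succ});
  return Updates;
}
```

Expressing the split as ordinary edge updates means the tree is recomputed through the generic insert and delete paths, which do not rely on the root staying fixed.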
paperchalice
553058f825
Revert "[DomTreeUpdater] Move critical edge splitting code to updater" (#119512)
Reverts llvm/llvm-project#115111. Causes #119511.
2024-12-11 14:25:17 +08:00
paperchalice
79047fac65
[DomTreeUpdater] Move critical edge splitting code to updater (#115111)
Support critical edge splitting in dominator tree updater. Continue the
work in #100856.

Compile time check:
https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&stat=instructions%3Au
2024-12-11 11:31:42 +08:00
Jon Roelofs
b6c22a4e58
Add processor aliases back to -print-supported-cpus and -mcpu=help (#118581)
They were accidentally dropped in
https://github.com/llvm/llvm-project/pull/96249

rdar://140853882
2024-12-09 09:18:31 -08:00
Nathan Gauër
45b567be8d
[SPIR-V] Add partial order tests, assert reducible (#117887)
Add testing for the visitor and add a note explaining that irreducible CFGs
are not supported.
Related to #116692

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-11-28 16:33:01 +01:00
Nathan Gauër
53326ee0cf
[SPIR-V] Fix block sorting with irreducible CFG (#116996)
Block sorting was assuming a reducible CFG, meaning we always had a best
node to continue with. An irreducible CFG breaks this assumption, so
the algorithm looped indefinitely because no node was a valid candidate.

Fixes #116692

---------

Signed-off-by: Nathan Gauër <brioche@google.com>
2024-11-28 13:42:57 +01:00
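The failure mode in 53326ee0cf (no block is ever a valid candidate, so the sort never terminates) can be reproduced and avoided in a few lines. The ordering below is a generic sketch and is not the algorithm from the patch: it prefers a block whose predecessors are all placed, and falls back to the lowest-numbered unplaced block when a cycle leaves no such candidate. The function name and the predecessor-list representation are assumptions.

```cpp
#include <cstddef>
#include <vector>

// Preds[B] lists the predecessors of block B; block 0 is the entry.
std::vector<std::size_t>
orderBlocks(const std::vector<std::vector<std::size_t>> &Preds) {
  const std::size_t N = Preds.size();
  std::vector<bool> Placed(N, false);
  std::vector<std::size_t> Order;
  while (Order.size() < N) {
    std::size_t Pick = N; // N means "no candidate yet"
    // Preferred candidate: an unplaced block with all predecessors placed.
    for (std::size_t B = 0; B < N && Pick == N; ++B) {
      if (Placed[B])
        continue;
      bool Ready = true;
      for (std::size_t P : Preds[B])
        Ready = Ready && Placed[P];
      if (Ready)
        Pick = B;
    }
    // Fallback: every remaining block waits on another one (a cycle).
    // Taking the lowest-numbered one guarantees progress.
    for (std::size_t B = 0; B < N && Pick == N; ++B)
      if (!Placed[B])
        Pick = B;
    Placed[Pick] = true;
    Order.push_back(Pick);
  }
  return Order;
}
```

For the irreducible graph with edges 0->1, 0->2, 1->2 and 2->1, blocks 1 and 2 each wait on the other, so only the fallback lets the loop finish.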
Sander de Smalen
318c69de52
Reland "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)"
The issue with slow compile-time was caused by an assert in
AArch64RegisterInfo.cpp. The assert invokes 'checkAllSuperRegsMarked'
after adding all the reserved registers. This call gets very expensive
after adding the _HI registers due to the way the function searches
in the 'Exception' list, which is expected to be a small list but isn't
(the patch added 190 _HI regs).

It was possible to rewrite the code in such a way that the _HI registers
are marked as reserved after the check. This makes the problem go away
entirely and restores compile-time to what it was before (tested for
`check-runtimes`, which previously showed a ~5x slowdown).

This reverts commits:
  1434d2ab215e3ea9c5f34689d056edd3d4423a78
  2704647fb7986673b89cef1def729e3b022e2607
2024-11-27 13:31:59 +00:00
Vitaly Buka
1434d2ab21
Revert "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)" (#117307)
Details in #114827

This reverts commit c1c68baf7e0fcaef1f4ee86b527210f1391b55f6.
2024-11-22 11:48:25 -08:00
Matin Raayai
bb3f5e1fed
Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234)
Following discussions in #110443, and the following earlier discussions
in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html,
https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this
PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine`
interface classes. More specifically:
1. Makes `TargetMachine` the only class implemented under
`TargetMachine.h` in the `Target` library.
2. `TargetMachine` contains target-specific interface functions that
relate to IR/CodeGen/MC constructs, whereas before (at least on paper)
it was supposed to have only IR/MC constructs. Any Target that doesn't
want to use the independent code generator simply does not implement
them, and returns either `false` or `nullptr`.
3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming
aims to make the purpose of `LLVMTargetMachine` clearer. Its interface
was moved under the CodeGen library, to further emphasize its usage in
Targets that use CodeGen directly.
4. Makes `TargetMachine` the only interface used across LLVM and its
projects. With these changes, `CodeGenCommonTMImpl` is simply a set of
shared function implementations of `TargetMachine`, and CodeGen users
don't need to static cast to `LLVMTargetMachine` every time they need a
CodeGen-specific feature of the `TargetMachine`.
5. More importantly, does not change any requirements regarding library
linking.

cc @arsenm @aeubanks
2024-11-14 13:30:05 -08:00
Sander de Smalen
c1c68baf7e
[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827)
This is a step towards enabling subreg liveness tracking for AArch64,
which requires that registers are fully covered by their subregisters,
as covered here #109797.

There are several changes in this patch:

* AArch64RegisterInfo.td and tests: Define the high bits like B0_HI,
H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some
register class, this added a register class which meant that we had to
update 'magic numbers' in several tests.

The use of ComposedSubRegIndex helped 'compress' the number of bits
required for the lanemask. The correctness of the masks is tested by
explicit unit tests.

* LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for
register tuples, but with this change to describe the high bits, a
register like 'D0' will also have 'HasDisjunctSubRegs' set to true
(because it's fully covered by S0 and S0_HI). The fix here is to
explicitly test if the register class is one of the known D/Q/Z tuples.
2024-11-14 09:09:13 +00:00
Yingwei Zheng
cacbe71af7
[Analysis] Avoid running transform passes that have just been run (#112092)
This patch adds a new analysis pass to track a set of passes and their
parameters to see if we can avoid running transform passes that have
just been run. The current implementation only skips redundant
InstCombine runs. I will add support for other passes in follow-up
patches.

RFC link:
https://discourse.llvm.org/t/rfc-pipeline-avoid-running-transform-passes-that-have-just-been-run/82467

Compile time improvement:
http://llvm-compile-time-tracker.com/compare.php?from=76007138f4ffd4e0f510d12b5e8cad529c21f24d&to=64134cf07ea7eb39c60320087c0c5afdc16c3a2b&stat=instructions%3Au
2024-11-07 07:52:14 +08:00
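The idea in cacbe71af7 of tracking which passes have just run can be sketched with a small bookkeeping class. `LastRunTracker` and its methods are hypothetical, and the sketch assumes each tracked pass is idempotent (running it twice in a row without intervening changes does nothing the second time); it is not the analysis added by the patch.

```cpp
#include <set>
#include <string>

// Hypothetical tracker keyed by "pass name + parameters" strings.
class LastRunTracker {
  // Passes that have run with no IR change by any pass since.
  std::set<std::string> UpToDate;

public:
  // True if running PassID again now would be redundant.
  bool shouldSkip(const std::string &PassID) const {
    return UpToDate.count(PassID) != 0;
  }

  // Record that PassID just ran; Changed says whether it modified the IR.
  void recordRun(const std::string &PassID, bool Changed) {
    if (Changed)
      UpToDate.clear(); // earlier passes may have new work to do
    UpToDate.insert(PassID); // assumed idempotent, so it is up to date
  }
};
```

A pipeline would consult `shouldSkip` before running a tracked pass and call `recordRun` afterwards.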
Petar Avramovic
84b7bcfcac
GlobalISel/MachineIRBuilder: Construct DstOp with VRegAttrs (#113581)
Allow construction of DstOp with VRegAttrs.
Also allow construction with register class or bank and LLT.
Intended to be used in lowering code for reg-bank-select where
new registers need to have both register bank and LLT.
Add support for new type of DstOp in CSEMIRBuilder.
2024-10-30 14:15:42 +01:00
Franklin
e45f9aa7fa
[AArch64] Initial sched model for Neoverse N3 (#106371)
References:

* Arm Neoverse N3 Software Optimization Guide
* Arm A64 Instruction Set for A-profile architecture
2024-09-19 19:22:24 +01:00
JOE1994
459a82e689
[llvm][unittests] Don't call raw_string_ostream::flush() (NFC)
raw_string_ostream::flush() is essentially a no-op (also specified in docs).
Don't call it in tests that aren't meant to test 'raw_string_ostream' itself.

p.s. remove a few redundant calls to raw_string_ostream::str()
2024-09-13 19:55:44 -04:00
Lu Weining
ffcebcdb96
[LoongArch] Implement Statepoint lowering (#108212)
The functionality has been validated in OpenHarmony's arkcompiler.
2024-09-12 18:05:13 +08:00
Vyacheslav Levytskyy
bca2b6d23f
[SPIR-V] Expose an API call to initialize SPIRV target and translate input LLVM IR module to SPIR-V (#107216)
The goal of this PR is to facilitate integration of SPIRV Backend into
misc 3rd party tools and libraries by means of exposing an API call that
translate LLVM module to SPIR-V and write results into a string as
binary SPIR-V output, providing diagnostics on fail and means of
configuring translation in a style of command line options.

An example of a use case may be Khronos Translator that provides
bidirectional translation LLVM IR <=> SPIR-V, where LLVM IR => SPIR-V
step may be substituted by the call to SPIR-V Backend API, implemented
by this PR.
2024-09-10 15:51:20 +02:00
Luke Lau
3d729571fd
[RISCV] Model dest EEW and fix peepholes not checking EEW (#105945)
Previously for vector peepholes that fold based on VL, we checked if the
VLMAX is the same as a proxy to check that the EEWs were the same. This
only worked at LMUL >= 1 because the EMULs of the Src output and user's
input had to be the same because the register classes needed to match.

At fractional LMULs we would have incorrectly folded something like
this:

    %x:vr = PseudoVADD_VV_MF4 $noreg, $noreg, $noreg, 4, 4 /* e16 */, 0
    %y:vr = PseudoVMV_V_V_MF8 $noreg, %x, 4, 3 /* e8 */, 0

This models the EEW of the destination operands of vector instructions
with a TSFlag, which is enough to fix the incorrect folding.

There's some overlap with the TargetOverlapConstraintType and
IsRVVWideningReduction. If we model the source operands as well we may
be able to subsume them.
2024-09-05 15:27:48 +08:00
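The arithmetic behind the fractional-LMUL example in 3d729571fd can be checked directly from the RVV relation VLMAX = LMUL * VLEN / SEW. The helper name and the choice of VLEN = 128 in the usage note are assumptions for illustration.

```cpp
// VLMAX = LMUL * VLEN / SEW, with LMUL passed as the fraction Num/Den
// so that fractional settings such as MF4 (1/4) and MF8 (1/8) fit.
constexpr unsigned vlmax(unsigned VLen, unsigned SEW, unsigned LMulNum,
                         unsigned LMulDen) {
  return VLen * LMulNum / (SEW * LMulDen);
}
```

With VLEN = 128, `vlmax(128, 16, 1, 4)` (e16 at MF4) and `vlmax(128, 8, 1, 8)` (e8 at MF8) both evaluate to 2. The two instructions in the example therefore share a VLMAX while their element widths differ (16 versus 8), which is why equal VLMAX cannot stand in for equal EEW at fractional LMUL.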
Heejin Ahn
aecbc92410
[WebAssembly] Rename CATCH/CATCH_ALL to *_LEGACY (#107187)
This renames MIR instruction `CATCH` and `CATCH_ALL` to `CATCH_LEGACY`
and `CATCH_ALL_LEGACY` respectively.

Follow-up PRs for the new EH (exnref) implementation will use `CATCH`,
`CATCH_REF`, `CATCH_ALL`, and `CATCH_ALL_REF` as pseudo-instructions
that return extracted values or `exnref` or both, because we don't
currently support block return values in LLVM. So to give the old (real)
`CATCH`es and the new (pseudo) `CATCH`es different names, this attaches a
`_LEGACY` suffix to the old names.

This also rearranges `WebAssemblyInstrControl.td` so that the old legacy
instructions are listed all together at the end.
2024-09-04 16:14:13 -07:00
Kyungwoo Lee
38c3855c9f
[NFC] Remove unused argument (FuncName) for parseMIR (#106144)
While working on a MIR unittest, I noticed that parseMIR includes an
unused argument that sets a function name. This is not only redundant
but also irrelevant, as parseMIR is designed to parse an entire module, not
specific functions, even though most unittests contain a single function
per module. To streamline the API, I have removed this unnecessary
argument from parseMIR. However, if this argument was originally
included to enhance readability or for any other purpose, please let me
know.
2024-08-26 19:19:02 -07:00
Matt Arsenault
63e1647827
CodeGen: Remove MachineModuleInfo reference from MachineFunction (#100357)
This avoids another unserializable field. Move the DbgInfoAvailable
field into the AsmPrinter, where it is really only a cache/convenience
bit for a direct IR module metadata check.
2024-07-26 13:10:08 +04:00
Nikita Popov
6a907699d8
Revert "[CodeGen] Remove applySplitCriticalEdges in MachineDominatorTree (#97055)"
This reverts commit c5e5088033fed170068d818c54af6862e449b545.

Causes large compile-time regressions.
2024-07-11 09:13:37 +02:00
paperchalice
c5e5088033
[CodeGen] Remove applySplitCriticalEdges in MachineDominatorTree (#97055)
Summary:
- Remove wrappers in `MachineDominatorTree`.
- Remove `MachineDominatorTree` update code in
`MachineBasicBlock::SplitCriticalEdge`.
- Use `MachineDomTreeUpdater` in passes which call
`MachineBasicBlock::SplitCriticalEdge` and preserve
`MachineDominatorTreeWrapperPass` or CFG analyses.

Commit abea99f65a97248974c02a5544eaf25fc4240056 introduced related
methods in 2014. Now that we have the SemiNCA-based dominator tree (since
2017) and the dominator tree updater, that solution seems a bit outdated.
2024-07-11 11:08:05 +08:00
Kazu Hirata
75bc20ff89
[llvm] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#97914)
2024-07-07 08:23:41 +09:00
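Commit 75bc20ff89 has no description, so the following is only a generic illustration of what a redundant `std::unique_ptr<T>::get` call looks like. `Widget` and both functions are hypothetical; which call sites the commit actually touched is not stated in the log.

```cpp
#include <memory>
#include <string>

struct Widget {
  std::string Name;
};

// Verbose form: get() is unnecessary for both the null test and the
// member access.
std::string nameVerbose(const std::unique_ptr<Widget> &P) {
  if (P.get() != nullptr)
    return P.get()->Name;
  return "<null>";
}

// Equivalent form: unique_ptr converts to bool in a condition and
// provides operator-> itself.
std::string nameIdiomatic(const std::unique_ptr<Widget> &P) {
  if (P)
    return P->Name;
  return "<null>";
}
```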
Nikita Popov
74deadf196
[IRBuilder] Don't include Module.h (NFC) (#97159)
This used to be necessary to fetch the DataLayout, but isn't anymore.
2024-06-29 15:05:04 +02:00
Fangrui Song
3f24561bc1
[unittest] Include Module.h after #97023
2024-06-28 09:40:27 -07:00
Nikita Popov
4169338e75
[IR] Don't include Module.h in Analysis.h (NFC) (#97023)
Replace it with a forward declaration instead. Analysis.h is pulled in
by all passes, but not all passes need to access the module.
2024-06-28 14:30:47 +02:00
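The include-to-forward-declaration swap in 4169338e75 works because a header that only names a class in references or pointers does not need its definition. The sketch below folds two hypothetical headers into one file; the toy `Module` class and `countFunctions` are invented for the example and are unrelated to the real `Module.h` or `Analysis.h` contents.

```cpp
// "analysis.h" (sketch): Module appears only by reference, so a forward
// declaration is sufficient and the full header need not be included.
class Module;
unsigned countFunctions(const Module &M);

// "module.h" (sketch): the full definition, required only by code that
// touches members of Module.
class Module {
public:
  unsigned NumFunctions = 0;
};

// "analysis.cpp" (sketch): this file does use a member, so it is the one
// that would include the full definition.
unsigned countFunctions(const Module &M) { return M.NumFunctions; }
```

The flip side is visible in 3f24561bc1 and c5f5238a4a: files that relied on the transitive include have to add the header they actually need.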
Fangrui Song
c5f5238a4a
[SPIRV] #include "llvm/IR/PassInstrumentation.h"
2024-06-21 21:40:38 -07:00
Janek van Oirschot
3d1705d00c
MCExpr-ify AMDGPU PALMetadata (#93236)
Allows MCExprs as passed values to PALMetadata. Also adds related
`DelayedMCExpr` classes which serve as a pseudo-fixup to resolve MCExprs
as late as possible (i.e., right before emit through string or blob,
where they should be resolvable).
2024-06-13 13:59:31 +01:00
paperchalice
837dc542b1
[CodeGen][NewPM] Split MachineDominatorTree into a concrete analysis result (#94571)
Prepare for new pass manager version of `MachineDominatorTreeAnalysis`.
We may need a machine dominator tree version of `DomTreeUpdater` to
handle `SplitCriticalEdge` in some CodeGen passes.
2024-06-11 21:27:14 +08:00
Jay Foad
4d65887aac
[LLVM] Remove executable permission from some non-executable files (#93803)
2024-05-30 12:35:10 +01:00
Michael Kruse
4ecbfacf9e
[llvm] Revise IDE folder structure (#89741)
Update the folder titles for targets in the monorepository that have not
been taken care of for some time. These are the folders in which targets are
organized in Visual Studio and XCode
(`set_property(TARGET <target> PROPERTY FOLDER "<title>")`)
when using the respective CMake IDE generator.

 * Ensure that every target is in a folder
 * Use a folder hierarchy with each LLVM subproject as a top-level folder
 * Use consistent folder names between subprojects
 * When using target-creating functions from AddLLVM.cmake, automatically
deduce the folder. This reduces the number of
`set_property`/`set_target_property` calls, but they are still necessary when
`add_custom_target`, `add_executable`, `add_library`, etc. are used. An
LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's
root CMakeLists.txt.
2024-05-25 13:28:30 +02:00
Graham Hunter
36a3f8f647
[TTI][TLI][AArch64] Support scalable immediates with isLegalAddImmediate (#84173)
Adds a second parameter (defaulting to 0) to isLegalAddImmediate, to
represent a scalable immediate.

Extends the AArch64 implementation to match immediates based on what addvl and inc[h|w|d] support.
2024-03-20 10:28:46 +00:00
Graham Hunter
cd768ec983
[AArch64] Support scalable offsets with isLegalAddressingMode (#83255)
Allows us to indicate that an addressing mode featuring a
vscale-relative immediate offset is supported.
2024-03-20 10:13:20 +00:00
Emma Pilkington
538aeb180b
[AMDGPU] Use a consistent DwarfEH register flavour (#84513)
Previously, we always used the wave64 encodings for EH registers
regardless of whether we were compiling for wave32, which seems wrong.
We don't seem to use the EH registers, so this commit is mostly just
about papering over code that converts from non-EH dwarf registers to
LLVM registers while claiming they are EH dwarf registers. That kind of
code should be okay on any non-darwin target (since darwin is the only
target that uses a different encoding for EH registers).
2024-03-11 10:36:38 -04:00
Sivan Shani
5e688f0dbd
[llvm][arm] add T1 and T2 assembly options for vlldm and vlstm
Re-land 634b0243b8f7acc85af4f16b70e91d86ded4dc83.

T1 allows an optional register list,
which must be {d0-d15}.
T2 defines a mandatory register list,
which must be {d0-d31}.

The requirements for T1/T2 are as follows:
                T1              T2
Require:        v8-M.Main,      v8.1-M.Main,
                secure state    secure state
16 D Regs       valid           valid
32 D Regs       UNDEFINED       valid
No D Regs       NOP             NOP
2024-03-11 14:27:28 +00:00
David Green
d36d805373
[AArch64] Ensure Neoverse-N2 scheduling model includes all SVE pseudos.
Similar to #84187, this enables the existing test we have for checking the
scheduling info of the pseudos matches the real instructions, and adjusts the
scheduling info in the NeoverseN2 model to make sure all cases were handled.
2024-03-08 09:52:49 +00:00
David Green
23c658ac41
[AArch64] Ensure Neoverse V1 scheduling model includes all SVE pseudos. (#84187)
With the many pseudos used in SVE codegen it can be too easy to miss
instructions. This enables the existing test we have for checking the
scheduling info of the pseudos matches the real instructions, and
adjusts the scheduling info in the NeoverseV1 model to make sure all are
handled. Where I could, I opted to use the same info as in the
NeoverseV2 model, to keep the differences smaller.
2024-03-08 07:09:33 +00:00
Krzysztof Parzyszek
954f891af2
[Unittests] Fix RISCV unit tests build
/usr/bin/ld: CMakeFiles/RISCVTests.dir/RISCVInstrInfoTest.cpp.o: undefined
reference to symbol '_ZNK4llvm12LocationSize5printERNS_11raw_ostreamE'
/usr/bin/ld: /work/kparzysz/git/llvm.org/b/x86/lib/libLLVMAnalysis.so.19.0
git: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status

The undefined symbol is
llvm::LocationSize::print(llvm::raw_ostream&) const
2024-03-06 12:44:13 -06:00
David Green
44be5a7fdc
[Codegen] Make Width in getMemOperandsWithOffsetWidth a LocationSize. (#83875)
This is another part of #70452 which makes getMemOperandsWithOffsetWidth
use a LocationSize for Width, as opposed to the unsigned it currently
uses. The advantages on its own are not super high if
getMemOperandsWithOffsetWidth usually uses known sizes, but if the
values can come from an MMO it can help be more accurate in case they
are Unknown (and in the future, scalable).
2024-03-06 17:40:13 +00:00
Wang Pengcheng
85388a06b6
[RISCV] Move RISCVVType namespace to TargetParser (#83222)
Clang and some middle-end optimizations may need these helper
functions.

This can reduce some duplication.
2024-03-06 10:56:19 +08:00