llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-04-29 08:26:07 +00:00

Author	SHA1	Message	Date
Nikita Popov	f137c3d592	[TargetRegistry] Accept Triple in createTargetMachine() (NFC) (#130940 ) This avoids doing a Triple -> std::string -> Triple round trip in lots of places, now that the Module stores a Triple.	2025-03-12 17:35:09 +01:00
Nikita Popov	979c275097	[IR] Store Triple in Module (NFC) (#129868 ) The module currently stores the target triple as a string. This means that any code that wants to actually use the triple first has to instantiate a Triple, which is somewhat expensive. The change in #121652 caused a moderate compile-time regression due to this. While it would be easy enough to work around, I think that architecturally, it makes more sense to store the parsed Triple in the module, so that it can always be directly queried. For this change, I've opted not to add any magic conversions between std::string and Triple for backwards-compatibility purposes, and instead write out needed Triple()s or str()s explicitly. This is because I think a decent number of them should be changed to work on Triple as well, to avoid unnecessary conversions back and forth. The only interesting part in this patch is that the default triple is Triple("") instead of Triple() to preserve existing behavior. The former defaults to using the ELF object format instead of unknown object format. We should fix that as well.	2025-03-06 10:27:47 +01:00
Alex Bradbury	dd662d8028	[RISCV] Handle ADD in RISCVInstrInfo::isCopyInstrImpl (#81123 ) Split out from #77610 and features a test, as a buggy version of this caused a regression when landing that patch (the previous version had a typo picking the wrong register as the source). This is also motivated by future changes to MachineCopyPropagation which will use this information to determine if we have been left with a nop mv.	2025-03-05 12:29:04 +00:00
Farzon Lotfi	dc764f5c68	[DirectX] initialize registers properties by calling addRegisterClass and computeRegisterProperties (#128818 ) This fixes #126784 for the DirectX backend. This bug was marked critical for DX so it is going to go in first. At least one register class needs to be added via `addRegisterClass` for `RegClassForVT` to be valid. Further for costing information used by loop unroll and other optimizations to be valid we need to call `computeRegisterProperties`. This change does both of these. The test cases confirm that we can fetch costing information off of `getRegisterInfo` and that `DirectXTargetLowering` maps `i32` typed registers to `DXILClassRegClass`.	2025-02-27 10:35:14 -05:00
Ashley Coleman	02c9dae814	[HLSL] Add support to lookup a ResourceBindingInfo from its use (#126556 ) Adds `findByUse` which takes a `llvm::Value` from a use and resolves it (as best as possible) back to the creation of that resource. It may return multiple ResourceBindingInfo if the use comes from branched control flow. Fixes #125746	2025-02-18 17:29:23 -07:00
Matt Arsenault	ab2d330fea	TableGen: Generate reverseComposeSubRegIndices (#127050 ) This is necessary to enable composing subregisters in peephole-opt. For now use a brute force table to find the return value. The worst case target is AMDGPU with a 399 x 399 entry table.	2025-02-17 22:11:26 +07:00
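The brute-force table mentioned in the entry above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the TableGen-generated code: `SubRegIdx`, `ToySubRegs`, `compose` and `reverseCompose` are hypothetical names, a sub-register index is modelled as a plain bit range, and "reverse" is assumed to mean: given `A` and a composed result `C`, find some `B` with `compose(A, B) == C`.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical toy model: a sub-register index is a bit range in its parent.
struct SubRegIdx {
  unsigned Offset; // first bit covered, relative to the parent
  unsigned Size;   // number of bits covered
};

class ToySubRegs {
  std::vector<SubRegIdx> Indices;             // slot 0 is the null index
  std::vector<std::vector<unsigned>> Reverse; // Reverse[A][C] == B, or 0

public:
  explicit ToySubRegs(std::vector<SubRegIdx> Idx) : Indices(std::move(Idx)) {
    const unsigned N = static_cast<unsigned>(Indices.size());
    assert(N > 0 && "slot 0 must hold the null index");
    // Brute force: try every (A, B) pair once and remember which B gave C.
    Reverse.assign(N, std::vector<unsigned>(N, 0));
    for (unsigned A = 1; A < N; ++A)
      for (unsigned B = 1; B < N; ++B)
        if (unsigned C = compose(A, B))
          if (Reverse[A][C] == 0)
            Reverse[A][C] = B;
  }

  // Index that names "B taken inside A", or 0 if no such index exists.
  unsigned compose(unsigned A, unsigned B) const {
    if (A == 0)
      return B;
    if (B == 0)
      return A;
    const SubRegIdx &IA = Indices[A];
    const SubRegIdx &IB = Indices[B];
    if (IB.Offset + IB.Size > IA.Size)
      return 0; // B does not fit inside A
    for (unsigned C = 1; C < Indices.size(); ++C)
      if (Indices[C].Offset == IA.Offset + IB.Offset &&
          Indices[C].Size == IB.Size)
        return C;
    return 0;
  }

  // Table lookup: some B with compose(A, B) == C, or 0 if there is none.
  unsigned reverseCompose(unsigned A, unsigned C) const {
    return Reverse[A][C];
  }
};
```

The table costs N x N entries, which is the trade-off the commit message accepts "for now": lookups are a single array access, at the price of a quadratic table.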
Vyacheslav Levytskyy	df122fc734	[SPIR-V] Change a way SPIR-V Backend API works with user facing options (#124745 ) This PR fixes https://github.com/llvm/llvm-project/issues/124703: * added a new API call `SPIRVTranslate` that is to replace entirely old `SPIRVTranslateModule` after existing clients switch to the new function; * the new `SPIRVTranslate` doesn't require option parsing, replacing the `Opts` argument with explicit `CodeGenOptLevel` and `Triple` arguments; * the old `SPIRVTranslateModule` call is a wrapper for `SPIRVTranslate`, it doesn't require option parsing either and doesn't hold any logic inside except for converting string options into `CodeGenOptLevel` and `Triple` arguments; * usage of the extensions list is reworked to avoid writes to the global cl::opt variable `lib/Target/SPIRV/SPIRVSubtarget.cpp::Extensions` -- instead a new class member in SPIRVSubtarget.cpp is implemented that allows replacing supported extensions after SPIRVSubtarget.cpp is created; * both API calls don't require option parsing and don't write to global cl::opt variables. Other related/required changes: * SPIRV::Capability::Shader is marked as a capability of lesser priority for OpenCL environment (to remediate absence of the "avoid-spirv-capabilities" command line option in API calls); * unit tests are updated and extended to cover testing of a newer API call; * old API call is marked with TODO to remove it after existing clients switch to the new function.	2025-01-28 17:33:11 +01:00
Vyacheslav Levytskyy	ac94fade60	[SPIR-V] Rename internal command line flags for optimization level and mtriple used when passing options into the translate API call (#123975 ) Rename internal command line flags for optimization level and mtriple used when passing options into the translate API call.	2025-01-22 23:16:49 +01:00
Vyacheslav Levytskyy	3ff9368e58	[SPIR-V] Ensure that Module resource is managed locally wrt. a unit test case and fix a memory leak (#123725 ) Adding SPIRV to LLVM_ALL_TARGETS (https://github.com/llvm/llvm-project/pull/119653) revealed a series of minor compilation problems and sanitizer complaints. This PR is to move unit tests resources (a Module ptr) from the class-scope to a local scope of the class member function to be sure that before the test env is torn down the ptr is released.	2025-01-21 12:37:02 +01:00
Emma Pilkington	dc0e258fe4	[AMDGPU] Remove Dwarf encodings for subregisters (#117891 ) Previously, registers and subregisters mapped to the same Dwarf encoding. We don't really have any way to refer to subregisters directly from Dwarf, the expression emitter should instead use DW_OPs to stencil out the subregister from the whole register. This was also confusing tools that need to map back to the llvm reg (e.g. dwarfdump), since getLLVMRegNum() would arbitrarily return the _LO16 register.	2025-01-06 14:51:16 -05:00
paperchalice	1562b70eaf	Reapply "[DomTreeUpdater] Move critical edge splitting code to updater" (#119547 ) This relands commit #115111. Use traditional way to update post dominator tree, i.e. break critical edge splitting into insert, insert, delete sequence. When splitting critical edges, the post dominator tree may change its root node, and `setNewRoot` only works in normal dominator tree... See `6c7e5827ed/llvm/include/llvm/Support/GenericDomTree.h (L684-L687)`	2024-12-13 11:43:09 +08:00
paperchalice	553058f825	Revert "[DomTreeUpdater] Move critical edge splitting code to updater" (#119512 ) Reverts llvm/llvm-project#115111 Causes #119511	2024-12-11 14:25:17 +08:00
paperchalice	79047fac65	[DomTreeUpdater] Move critical edge splitting code to updater (#115111 ) Support critical edge splitting in dominator tree updater. Continue the work in #100856. Compile time check: https://llvm-compile-time-tracker.com/compare.php?from=87c35d782795b54911b3e3a91a5b738d4d870e55&to=42b3e5623a9ab4c3648564dc0926b36f3b438a3a&stat=instructions%3Au	2024-12-11 11:31:42 +08:00
Jon Roelofs	b6c22a4e58	Add processor aliases back to -print-supported-cpus and -mcpu=help (#118581 ) They were accidentally dropped in https://github.com/llvm/llvm-project/pull/96249 rdar://140853882	2024-12-09 09:18:31 -08:00
Nathan Gauër	45b567be8d	[SPIR-V] Add partial order tests, assert reducible (#117887 ) Add testing for the visitor and added a note explaining irreducible CFG are not supported. Related to #116692 --------- Signed-off-by: Nathan Gauër <brioche@google.com>	2024-11-28 16:33:01 +01:00
Nathan Gauër	53326ee0cf	[SPIR-V] Fix block sorting with irreducible CFG (#116996 ) Block sorting was assuming reducible CFG. Meaning we always had a best node to continue with. Irreducible CFG breaks this assumption, so the algorithm looped indefinitely because no node was a valid candidate. Fixes #116692 --------- Signed-off-by: Nathan Gauër <brioche@google.com>	2024-11-28 13:42:57 +01:00
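The failure mode described in the entry above (an ordering loop that waits for a "best" block that never appears) can be sketched generically. The code below is not the SPIR-V backend's algorithm; `orderBlocks` is a hypothetical helper that shows one way to guarantee progress: always place the reached block with the fewest unplaced predecessors, so a block with none is preferred and a cycle with two entries cannot stall the loop.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper, not the SPIR-V backend's code.
// Succs[B] lists the successors of block B; Entry is the entry block.
std::vector<int> orderBlocks(const std::vector<std::vector<int>> &Succs,
                             int Entry) {
  const int N = static_cast<int>(Succs.size());
  assert(Entry >= 0 && Entry < N && "entry block out of range");

  // Pending[B] = number of incoming edges whose source is not placed yet.
  std::vector<int> Pending(N, 0);
  for (const auto &Out : Succs)
    for (int S : Out)
      ++Pending[S];

  std::vector<bool> Placed(N, false), Reached(N, false);
  std::vector<int> Order;
  Reached[Entry] = true;

  for (int Next = Entry; Next != -1;) {
    Placed[Next] = true;
    Order.push_back(Next);
    for (int S : Succs[Next]) {
      --Pending[S];
      Reached[S] = true;
    }
    // Pick the reached, unplaced block with the fewest pending predecessors.
    // A block with zero pending edges wins; otherwise the smallest count is
    // taken, so every iteration places exactly one block and the loop ends.
    int Best = -1;
    for (int B = 0; B < N; ++B)
      if (Reached[B] && !Placed[B] &&
          (Best == -1 || Pending[B] < Pending[Best]))
        Best = B;
    Next = Best;
  }
  return Order;
}
```

For example, with edges 0→1, 0→2, 1→2, 2→1 and 2→3, blocks 1 and 2 form a cycle that can be entered at either block, so neither ever has all predecessors placed first; the fallback still picks one and the loop finishes after placing all four blocks.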
Sander de Smalen	318c69de52	Reland "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827 )" The issue with slow compile-time was caused by an assert in AArch64RegisterInfo.cpp. The assert invokes 'checkAllSuperRegsMarked' after adding all the reserved registers. This call gets very expensive after adding the _HI registers due to the way the function searches in the 'Exception' list, which is expected to be a small list but isn't (the patch added 190 _HI regs). It was possible to rewrite the code in such a way that the _HI registers are marked as reserved after the check. This makes the problem go away entirely and restores compile-time to what it was before (tested for `check-runtimes`, which previously showed a ~5x slowdown). This reverts commits: 1434d2ab215e3ea9c5f34689d056edd3d4423a78 2704647fb7986673b89cef1def729e3b022e2607	2024-11-27 13:31:59 +00:00
Vitaly Buka	1434d2ab21	Revert "[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827 )" (#117307 ) Details in #114827 This reverts commit c1c68baf7e0fcaef1f4ee86b527210f1391b55f6.	2024-11-22 11:48:25 -08:00
Matin Raayai	bb3f5e1fed	Overhaul the TargetMachine and LLVMTargetMachine Classes (#111234 ) Following discussions in #110443, and the following earlier discussions in https://lists.llvm.org/pipermail/llvm-dev/2017-October/117907.html, https://reviews.llvm.org/D38482, https://reviews.llvm.org/D38489, this PR attempts to overhaul the `TargetMachine` and `LLVMTargetMachine` interface classes. More specifically: 1. Makes `TargetMachine` the only class implemented under `TargetMachine.h` in the `Target` library. 2. `TargetMachine` contains target-specific interface functions that relate to IR/CodeGen/MC constructs, whereas before (at least on paper) it was supposed to have only IR/MC constructs. Any Target that doesn't want to use the independent code generator simply does not implement them, and returns either `false` or `nullptr`. 3. Renames `LLVMTargetMachine` to `CodeGenCommonTMImpl`. This renaming aims to make the purpose of `LLVMTargetMachine` clearer. Its interface was moved under the CodeGen library, to further emphasis its usage in Targets that use CodeGen directly. 4. Makes `TargetMachine` the only interface used across LLVM and its projects. With these changes, `CodeGenCommonTMImpl` is simply a set of shared function implementations of `TargetMachine`, and CodeGen users don't need to static cast to `LLVMTargetMachine` every time they need a CodeGen-specific feature of the `TargetMachine`. 5. More importantly, does not change any requirements regarding library linking. cc @arsenm @aeubanks	2024-11-14 13:30:05 -08:00
Sander de Smalen	c1c68baf7e	[AArch64] Define high bits of FPR and GPR registers (take 2) (#114827 ) This is a step towards enabling subreg liveness tracking for AArch64, which requires that registers are fully covered by their subregisters, as covered here #109797. There are several changes in this patch: * AArch64RegisterInfo.td and tests: Define the high bits like B0_HI, H0_HI, S0_HI, D0_HI, Q0_HI. Because the bits must be defined by some register class, this added a register class which meant that we had to update 'magic numbers' in several tests. The use of ComposedSubRegIndex helped 'compress' the number of bits required for the lanemask. The correctness of the masks is tested by explicit unit tests. * LoadStoreOptimizer: previously 'HasDisjunctSubRegs' was only true for register tuples, but with this change to describe the high bits, a register like 'D0' will also have 'HasDisjunctSubRegs' set to true (because it's fully covered by S0 and S0_HI). The fix here is to explicitly test if the register class is one of the known D/Q/Z tuples.	2024-11-14 09:09:13 +00:00
Yingwei Zheng	cacbe71af7	[Analysis] Avoid running transform passes that have just been run (#112092 ) This patch adds a new analysis pass to track a set of passes and their parameters to see if we can avoid running transform passes that have just been run. The current implementation only skips redundant InstCombine runs. I will add support for other passes in follow-up patches. RFC link: https://discourse.llvm.org/t/rfc-pipeline-avoid-running-transform-passes-that-have-just-been-run/82467 Compile time improvement: http://llvm-compile-time-tracker.com/compare.php?from=76007138f4ffd4e0f510d12b5e8cad529c21f24d&to=64134cf07ea7eb39c60320087c0c5afdc16c3a2b&stat=instructions%3Au	2024-11-07 07:52:14 +08:00
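The bookkeeping behind "avoid running a pass that has just been run" can be sketched as follows. This is a minimal illustration, not the analysis added by the patch: `LastRunTracker`, `shouldSkip` and `update` are hypothetical names, pass parameters are reduced to a string compared for equality, and each tracked pass is assumed to be idempotent (running it twice in a row changes nothing the second time).

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical tracker: remembers which passes have run since the IR last
// changed, together with the parameters they ran with.
class LastRunTracker {
  std::map<std::string, std::string> Tracked; // pass name -> parameters

public:
  // True if the pass already ran with the same parameters and nothing has
  // modified the IR since.
  bool shouldSkip(const std::string &Pass, const std::string &Params) const {
    auto It = Tracked.find(Pass);
    return It != Tracked.end() && It->second == Params;
  }

  // Record that Pass just ran. If it changed the IR, every earlier record is
  // stale, because other passes may now find new work to do.
  void update(const std::string &Pass, const std::string &Params,
              bool Changed) {
    assert(!Pass.empty() && "pass name must be non-empty");
    if (Changed)
      Tracked.clear();
    Tracked[Pass] = Params;
  }
};
```

A pipeline would call `shouldSkip` before a pass and `update` after it. Clearing on any change is deliberately conservative: it may miss some skips, but it does not skip a pass whose input has changed.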
Petar Avramovic	84b7bcfcac	GlobalISel/MachineIRBuilder: Construct DstOp with VRegAttrs (#113581 ) Allow construction of DstOp with VRegAttrs. Also allow construction with register class or bank and LLT. Intended to be used in lowering code for reg-bank-select where new registers need to have both register bank and LLT. Add support for new type of DstOp in CSEMIRBuilder.	2024-10-30 14:15:42 +01:00
Franklin	e45f9aa7fa	[AArch64] Initial sched model for Neoverse N3 (#106371 ) References: * Arm Neoverse N3 Software Optimization Guide * Arm A64 Instruction Set for A-profile architecture	2024-09-19 19:22:24 +01:00
JOE1994	459a82e689	[llvm][unittests] Don't call raw_string_ostream::flush() (NFC) raw_string_ostream::flush() is essentially a no-op (also specified in docs). Don't call it in tests that aren't meant to test 'raw_string_ostream' itself. p.s. remove a few redundant calls to raw_string_ostream::str()	2024-09-13 19:55:44 -04:00
Lu Weining	ffcebcdb96	[LoongArch] Implement Statepoint lowering (#108212 ) The functionality has been validated in OpenHarmony's arkcompiler.	2024-09-12 18:05:13 +08:00
Vyacheslav Levytskyy	bca2b6d23f	[SPIR-V] Expose an API call to initialize SPIRV target and translate input LLVM IR module to SPIR-V (#107216 ) The goal of this PR is to facilitate integration of SPIRV Backend into misc 3rd party tools and libraries by means of exposing an API call that translate LLVM module to SPIR-V and write results into a string as binary SPIR-V output, providing diagnostics on fail and means of configuring translation in a style of command line options. An example of a use case may be Khronos Translator that provides bidirectional translation LLVM IR <=> SPIR-V, where LLVM IR => SPIR-V step may be substituted by the call to SPIR-V Backend API, implemented by this PR.	2024-09-10 15:51:20 +02:00
Luke Lau	3d729571fd	[RISCV] Model dest EEW and fix peepholes not checking EEW (#105945 ) Previously for vector peepholes that fold based on VL, we checked if the VLMAX is the same as a proxy to check that the EEWs were the same. This only worked at LMUL >= 1 because the EMULs of the Src output and user's input had to be the same because the register classes needed to match. At fractional LMULs we would have incorrectly folded something like this: %x:vr = PseudoVADD_VV_MF4 $noreg, $noreg, $noreg, 4, 4 /* e16 */, 0 %y:vr = PseudoVMV_V_V_MF8 $noreg, %x, 4, 3 /* e8 */, 0 This models the EEW of the destination operands of vector instructions with a TSFlag, which is enough to fix the incorrect folding. There's some overlap with the TargetOverlapConstraintType and IsRVVWideningReduction. If we model the source operands as well we may be able to subsume them.	2024-09-05 15:27:48 +08:00
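The arithmetic behind the bad fold in the entry above can be checked directly. The sketch below is not backend code; `VType`, `vlmax`, `sameVLMAX` and `sameEEW` are hypothetical names, VLEN = 128 is an arbitrary choice, and the destination EEW is assumed to equal SEW for both instructions in the example. It uses VLMAX = VLEN * LMUL / SEW to show that MF4/e16 and MF8/e8 share a VLMAX while their element widths differ.

```cpp
#include <cassert>

// Hypothetical encoding: SEW and LMUL as log2 values, with a negative
// Log2LMUL for fractional LMUL (MF4 is -2, MF8 is -3).
struct VType {
  unsigned Log2SEW;
  int Log2LMUL;
};

// VLMAX = VLEN * LMUL / SEW, computed with shifts.
unsigned vlmax(unsigned VLEN, VType T) {
  assert(T.Log2SEW >= 3 && T.Log2SEW <= 6 && "SEW must be 8, 16, 32 or 64");
  unsigned PerReg = VLEN >> T.Log2SEW; // elements in one full register
  return T.Log2LMUL >= 0 ? PerReg << T.Log2LMUL : PerReg >> -T.Log2LMUL;
}

// The old proxy: treat equal VLMAX as if it implied equal element width.
bool sameVLMAX(unsigned VLEN, VType A, VType B) {
  return vlmax(VLEN, A) == vlmax(VLEN, B);
}

// The direct check: compare the element widths themselves.
bool sameEEW(VType A, VType B) { return A.Log2SEW == B.Log2SEW; }
```

With VLEN = 128, `VType{4, -2}` (e16, MF4) and `VType{3, -3}` (e8, MF8) both give a VLMAX of 2, so `sameVLMAX` returns true while `sameEEW` returns false, which is the case the proxy got wrong.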
Heejin Ahn	aecbc92410	[WebAssembly] Rename CATCH/CATCH_ALL to *_LEGACY (#107187 ) This renames MIR instruction `CATCH` and `CATCH_ALL` to `CATCH_LEGACY` and `CATCH_ALL_LEGACY` respectively. Follow-up PRs for the new EH (exnref) implementation will use `CATCH`, `CATCH_REF`, `CATCH_ALL`, and `CATCH_ALL_REF` as pseudo-instructions that return extracted values or `exnref` or both, because we don't currently support block return values in LLVM. So to give the old (real) `CATCH`es and the new (pseudo) `CATCH`es different names, this attaches `_LEGACY` suffix to the old names. This also rearranges `WebAssemblyInstrControl.td` so that the old legacy instructions are listed all together at the end.	2024-09-04 16:14:13 -07:00
Kyungwoo Lee	38c3855c9f	[NFC] Remove unused argument (FuncName) for parseMIR (#106144 ) While working on a MIR unittest, I noticed that parseMIR includes an unused argument that sets a function name. This is not only redundant but also irrelevant, as parseMIR is designed to parse entire module, not specific functions, even though most unittests contain a single function per module. To streamline the API, I have removed this unnecessary argument from parseMIR. However, if this argument was originally included to enhance readability or for any other purpose, please let me know.	2024-08-26 19:19:02 -07:00
Matt Arsenault	63e1647827	CodeGen: Remove MachineModuleInfo reference from MachineFunction (#100357 ) This avoids another unserializable field. Move the DbgInfoAvailable field into the AsmPrinter, which is only really a cache/convenience bit for checking a direct IR module metadata check.	2024-07-26 13:10:08 +04:00
Nikita Popov	6a907699d8	Revert "[CodeGen] Remove `applySplitCriticalEdges` in `MachineDominatorTree` (#97055 )" This reverts commit c5e5088033fed170068d818c54af6862e449b545. Causes large compile-time regressions.	2024-07-11 09:13:37 +02:00
paperchalice	c5e5088033	[CodeGen] Remove `applySplitCriticalEdges` in `MachineDominatorTree` (#97055 ) Summary: - Remove wrappers in `MachineDominatorTree`. - Remove `MachineDominatorTree` update code in `MachineBasicBlock::SplitCriticalEdge`. - Use `MachineDomTreeUpdater` in passes which call `MachineBasicBlock::SplitCriticalEdge` and preserve `MachineDominatorTreeWrapperPass` or CFG analyses. Commit abea99f65a97248974c02a5544eaf25fc4240056 introduced related methods in 2014. Now we have SemiNCA based dominator tree in 2017 and dominator tree updater, the solution adopted here seems a bit outdated.	2024-07-11 11:08:05 +08:00
Kazu Hirata	75bc20ff89	[llvm] Remove redundant calls to std::unique_ptr<T>::get (NFC) (#97914 )	2024-07-07 08:23:41 +09:00
Nikita Popov	74deadf196	[IRBuilder] Don't include Module.h (NFC) (#97159 ) This used to be necessary to fetch the DataLayout, but isn't anymore.	2024-06-29 15:05:04 +02:00
Fangrui Song	3f24561bc1	[unittest] Include Module.h after #97023	2024-06-28 09:40:27 -07:00
Nikita Popov	4169338e75	[IR] Don't include Module.h in Analysis.h (NFC) (#97023 ) Replace it with a forward declaration instead. Analysis.h is pulled in by all passes, but not all passes need to access the module.	2024-06-28 14:30:47 +02:00
Fangrui Song	c5f5238a4a	[SPIRV] #include "llvm/IR/PassInstrumentation.h"	2024-06-21 21:40:38 -07:00
Janek van Oirschot	3d1705d00c	MCExpr-ify AMDGPU PALMetadata (#93236 ) Allows MCExprs as passed values to PALMetadata. Also adds related `DelayedMCExpr` classes which serve as a pseudo-fixup to resolve MCExprs as late as possible (i.e., right before emit through string or blob, where they should be resolvable).	2024-06-13 13:59:31 +01:00
paperchalice	837dc542b1	[CodeGen][NewPM] Split `MachineDominatorTree` into a concrete analysis result (#94571 ) Prepare for new pass manager version of `MachineDominatorTreeAnalysis`. We may need a machine dominator tree version of `DomTreeUpdater` to handle `SplitCriticalEdge` in some CodeGen passes.	2024-06-11 21:27:14 +08:00
Jay Foad	4d65887aac	[LLVM] Remove executable permission from some non-executable files (#93803 )	2024-05-30 12:35:10 +01:00
Michael Kruse	4ecbfacf9e	[llvm] Revise IDE folder structure (#89741 ) Update the folder titles for targets in the monorepository that have not been taken care of for some time. These are the folders that targets are organized in Visual Studio and XCode (`set_property(TARGET <target> PROPERTY FOLDER "<title>")`) when using the respective CMake's IDE generator. * Ensure that every target is in a folder * Use a folder hierarchy with each LLVM subproject as a top-level folder * Use consistent folder names between subprojects * When using target-creating functions from AddLLVM.cmake, automatically deduce the folder. This reduces the number of `set_property`/`set_target_property`, but these are still necessary when `add_custom_target`, `add_executable`, `add_library`, etc. are used. A LLVM_SUBPROJECT_TITLE definition is used for that in each subproject's root CMakeLists.txt.	2024-05-25 13:28:30 +02:00
Graham Hunter	36a3f8f647	[TTI][TLI][AArch64] Support scalable immediates with isLegalAddImmediate (#84173 ) Adds a second parameter (default to 0) to isLegalAddImmediate, to represent a scalable immediate. Extends the AArch64 implementation to match immediates based on what addvl and inc[h\|w\|d] support.	2024-03-20 10:28:46 +00:00
Graham Hunter	cd768ec983	[AArch64] Support scalable offsets with isLegalAddressingMode (#83255 ) Allows us to indicate that an addressing mode featuring a vscale-relative immediate offset is supported.	2024-03-20 10:13:20 +00:00
Emma Pilkington	538aeb180b	[AMDGPU] Use a consistent DwarfEH register flavour (#84513 ) Previously, we always used the wave64 encodings for EH registers regardless of whether we were compiling for wave32, which seems wrong. We don't seem to use the EH registers, so this commit is mostly just about papering over code that converts from non-EH dwarf registers to LLVM registers while claiming they are EH dwarf registers. That kind of code should be okay on any non-darwin target (since darwin is the only target that uses a different encoding for EH registers).	2024-03-11 10:36:38 -04:00
Sivan Shani	5e688f0dbd	[llvm][arm] add T1 and T2 assembly options for vlldm and vlstm Re-land 634b0243b8f7acc85af4f16b70e91d86ded4dc83. T1 allows an optional register list, which must be {d0-d15}. T2 defines a mandatory register list, which must be {d0-d31}. The requirements for T1/T2 are as follows: T1 requires v8-M.Main and secure state; T2 requires v8.1-M.Main and secure state. With 16 D regs, both T1 and T2 are valid. With 32 D regs, T1 is UNDEFINED and T2 is valid. With no D regs, both T1 and T2 are NOPs.	2024-03-11 14:27:28 +00:00
David Green	d36d805373	[AArch64] Ensure Neoverse-N2 scheduling model includes all SVE pseudos. Similar to #84187, this enables the existing test we have for checking the scheduling info of the pseudos matches the real instructions, and adjusts the scheduling info in the NeoverseN2 model to make sure all cases were handled.	2024-03-08 09:52:49 +00:00
David Green	23c658ac41	[AArch64] Ensure Neoverse V1 scheduling model includes all SVE pseudos. (#84187 ) With the many pseudos used in SVE codegen it can be too easy to miss instructions. This enables the existing test we have for checking the scheduling info of the pseudos matches the real instructions, and adjusts the scheduling info in the NeoverseV1 model to make sure all are handled. In the cases I could I opted to use the same info as in the NeoverseV2 model, to keep the differences smaller.	2024-03-08 07:09:33 +00:00
Krzysztof Parzyszek	954f891af2	[Unittests] Fix RISCV unit tests build /usr/bin/ld: CMakeFiles/RISCVTests.dir/RISCVInstrInfoTest.cpp.o: undefined reference to symbol '_ZNK4llvm12LocationSize5printERNS_11raw_ostreamE' /usr/bin/ld: /work/kparzysz/git/llvm.org/b/x86/lib/libLLVMAnalysis.so.19.0 git: error adding symbols: DSO missing from command line collect2: error: ld returned 1 exit status The undefined symbol is llvm::LocationSize::print(llvm::raw_ostream&) const	2024-03-06 12:44:13 -06:00
David Green	44be5a7fdc	[Codegen] Make Width in getMemOperandsWithOffsetWidth a LocationSize. (#83875 ) This is another part of #70452 which makes getMemOperandsWithOffsetWidth use a LocationSize for Width, as opposed to the unsigned it currently uses. The advantages on its own are not super high if getMemOperandsWithOffsetWidth usually uses known sizes, but if the values can come from an MMO it can help be more accurate in case they are Unknown (and in the future, scalable).	2024-03-06 17:40:13 +00:00
Wang Pengcheng	85388a06b6	[RISCV] Move RISCVVType namespace to TargetParser (#83222 ) Clang and some middle-end optimizations may need these helper functions. This can reduce some duplications.	2024-03-06 10:56:19 +08:00


210 Commits