llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-13 23:26:06 +00:00

Author	SHA1	Message	Date
Xu Zhang	f6d431f208	[CodeGen] Make the parameter TRI required in some functions. (#85968 ) Fixes #82659 There are some functions, such as `findRegisterDefOperandIdx` and `findRegisterDefOperand`, that have too many default parameters. As a result, we have encountered some issues due to the lack of TRI parameters, as shown in issue #82411. Following @RKSimon 's suggestion, this patch refactors 9 functions, including `{reads, kills, defines, modifies}Register`, `registerDefIsDead`, and `findRegister{UseOperandIdx, UseOperand, DefOperandIdx, DefOperand}`, adjusting the order of the TRI parameter and making it required. In addition, all the places that call these functions have also been updated correctly to ensure no additional impact. After this, the caller of these functions should explicitly know whether to pass the `TargetRegisterInfo` or just a `nullptr`.	2024-04-24 14:24:14 +01:00
Sameer Sahasrabuddhe	60822637bf	Restore "Implement convergence control in MIR using SelectionDAG (#71785 )" This restores commit c7fdd8c11e54585dc9d15d63de9742067e0506b9. Previously reverted in f010b1bef4dda2c7082cbb41dbabf1f149cce306. LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows: 1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics. 2. Model token values as untyped virtual registers in MIR. The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR. The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.	2024-03-06 12:19:32 +05:30
Mitch Phillips	f010b1bef4	Revert "Restore "Implement convergence control in MIR using SelectionDAG (#71785 )"" This reverts commit c7fdd8c11e54585dc9d15d63de9742067e0506b9. Reason: Broke the sanitizer buildbots. See the comments at https://github.com/llvm/llvm-project/pull/71785 for more information.	2024-03-04 17:05:34 +01:00
Sameer Sahasrabuddhe	c7fdd8c11e	Restore "Implement convergence control in MIR using SelectionDAG (#71785 )" Original commit 79889734b940356ab3381423c93ae06f22e772c9. Perviously reverted in commit a2afcd5721869d1d03c8146bae3885b3385ba15e. LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows: 1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics. 2. Model token values as untyped virtual registers in MIR. The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR. The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.	2024-03-04 13:28:04 +05:30
Sameer Sahasrabuddhe	a2afcd5721	Revert "Implement convergence control in MIR using SelectionDAG (#71785 )" This reverts commit 79889734b940356ab3381423c93ae06f22e772c9. Encountered multiple buildbot failures.	2024-02-21 11:07:02 +05:30
Sameer Sahasrabuddhe	79889734b9	Implement convergence control in MIR using SelectionDAG (#71785 ) LLVM function calls carry convergence control tokens as operand bundles, where the tokens themselves are produced by convergence control intrinsics. This patch implements convergence control tokens in MIR as follows: 1. Introduce target-independent ISD opcodes and MIR opcodes for convergence control intrinsics. 2. Model token values as untyped virtual registers in MIR. The change also introduces an additional ISD opcode CONVERGENCECTRL_GLUE and a corresponding machine opcode with the same spelling. This glues the convergence control token to SDNodes that represent calls to intrinsics. The glued token is later translated to an implicit argument in the MIR. The lowering of calls to user-defined functions is target-specific. On AMDGPU, the convergence control operand bundle at a non-intrinsic call is translated to an explicit argument to the SI_CALL_ISEL instruction. Post-selection adjustment converts this explicit argument to an implicit argument on the SI_CALL instruction.	2024-02-21 10:06:37 +05:30
Alex Bradbury	197214e39b	[RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710 ) This follows on from #76708, allowing `cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just `N->getAsZextVal();` Introduced via `git grep -l "cast<ConstantSDNode>$.$.getZExtValue" \| xargs sed -E -i 's/cast<ConstantSDNode>$(.*)$->getZExtValue/\1->getAsZExtVal/'` and then using `git clang-format` on the result.	2024-01-09 12:25:17 +00:00
Alex Bradbury	80aeb62211	[llvm][NFC] Use SDValue::getConstantOperandVal(i) where possible (#76708 ) This helper function shortens examples like `cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();` to `Node->getConstantOperandVal(1);`. Implemented with: `git grep -l "cast<ConstantSDNode>$.->getOperand\(.$\)->getZExtValue" \| xargs sed -E -i 's/cast<ConstantSDNode>$(.)->getOperand\((.)$\)->getZExtValue/\1->getConstantOperandVal(\2)/` and `git grep -l "cast<ConstantSDNode>$.\.getOperand\(.$\)->getZExtValue" \| xargs sed -E -i 's/cast<ConstantSDNode>$(.)\.getOperand\((.)$\)->getZExtValue/\1.getConstantOperandVal(\2)/'`. With a couple of simple manual fixes needed. Result then processed by `git clang-format`.	2024-01-02 13:14:28 +00:00
Nick Desaulniers	93bd428742	[InlineAsm] refactor InlineAsm class NFC (#65649 ) I would like to steal one of these bits to denote whether a kind may be spilled by the register allocator or not, but I'm afraid to touch of any this code using bitwise operands. Make flags a first class type using bitfields, rather than launder data around via `unsigned`.	2023-09-11 09:27:37 -07:00
Nick Desaulniers	2fad6e6985	[InlineAsm] wrap Kind in enum class NFC Should add some minor type safety to the use of this information, since there's quite a bit of metadata being laundered through an `unsigned`. I'm looking to potentially add more bitfields to that `unsigned`, but I find InlineAsm's big ol' bag of enum values and usage of `unsigned` confusing, type-unsafe, and un-ergonomic. These can probably be better abstracted. I think the lack of static_cast outside of InlineAsm indicates the prior code smell fixed here. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D159242	2023-08-31 08:54:51 -07:00
Dávid Bolvanský	09515f2c20	[SDAG] Preserve unpredictable metadata, teach X86CmovConversion to respect this metadata Sometimes an developer would like to have more control over cmov vs branch. We have unpredictable metadata in LLVM IR, but currently it is ignored by X86 backend. Propagate this metadata and avoid cmov->branch conversion in X86CmovConversion for cmov with this metadata. Example: ``` int MaxIndex(int n, int a) { int t = 0; for (int i = 1; i < n; i++) { // cmov is converted to branch by X86CmovConversion if (a[i] > a[t]) t = i; } return t; } int MaxIndex2(int n, int a) { int t = 0; for (int i = 1; i < n; i++) { // cmov is preserved if (__builtin_unpredictable(a[i] > a[t])) t = i; } return t; } ``` Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D118118	2023-06-01 20:56:44 +02:00
Serge Pavlov	c6795b1d37	Use ArrayRef instead of raw pointers. NFC Change signature of TargetLowering::getRoundingControlRegisters so that it returns ArrayRef, not plain pointer. Differential Revision: https://reviews.llvm.org/D143049	2023-02-02 12:53:49 +07:00
Phoebe Wang	2bb960302a	[X86][ConstraintFP] Model `MXCSR` for function call This patch is inspired by D111433. It would affect the performance under strict FP mode. But it preserves the correct rounding behavior accross function calls. Fixes #59305 Reviewed By: sepavloff Differential Revision: https://reviews.llvm.org/D139549	2023-01-27 10:07:38 +08:00
Jay Foad	073401e59c	[MC] Define and use MCInstrDesc implicit_uses and implicit_defs. NFC. The new methods return a range for easier iteration. Use them everywhere instead of getImplicitUses, getNumImplicitUses, getImplicitDefs and getNumImplicitDefs. A future patch will remove the old methods. In some use cases the new methods are less efficient because they always have to scan the whole uses/defs array to count its length, but that will be fixed in a future patch by storing the number of implicit uses/defs explicitly in MCInstrDesc. At that point there will be no need to 0-terminate the arrays. Differential Revision: https://reviews.llvm.org/D142215	2023-01-23 14:44:58 +00:00
Jay Foad	768aed1378	[MC] Make more use of MCInstrDesc::operands. NFC. Change MCInstrDesc::operands to return an ArrayRef so we can easily use it everywhere instead of the (IMHO ugly) opInfo_begin and opInfo_end. A future patch will remove opInfo_begin and opInfo_end. Also use it instead of raw access to the OpInfo pointer. A future patch will remove this pointer. Differential Revision: https://reviews.llvm.org/D142213	2023-01-23 11:31:41 +00:00
Jeremy Morse	9f8544713a	[DebugInfo] Store instr-ref mode of MachineFunction in member Add a flag state (and a MIR key) to MachineFunctions indicating whether they contain instruction referencing debug-info or not. Whether DBG_VALUEs or DBG_INSTR_REFs are used needs to be determined by LiveDebugValues at least, and using the current optimisation level as a proxy is proving unreliable. Test updates are purely adding the flag to tests, in a couple of cases it involves separating out VarLocBasedLDV/InstrRefBasedLDV tests into separate files, as they can no longer share the same input. Differential Revision: https://reviews.llvm.org/D141387	2023-01-20 14:47:11 +00:00
Craig Topper	e72ca520bb	[CodeGen] Remove uses of Register::isPhysicalRegister/isVirtualRegister. NFC Use isPhysical/isVirtual methods. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D141715	2023-01-13 14:38:08 -08:00
Dmitri Gribenko	05d722a11d	[llvm] Fix an "unused variable" warning when assertions are disabled	2023-01-09 11:33:05 +01:00
Stephen Tozer	da0faa0594	[DebugInfo] Produce variadic DBG_INSTR_REFs from ISel This patch modifies SelectionDAG and FastISel to produce DBG_INSTR_REFs with variadic expressions, and produce DBG_INSTR_REFs for debug values with variadic location expressions. The former essentially means just prepending DW_OP_LLVM_arg, 0 to the existing expression. The latter is achieved in MachineFunction::finalizeDebugInstrRefs and InstrEmitter::EmitDbgInstrRef. Reviewed By: jmorse, Orlando Differential Revision: https://reviews.llvm.org/D133929	2023-01-09 08:58:33 +00:00
Stephen Tozer	c383f4d655	[DebugInfo] Allow non-stack_value variadic expressions and use in DBG_INSTR_REF Prior to this patch, variadic DIExpressions (i.e. ones that contain DW_OP_LLVM_arg) could only be created by salvaging debug values to create stack value expressions, resulting in a DBG_VALUE_LIST being created. As of the previous patch in this patch stack, DBG_INSTR_REF's syntax has been changed to match DBG_VALUE_LIST in preparation for supporting variadic expressions. This patch adds some minor changes needed to allow variadic expressions that aren't stack values to exist, and allows variadic expressions that are trivially reduceable to non-variadic expressions to be handled similarly to non-variadic expressions. Reviewed by: jmorse Differential Revision: https://reviews.llvm.org/D133926	2023-01-06 19:31:10 +00:00
Stephen Tozer	e10e936315	[DebugInfo][NFC] Add new MachineOperand type and change DBG_INSTR_REF syntax This patch makes two notable changes to the MIR debug info representation, which result in different MIR output but identical final DWARF output (NFC w.r.t. the full compilation). The two changes are: * The introduction of a new MachineOperand type, MO_DbgInstrRef, which consists of two unsigned numbers that are used to index an instruction and an output operand within that instruction, having a meaning identical to first two operands of the current DBG_INSTR_REF instruction. This operand is only used in DBG_INSTR_REF (see below). * A change in syntax for the DBG_INSTR_REF instruction, shuffling the operands to make it resemble DBG_VALUE_LIST instead of DBG_VALUE, and replacing the first two operands with a single MO_DbgInstrRef-type operand. This patch is the first of a set that will allow DBG_INSTR_REF instructions to refer to multiple machine locations in the same manner as DBG_VALUE_LIST. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D129372	2023-01-06 18:03:48 +00:00
Jay Foad	e73b35699b	[SelectionDAG] Fix EmitCopyFromReg for cloned nodes Change EmitCopyFromReg to check all users of cloned nodes (as well as non-cloned nodes) instead of assuming that they all copy the defined value back to the same physical register. This partially reverts 968e2e7b3db1 (svn r62356) which claimed: CreateVirtualRegisters does trivial copy coalescing. If a node def is used by a single CopyToReg, it reuses the virtual register assigned to the CopyToReg. This won't work for SDNode that is a clone or is itself cloned. Disable this optimization for those nodes or it can end up with non-SSA machine instructions. This is true for CreateVirtualRegisters but r62356 also updated EmitCopyFromReg where it is not true. Firstly EmitCopyFromReg only coalesces physical register copies, so the concern about SSA form does not apply. Secondly making the loop over users in EmitCopyFromReg conditional on `!IsClone && !IsCloned` breaks the handling of cloned nodes, because it leaves MatchReg set to true by default, so it assumes that all users will copy the defined value back to the same physical register instead of actually checking. Differential Revision: https://reviews.llvm.org/D140417	2022-12-21 10:44:45 +00:00
Sander de Smalen	02df03c5b7	[AArch64][SME] Add support for arm_locally_streaming functions. Functions with `aarch64_sme_pstatesm_body` will emit a SMSTART at the start of the function, and a SMSTOP at the end of the function, such that all operations use the right value for vscale. Because the placement of these nodes is critically important (i.e. no vscale-dependent operations should be done before SMSTART has been issued), we require glueing the CopyFromReg to the Entry node such that we can insert the SMSTART as part of that glued chain. More details about the SME attributes and design can be found in D131562. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D131582	2022-10-14 13:47:53 +00:00
Xiang1 Zhang	aad013de41	[InlineAsm][bugfix] Correct function addressing in inline asm In Linux PIC model, there are 4 cases about value/label addressing: Case 1: Function call or Label jmp inside the module. Case 2: Data access (such as global variable, static variable) inside the module. Case 3: Function call or Label jmp outside the module. Case 4: Data access (such as global variable) outside the module. Due to current llvm inline asm architecture designed to not "recognize" the asm code, there are quite troubles for us to treat mem addressing differently for same value/adress used in different instuctions. For example, in pic model, call a func may in plt way or direclty pc-related, but lea/mov a function adress may use got. This patch fix/refine the case 1 and case 2 in inline asm. Due to currently inline asm didn't support jmp the outsider lable, this patch mainly focus on fix the function call addressing bugs in inline asm. Reviewed By: Pengfei, RKSimon Differential Revision: https://reviews.llvm.org/D133914	2022-10-14 09:47:26 +08:00
Sami Tolvanen	cff5bef948	KCFI sanitizer The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Relands 67504c95494ff05be2a613129110c9bcf17f6c13 with a fix for 32-bit builds. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296	2022-08-24 22:41:38 +00:00
Sami Tolvanen	a79060e275	Revert "KCFI sanitizer" This reverts commit 67504c95494ff05be2a613129110c9bcf17f6c13 as using PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.	2022-08-24 19:30:13 +00:00
Sami Tolvanen	67504c9549	KCFI sanitizer The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296	2022-08-24 18:52:42 +00:00
Craig Topper	e6c7a3a54f	[SelectionDAG] Don't apply MinRCSize constraint in InstrEmitter::AddRegisterOperand for IMPLICIT_DEF sources. MinRCSize is 4 and prevents constrainRegClass from changing the register class if the new class has size less than 4. IMPLICIT_DEF gets a unique vreg for each use and will be removed by the ProcessImplicitDef pass before register allocation. I don't think there is any reason to prevent constraining the virtual register to whatever register class the use needs. The attached test case was previously creating a copy of IMPLICIT_DEF because vrm8nov0 has 3 registers in it. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D128005	2022-06-16 14:55:14 -07:00
Jeremy Morse	fb6596f1ec	[DebugInfo][InstrRef] Avoid a crash from mixed variable location modes Variable locations now come in two modes, instruction referencing and DBG_VALUE. At -O0 we pick DBG_VALUE to allow fast construction of variable information. Unfortunately, SelectionDAG edits the optimisation level in the presence of opt-bisect-limit, meaning different passes have different views of what variable location mode we should use. That causes assertions when they're mixed. This patch plumbs through a boolean in SelectionDAG from start to instruction emission, so that we don't rely on the current optimisation level for correctness. Differential Revision: https://reviews.llvm.org/D123033	2022-04-06 11:55:38 +01:00
serge-sans-paille	ed98c1b376	Cleanup includes: DebugInfo & CodeGen Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121332	2022-03-12 17:26:40 +01:00
serge-sans-paille	ffe8720aa0	Reduce dependencies on llvm/BinaryFormat/Dwarf.h This header is very large (3M Lines once expended) and was included in location where dwarf-specific information were not needed. More specifically, this commit suppresses the dependencies on llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used, this has a decent impact on number of preprocessed lines generated during compilation of LLVM, as showcased below. This is achieved by moving some definitions back to the .cpp file, no performance impact implied[0]. As a consequence of that patch, downstream user may need to manually some extra files: llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h In some situations, codes maybe relying on the fact that llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h, this hidden dependency now needs to be explicit. $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l after: 10978519 before: 11245451 Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup [0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions Differential Revision: https://reviews.llvm.org/D118781	2022-02-04 11:44:03 +01:00
Jeremy Morse	a20987adf4	[DebugInfo][InstrRef] Add indirection from dbg.declare in SelectionDAG Usually dbg.declares get translated into either entries in an MF side-table, or a DBG_VALUE on entry to the function with IsIndirect set (including in instruction referencing mode). Much rarer is a dbg.declare attached to a non-argument value, such as in the test added in this patch where there's a variable-length-array. Such dbg.declares become SDDbgValue nodes with InIndirect=true. As it happens, we weren't correctly emitting DBG_INSTR_REFs with the additional indirection. This patch adds the extra indirection, encoded as adding an additional DW_OP_deref to the expression. Differential Revision: https://reviews.llvm.org/D114440	2021-11-29 22:24:19 +00:00
Jack Andersen	bd4dad87f4	[MachineInstr] Move MIParser's DBG_VALUE RegState::Debug invariant into MachineInstr::addOperand Based on the reasoning of D53903, register operands of DBG_VALUE are invariably treated as RegState::Debug operands. This change enforces this invariant as part of MachineInstr::addOperand so that all passes emit this flag consistently. RegState::Debug is inconsistently set on DBG_VALUE registers throughout LLVM. This runs the risk of a filtering iterator like MachineRegisterInfo::reg_nodbg_iterator to process these operands erroneously when not parsed from MIR sources. This issue was observed in the development of the llvm-mos fork which adds a backend that relies on physical register operands much more than existing targets. Physical RegUnit 0 has the same numeric encoding as $noreg (indicating an undef for DBG_VALUE). Allowing debug operands into the machine scheduler correlates $noreg with RegUnit 0 (i.e. a collision of register numbers with different zero semantics). Eventually, this causes an assert where DBG_VALUE instructions are prohibited from participating in live register ranges. Reviewed By: MatzeB, StephenTozer Differential Revision: https://reviews.llvm.org/D110105	2021-10-07 16:08:52 +01:00
Jeremy Morse	0116ed0069	[DebugInfo][InstrRef] Don't use instr-ref for unoptimised functions InstrRefBasedLDV is marginally slower than VarlocBasedLDV when analysing optimised code -- however, it's much slower when analysing code compiled -O0. To avoid this: don't use instruction referencing for -O0 functions. In the "pure" case of unoptimised code, this won't really harm the debugging experience because most variables won't have been promoted off the stack, so can't go missing. It becomes more complicated when optimised code is inlined into functions marked optnone; however these are rare, and as -O0 doesn't run many optimisations there should be little damage to the debug experience as a result. I've taken the opportunity to refactor testing for instruction-referencing into a MachineFunction method, which seems the most appropriate place to put it. Differential Revision: https://reviews.llvm.org/D108585	2021-08-25 15:10:36 +01:00
Paul Robinson	75aa3d520d	Add a DIExpression const-folder to prevent silly expressions. It's entirely possible (because it actually happened) for a bool variable to end up with a 256-bit DW_AT_const_value. This came about when a local bool variable was initialized from a bitfield in a 32-byte struct of bitfields, and after inlining and constant propagation, the variable did have a constant value. The sequence of optimizations had it carrying "i256" values around, but once the constant made it into the llvm.dbg.value, no further IR changes could affect it. Technically the llvm.dbg.value did have a DIExpression to reduce it back down to 8 bits, but the compiler is in no way ready to emit an oversized constant and a DWARF expression to manipulate it. Depending on the circumstances, we had either just the very fat bool value, or an expression with no starting value. The sequence of optimizations that led to this state did seem pretty reasonable, so the solution I came up with was to invent a DWARF constant expression folder. Currently it only does convert ops, but there's no reason it couldn't do other ops if that became useful. This broke three tests that depended on having convert ops survive into the DWARF, so I added an operator that would abort the folder to each of those tests. Differential Revision: https://reviews.llvm.org/D106915	2021-08-05 06:14:40 -07:00
Jeremy Morse	2b2ffb7bdc	[DebugInfo][InstrRef][3/4] Produce DBG_INSTR_REFs for all variable locations This patch emits DBG_INSTR_REFs for two remaining flavours of variable locations that weren't supported: copies, and inter-block VRegs. There are still some locations that must be represented by DBG_VALUE such as constants, but they're mostly independent of optimisations. For variable locations that refer to values defined in different blocks, vregs are allocated before isel begins, but the defining instruction might not exist until late in isel. To get around this, emit DBG_INSTR_REFs in a "half done" state, where the first operand refers to a VReg. Then at the end of isel, patch these back up to refer to instructions, using the finalizeDebugInstrRefs method. Copies are something that I complained about the original RFC, and I really don't want to have to put instruction numbers on copies. They don't define a value: they move them. To address this isel, salvageCopySSA interprets: * COPYs, * SUBREG_TO_REG, * Anything that isCopyInstr thinks is a copy. And follows chains of copies back to the defining instruction that they read from. This relies on any physical registers that COPYs read being defined in the same block, or being entry-block arguments. For the former we can put an instruction number on the defining instruction; for the latter we can drop a DBG_PHI that reads the incoming value. Differential Revision: https://reviews.llvm.org/D88896	2021-07-06 18:31:38 +01:00
Simon Pilgrim	114e712c34	InstrEmitter.cpp - don't dereference a dyn_cast<>. dyn_cast<> can return nullptr which we would then dereference - use cast<> which will assert that the type is correct.	2021-06-08 17:59:04 +01:00
Jonas Paulsson	d058262b14	[SystemZ] Support i128 inline asm operands. Support virtual, physical and tied i128 register operands in inline assembly. i128 is on SystemZ not really supported and is not a legal type and generally such a value will be split into two i64 parts. There are however some instructions that require a pair of two GPR64 registers contained in the GR128 bit reg class, which is untyped. For inline assmebly operands, it proved to be very cumbersome to first follow the general behavior of splitting an i128 operand into two parts and then later rebuild the INLINEASM MI to have one GR128 register. Instead, some minor common code changes were made to SelectionDAGBUilder to only create one GR128 register part to begin with. In particular: - getNumRegisters() now has an optional parameter "RegisterVT" which is passed by AddInlineAsmOperands() and GetRegistersForValue(). - The bitcasting in GetRegistersForValue is not performed if RegVT is Untyped. - The RC for a tied use in AddInlineAsmOperands() is now computed either from the tied def (virtual register), or by getMinimalPhysRegClass() (physical register). - InstrEmitter.cpp:EmitCopyFromReg() has been fixed so that the register class (DstRC) can also be computed for an illegal type. In the SystemZ backend getNumRegisters(), splitValueIntoRegisterParts() and joinRegisterPartsIntoValue() have been implemented to handle i128 operands. Differential Revision: https://reviews.llvm.org/D100788 Review: Ulrich Weigand	2021-05-26 10:08:32 -05:00
gbtozers	5491a86f59	[DebugInfo] Emit DBG_VALUE_LIST from ISel This patch completes ISel support for DIArgList dbg.values by allowing SDDbgValues with multiple location operands to be emitted as DBG_VALUE_LIST instructions. The primary change of this patch is refactoring EmitDbgValue by pulling location operand emission out to the new function AddDbgValueLocationOps, which is used for both DIArgList and single value dbg.values. Outside of that, the only behaviour change is that the scheduler has a lambda added, HasUnknownVReg, to prevent us from attempting to emit a DBG_VALUE_LIST before all of its used VRegs have become available. Differential Revision: https://reviews.llvm.org/D88592	2021-03-09 12:17:39 +00:00
gbtozers	9525af7b91	[DebugInfo] Support representation of multiple location operands in SDDbgValue This patch modifies the class that represents debug values during ISel, SDDbgValue, to support multiple location operands (to represent a dbg.value that uses a DIArgList). Part of this class's functionality has been split off into a new class, SDDbgOperand. The new class SDDbgOperand represents a single value, corresponding to an SSA value or MachineOperand in the IR and MIR respectively. Members of SDDbgValue that were previously related to that specific value (as opposed to the variable or DIExpression), such as the Kind enum, have been moved to SDDbgOperand. SDDbgValue now contains an array of SDDbgOperand instead, allowing it to hold more than one of these values. All changes outside SDDbgValue are simply updates to use the new interface. Differential Revision: https://reviews.llvm.org/D88585	2021-03-08 18:45:17 +00:00
Hongtao Yu	24d4291ca7	[CSSPGO] Pseudo probes for function calls. An indirect call site needs to be probed for its potential call targets. With CSSPGO a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks that are in form of standalone intrinsic call instructions, pseudo probes for callsites have to be attached to the call instruction, thus a separate instruction would not work. One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. The special metadata will have to make its way through the optimization pipeline down to object emission. This requires additional efforts to maintain the metadata in various places. Given that the `!dbg` metadata is a first-class metadata and has all essential support in place , leveraging the `!dbg` metadata as a channel to encode pseudo probe information is probably the easiest solution. With the requirement of not inflating `!dbg` metadata that is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field which mainly serves AutoFDO can be reused for pseudo probes. DWARF discriminators distinguish identical source locations between instructions and with pseudo probes such support is not required. In this change we are using the discriminator field to encode the ID and type of a callsite probe and the encoded value will be unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field will go with the inlined instructions. The `!dbg` metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original `!dbg` metadata and the bottom of the stack is for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator field (which actually represents a calliste probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst. To avoid collision with the baseline AutoFDO in various places that handles dwarf discriminators where a check against the `-pseudo-probe-for-profiling` switch is not available, a special encoding scheme is used to tell apart a pseudo probe discriminator from a regular discriminator. For the regular discriminator, if all lowest 3 bits are non-zero, it means the discriminator is basically empty and all higher 29 bits can be reversed for pseudo probe use. Callsite pseudo probes are inserted in `SampleProfileProbePass` and a target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe ID/type from `!dbg`. Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91756	2020-12-02 13:45:20 -08:00
Hongtao Yu	d0e42037bf	[CSSPGO] MIR target-independent pseudo instruction for pseudo-probe intrinsic This change introduces a MIR target-independent pseudo instruction corresponding to the IR intrinsic llvm.pseudoprobe for pseudo-probe block instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. An `llvm.pseudoprobe` intrinsic call will be lowered into a target-independent operation named `PSEUDO_PROBE`. Given the following instrumented IR, ``` define internal void @foo2(i32 %x, void (i32)* %f) !dbg !4 { bb0: %cmp = icmp eq i32 %x, 0 call void @llvm.pseudoprobe(i64 837061429793323041, i64 1) br i1 %cmp, label %bb1, label %bb2 bb1: call void @llvm.pseudoprobe(i64 837061429793323041, i64 2) br label %bb3 bb2: call void @llvm.pseudoprobe(i64 837061429793323041, i64 3) br label %bb3 bb3: call void @llvm.pseudoprobe(i64 837061429793323041, i64 4) ret void } ``` the corresponding MIR is shown below. Note that block `bb3` is duplicated into `bb1` and `bb2` where its probe is duplicated too. This allows for an accurate execution count to be collected for `bb3`, which is basically the sum of the counts of `bb1` and `bb2`. ``` bb.0.bb0: frame-setup PUSH64r undef $rax, implicit-def $rsp, implicit $rsp TEST32rr killed renamable $edi, renamable $edi, implicit-def $eflags PSEUDO_PROBE 837061429793323041, 1, 0 $edi = MOV32ri 1, debug-location !13; test.c:0 JCC_1 %bb.1, 4, implicit $eflags bb.2.bb2: PSEUDO_PROBE 837061429793323041, 3, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ bb.1.bb1: PSEUDO_PROBE 837061429793323041, 2, 0 PSEUDO_PROBE 837061429793323041, 4, 0 $rax = frame-destroy POP64r implicit-def $rsp, implicit $rsp RETQ ``` The target op PSEUDO_PROBE will be converted into a piece of binary data by the object emitter with no machine instructions generated. This is done in a different patch. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86495	2020-11-20 10:52:43 -08:00
Jeremy Morse	c4e7857d4e	[DebugInstrRef] Create DBG_INSTR_REFs in SelectionDAG When given the -experimental-debug-variable-locations option (via -Xclang or to llc), have SelectionDAG generate DBG_INSTR_REF instructions instead of DBG_VALUE. For now, this only happens in a limited circumstance: when the value referred to is not a PHI and is defined in the current block. Other situations introduce interesting problems, addresed in later patches. Practically, this patch hooks into InstrEmitter and if it can find a defining instruction for a value, gives it an instruction number, and points the DBG_INSTR_REF at that <instr, operand> pair. Differential Revision: https://reviews.llvm.org/D85747	2020-10-14 14:24:08 +01:00
Denis Antrushin	c08d48fc2d	[Statepoints] Change statepoint machine instr format to better suit VReg lowering. Current Statepoint MI format is this: STATEPOINT <id>, <num patch bytes >, <num call arguments>, <call target>, [call arguments...], <StackMaps::ConstantOp>, <calling convention>, <StackMaps::ConstantOp>, <statepoint flags>, <StackMaps::ConstantOp>, <num deopt args>, [deopt args...], <gc base/derived pairs...> <gc allocas...> Note that GC pointers are listed in pairs <base,derived>. This causes base pointers to appear many times (at least twice) in instruction, which is bad for us when VReg lowering is ON. The problem is that machine operand tiedness is 1-1 relation, so it might look like this: %vr2 = STATEPOINT ... %vr1, %vr1(tied-def0) Since only one instance of %vr1 is tied, that may lead to incorrect codegen (see PR46917 for more details), so we have to always spill base pointers. This mostly defeats new VReg lowering scheme. This patch changes statepoint instruction format so that every gc pointer appears only once in operand list. That way they all can be tied. Additional set of operands is added to preserve base-derived relation required to build stackmap. New statepoint has following format: STATEPOINT <id>, <num patch bytes>, <num call arguments>, <call target>, [call arguments...], <StackMaps::ConstantOp>, <calling convention>, <StackMaps::ConstantOp>, <statepoint flags>, <StackMaps::ConstantOp>, <num deopt args>, [deopt args...], <StackMaps::ConstantOp>, <num gc pointers>, [gc pointers...], <StackMaps::ConstantOp>, <num gc allocas>, [gc allocas...] <StackMaps::ConstantOp>, <num entries in gc map>, [base/derived indices...] Changes are: - every gc pointer is listed only once in a flat length-prefixed list; - alloca list is prefixed with its length too; - following alloca list is length-prefixed list of base-derived indices of pointers from gc pointer list. Note that indices are logical (number of pointer), not absolute (index of machine operand). Differential Revision: https://reviews.llvm.org/D87154	2020-10-06 17:40:29 +07:00
Denis Antrushin	2a52c3301a	[Statepoints] Properly handle const base pointer. Current code in InstEmitter assumes all GC pointers are either VRegs or stack slots - hence, taking only one operand. But it is possible to have constant base, in which case it occupies two machine operands. Add a convinience function to StackMaps to get index of next meta argument and use it in InsrEmitter to properly advance to the next statepoint meta operand. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D87252	2020-09-09 14:07:00 +07:00
Philip Reames	3da1a9634e	[Statepoints] Support lowering gc relocations to virtual registers (Disabled under flag for the moment) This is part of a larger project wherein we are finally integrating lowering of gc live operands with the register allocator. Today, we force spill all operands in SelectionDAG. The code to do so is distinctly non-optimal. The approach this patch is working towards is to instead lower the relocations directly into the MI form, and let the register allocator pick which ones get spilled and which stack slots they get spilled to. In terms of performance, the later part is actually more important as it avoids redundant shuffling of values between stack slots. This particular change adds ISEL support to produce the variadic def STATEPOINT form required by the above. In particular, the first N are lowered to variadic tied def/use pairs. So new statepoint looks like this: reloc1,reloc2,... = STATEPOINT ..., base1, derived1<tied-def0>, base2, derived2<tied-def1>, ... N is limited by the maximal number of tied registers machine instruction can have (15 at the moment). The current patch is restricted to handling relocations within a single basic block. Cross block relocations (e.g. invokes) are handled via the legacy mechanism. This restriction will be relaxed in future patches. Patch By: dantrushin Differential Revision: https://reviews.llvm.org/D81648	2020-07-25 14:26:05 -07:00
Simon Pilgrim	fe0006c882	TargetLowering.h - remove unnecessary TargetMachine.h include. NFC Replace with forward declaration and move dependency down to source files that actually need it. Both TargetLowering.h and TargetMachine.h are 2 of the most expensive headers (top 10) in the ClangBuildAnalyzer report when building llc.	2020-05-23 19:49:38 +01:00
Craig Topper	8c72b0271b	[CodeGen] Use Align in MachineConstantPool.	2020-05-12 10:06:40 -07:00
Craig Topper	bebdc62c3f	[SelectionDAG] Remove ConstantPoolSDNode::getAlignment. Use getAlign instead. Differential Revision: https://reviews.llvm.org/D79459	2020-05-08 16:04:11 -07:00
Simon Pilgrim	032738d17e	InstrEmitter.h - reduce SelectionDAG.h include to SelectionDAGNodes.h include. Add SDDbgLabel/TargetLowering forward declarations. Add the full SelectionDAG.h include to InstrEmitter.cpp.	2020-04-19 11:52:31 +01:00

1 2 3 4 5 ...

261 Commits