When SelectionDAG converts dbg.value intrinsics, it first ensures we have
already generated code for the value operand of the intrinsic. The rationale
is that if we haven't needed to generate code for this value, a debug value
should not be what causes the generation.
For example, if the first use of an argument's physical register is a
dbg.value, we are going to hit this code path. However, this is irrelevant for
entry value expressions: by definition we are not interested in the _current_
value of the physical register, but rather in its value at the start of the
function. To deal with this, this patch changes lowering to handle this case as
early as possible.
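For illustration only, a minimal sketch of the property the early path can key
on; `DIExpression::isEntryValue()` is an existing LLVM predicate, but this is
not the patch's code:

```
#include "llvm/IR/DebugInfoMetadata.h"

// An entry-value expression describes the register's value on function entry,
// so it never needs to wait for codegen of the dbg.value's operand.
static bool canLowerDbgValueEarly(const llvm::DIExpression *Expr) {
  return Expr->isEntryValue();
}
```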
Differential Revision: https://reviews.llvm.org/D158649
The change introduces the intrinsics 'get_fpmode', 'set_fpmode' and
'reset_fpmode'. They manage all of the target's dynamic floating-point
control modes, which include, for instance, rounding direction, precision,
and treatment of denormals. The intrinsics perform the same operations as
the C library functions 'fegetmode' and 'fesetmode'. By default they are
lowered to calls to these functions.
Two main use cases are supported by this implementation.
1. Local modification of the control modes. In this case the code
usually has a pattern (in pseudocode):
saved_modes = get_fpmode()
set_fpmode(<new_modes>)
...
<do operations under the new modes>
...
set_fpmode(saved_modes)
When it is known that the current FP environment is the default, the code
may be shorter:
set_fpmode(<new_modes>)
...
<do operations under the new modes>
...
reset_fpmode()
Such patterns appear not only in user code but also in the implementation of
various FP-controlling pragmas. In particular, the implementation of
`#pragma STDC FENV_ROUND` requires similar code if the target does not
support a static rounding mode.
2. Portable control of FP modes. Usually FP control modes are set by
writing to some control register. Different targets have different layouts
for this register, and the way the register is accessed may also differ.
Using a set of target-specific definitions for the control register bits
together with these intrinsic functions provides a reasonably portable way
to handle control modes across a wide range of hardware.
This change only defines the LLVM intrinsic functions, which implement the
access required for the aforementioned use cases.
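For reference, the default lowering corresponds to the following sketch at the
C library level (illustrative C++ assuming the glibc fegetmode/fesetmode
extension; the function and parameter names are just for illustration):

```
#include <fenv.h> // fegetmode/fesetmode and femode_t are a glibc extension

// Use case 1 from above, spelled with the library functions that
// get_fpmode/set_fpmode are lowered to by default.
void do_work_under_modes(const femode_t *new_modes) {
  femode_t saved;       // saved_modes = get_fpmode()
  fegetmode(&saved);
  fesetmode(new_modes); // set_fpmode(<new_modes>)
  // ... operations under the new modes ...
  fesetmode(&saved);    // set_fpmode(saved_modes)
}
// reset_fpmode() similarly corresponds to fesetmode(FE_DFL_MODE).
```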
Differential Revision: https://reviews.llvm.org/D82525
This removes some diffs created by D153502.
I'm assuming an AND/OR won't be worse than an SMIN/SMAX. For
RISC-V at least, AND/OR can have a shorter encoding than SMIN/SMAX.
It's weird that we have two different functions responsible for
folding logic of setccs, but I'm not ready to try to untangle that.
I'm unclear whether the PowerPC change is a regression or not. It looks
like it might use more registers, but I don't understand PowerPC
registers so I'm not sure.
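For illustration, this is the kind of equivalence involved when folding logic
of setccs; a sketch of the general identity, not the exact fold touched here:

```
#include <algorithm>
#include <cstdint>

// (a < c) & (b < c) computes the same thing as smax(a, b) < c for signed
// values, so either an AND of two setccs or a single setcc of an SMAX works.
bool bothBelow_andForm(int32_t a, int32_t b, int32_t c) {
  return (a < c) & (b < c);
}
bool bothBelow_smaxForm(int32_t a, int32_t b, int32_t c) {
  return std::max(a, b) < c;
}
```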
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D158292
After the recent patch D30189, #64323's error message became a different one.
When DAGCombiner was optimizing `(vextract (scalar_to_vector val), 0) -> val`, it didn't
consider the possibility that the inserted value type has fewer bits than the dest type.
This patch fixes that.
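To illustrate the problematic case (a sketch in terms of EVT, not the
DAGCombiner code):

```
#include "llvm/CodeGen/ValueTypes.h"

// For (vextract (scalar_to_vector val), 0) -> val: if `val` is narrower than
// the extract's result type, simply returning `val` is not enough.
static bool insertedNarrowerThanDest(llvm::EVT InsertedVT, llvm::EVT DestVT) {
  return InsertedVT.getFixedSizeInBits() < DestVT.getFixedSizeInBits();
}
```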
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D158355
After D157690 we are seeing some crashes from Global ISel, which seem to be
related to the shift_of_shifted_logic_chain combine that can remove too many
instructions if the shift amount is zero.
This limits the fold to non-zero shifts, under the assumption that it is better
in that case to fold away the shift to a COPY.
Differential Revision: https://reviews.llvm.org/D158596
With D149881, we converted EntryValue MachineFunction table entries into
`DbgVariables` initialized by a "DbgValue" intrinsic, which can only handle a
single, non-fragment DIExpression. However, it is desirable to handle variables
with multiple fragments and DIExpressions.
To do this, we expand the `DbgVariable` class to handle the EntryValue case.
This class can already operate under three different "modes" (stack slot,
unchanging location described by a dbg value, changing location described by a
loc list). A fourth case is added as a separate class entirely, but a subsequent
patch should redesign `DbgVariable` with four subclasses in order to make the
code more readable.
This patch also exposed a bug in the `beginEntryValueExpression` function, which
was not initializing the `LocationFlags` properly. Note how the
`finalizeEntryValue` function resets that flag. We fix this bug here, as testing
this change in isolation would be tricky.
Differential Revision: https://reviews.llvm.org/D158458
When we convert an EntryValue dbg.declare into an entry of the MF side table, we
currently copy its DIExpression as is, and rely on subsequent layers to "know"
that this expression is implicitly indirect. This is bad because it adds an
implicit assumption to the IR representation, and requires subsequent layers to
know about this assumption. This also limits the reusability of this table:
what if, in the future, we want to use this table for dbg.values?
This patch changes the existing behavior so that the entities converting
EntryValue dbg.declares explicitly add an OP_deref to the expression.
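As a rough sketch of the kind of rewrite this implies (using the existing
`DIExpression::prepend` helper; illustrative, not the converter's actual code):

```
#include "llvm/IR/DebugInfoMetadata.h"

// Prepend a DW_OP_deref so the EntryValue expression is explicitly indirect
// in the side table instead of implicitly so.
static llvm::DIExpression *addExplicitDeref(const llvm::DIExpression *Expr) {
  return llvm::DIExpression::prepend(Expr, llvm::DIExpression::DerefBefore);
}
```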
Differential Revision: https://reviews.llvm.org/D158437
This adds some more extensive test coverage for fadd/fsub through global isel,
switching the opcodes to use the more complete ActionDefinitions to handle more
cases.
For most fp16 vector ops, we can promote them to fp32 vectors when zvfhmin is enabled but zvfh is not.
But for nxv32f16, we need to split it first, since nxv32f32 is not a valid MVT.
Reviewed By: michaelmaitland
Differential Revision: https://reviews.llvm.org/D153848
Propeller and pseudo-probes map profiles back to Machine IR via basic block addresses that are stored in metadata sections.
Empty basic blocks (basic blocks without real code) obfuscate the profile mapping because their addresses collide with those of their next basic blocks.
They also constrain block placement: for instance, the fall-through block of an empty block should always be adjacent to it, since otherwise a completely unnecessary jump would be added.
This patch adds a MachineFunction pass named `GCEmptyBasicBlocks` which attempts to garbage-collect the empty blocks before the `BasicBlockSections` pass.
This pass removes each empty basic block after redirecting its incoming edges to its fall-through block.
The garbage-collection is not complete. We keep the empty block in 4 cases:
1. The empty block is an exception handling pad.
2. The empty block has its address taken.
3. The empty block is the last block of the function and it has
predecessors.
4. The empty block is the only block of the function.
The first three cases are extremely rare in normal code (no cases for the clang binary). Removing the blocks under the first two cases requires modifying exception handling structures and operands of non-terminator instructions -- which is doable but not worth the additional complexity in the pass.
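As a hedged sketch, the keep-this-block decision from the four cases above
could be written as a predicate like the following (illustrative, not the
pass's code):

```
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineFunction.h"

// Return true if an empty block must be kept rather than garbage-collected.
static bool mustKeepEmptyBlock(const llvm::MachineBasicBlock &MBB) {
  const llvm::MachineFunction &MF = *MBB.getParent();
  if (MBB.isEHPad())                           // case 1: exception handling pad
    return true;
  if (MBB.hasAddressTaken())                   // case 2: address taken
    return true;
  if (&MBB == &MF.back() && !MBB.pred_empty()) // case 3: last block with preds
    return true;
  return MF.size() == 1;                       // case 4: only block
}
```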
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D107534
Rewrites, as MIR patterns, some simple rules that cause little to no codegen regressions.
I may have missed some easy cases, but some other rules have intentionally been left as-is
because bigger changes are needed to make them work.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D157690
This reverts commit 317a0fe5bd7113c0ac9d30b2de58ca409e5ff754.
This reverts commit 30c4b97aec60895a6905816670f493cdd1d7c546.
See the post-commit discussion on https://reviews.llvm.org/D157750: we should
use a different mechanism to handle the error with --cuda-gpu-arch=.
The changes to IR/DiagnosticInfo.cpp, warn_drv_for_elf_only, the codegen tests
in clang/test/Driver, and the following driver behavior (downgrading the error
to a warning) are undesired.
```
% clang --target=riscv64 -fsplit-machine-functions -c a.c
warning: -fsplit-machine-functions is not valid for riscv64 [-Wbackend-plugin]
```
The class 'DbgVariable' can be in one of three states, and the "is any of them
initialized" logic is repeated in a couple of places. We may want to
expand this class in the future; as such, we factor out this common logic so
that it is easier to modify.
Differential Revision: https://reviews.llvm.org/D158438
Use getFixedValue instead of getKnownMinValue to convert TypeSize
to uint64_t. I believe this would have caught the bug fixed by
D157872.
To prevent false failures, I had to treat a scalable 0 as if it
is a fixed value.
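A minimal sketch of the difference (illustrative, not the patched call sites):
getKnownMinValue silently drops the scalable flag, while getFixedValue asserts
that the size really is fixed.

```
#include "llvm/Support/TypeSize.h"

// Convert a TypeSize to uint64_t, treating a scalable zero as if it were
// fixed (as this patch does) so it doesn't trip the assertion.
static uint64_t toFixedSize(llvm::TypeSize TS) {
  if (TS.isScalable() && TS.getKnownMinValue() == 0)
    return 0;
  return TS.getFixedValue(); // asserts on non-zero scalable sizes
}
```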
Reviewed By: paulwalker-arm
Differential Revision: https://reviews.llvm.org/D158115
This is a reland of 46d2d7599d9ed5e68fb53e910feb10d47ee2667b, which was
reverted because it broke the build
https://lab.llvm.org/buildbot/#/builders/21/builds/78779. However, that
buildbot failure was spurious, caused by Flang::underscoring.f90 being
nondeterministic.
Current behavior for relaxing out-of-range conditional branches
is to invert the conditional and insert a fallthrough unconditional
branch to the original destination. This approach biases the branch
predictor in the wrong direction, which can degrade performance.
Machine function splitting introduces many rarely-taken cross-section
conditional branches, which are improperly relaxed. Avoid inverting
these branches; instead, retarget them to trampolines at the end of the
function. Doing so increases the runtime cost of jumping to cold code
but eliminates the misprediction cost of jumping to hot code.
Differential Revision: https://reviews.llvm.org/D156837
D152276 wasn't handling the case where the inserted element is implicitly truncated into the vector - resulting in an i1 element (implicitly truncated from i8) overwriting 8 bits instead of 1 bit.
This patch is intended to be merged into 17.x so I've just disallowed any vector element vs inserted element type mismatch - technically we could be more elegant and permit truncated stores (as long as the store is still byte sized), but the use cases for that are so limited I'd prefer to play it safe for now.
Candidate patch for #64655 17.x merge
Differential Revision: https://reviews.llvm.org/D158366
Expand (s/z/any)ext instructions to be compatible with more
types for GlobalISel.
This patch mainly focuses on 64-bit and 128-bit vectors with
element sizes that are powers of 2.
It also notably handles larger-than-legal vectors.
Differential Revision: https://reviews.llvm.org/D157113
This patch removes `getBBIDOrNumber`, which was introduced to allow emitting version 1.
Reviewed By: shenhan
Differential Revision: https://reviews.llvm.org/D158299
Some opcodes in MIR are defined to be convergent by the target by setting
IsConvergent in the corresponding TD file. For example, in AMDGPU, the opcodes
G_SI_CALL and G_INTRINSIC* are marked as convergent. But this is too
conservative, since calls to functions that do not execute convergent operations
should not be marked convergent. This information is available in LLVM IR.
The new flag MIFlag::NoConvergent now allows the IR translator to mark an
instruction as not performing any convergent operations. It is relevant only on
occurrences of opcodes that are marked isConvergent in the target.
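As a hedged sketch of how such a flag might be applied (not the IRTranslator
code): only instructions whose opcode the target marks isConvergent are
candidates, and the flag is set when the IR-level operation is known not to be
convergent.

```
#include "llvm/CodeGen/MachineInstr.h"

// Drop the conservative convergence assumption on a single instruction.
static void maybeMarkNotConvergent(llvm::MachineInstr &MI,
                                   bool IRIsConvergent) {
  if (MI.getDesc().isConvergent() && !IRIsConvergent)
    MI.setFlag(llvm::MachineInstr::NoConvergent);
}
```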
Differential Revision: https://reviews.llvm.org/D157475
SwiftErrorValueTracking creates vregs at swifterror use sites and then
connects them with appropriate definitions after instruction selection.
To propagate swifterror values SwiftErrorValueTracking::propagateVRegs
iterates over basic blocks in RPO, but some vregs previously created
at use sites may be located in blocks that became unreachable after
instruction selection. Because of that there will no definition for
such vregs and that may cause issues down the pipeline.
To ensure that all vregs created by SwiftErrorValueTracking are
defined, propagateVRegs was updated to insert an IMPLICIT_DEF at the
beginning of unreachable blocks containing swifterror uses.
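A hedged sketch of the kind of fix described (not the actual propagateVRegs
change): give the otherwise-undefined vreg a definition at the top of the
unreachable block.

```
#include "llvm/CodeGen/MachineBasicBlock.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/CodeGen/TargetOpcodes.h"

// Insert an IMPLICIT_DEF of VReg at the beginning of MBB so later passes
// always see a definition for the swifterror vreg.
static void defineAtBlockStart(llvm::MachineBasicBlock &MBB,
                               const llvm::TargetInstrInfo &TII,
                               llvm::Register VReg) {
  llvm::BuildMI(MBB, MBB.begin(), llvm::DebugLoc(),
                TII.get(llvm::TargetOpcode::IMPLICIT_DEF), VReg);
}
```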
Related issue: https://github.com/llvm/llvm-project/issues/59751
Reviewed By: compnerd
Differential Revision: https://reviews.llvm.org/D141053
Targets may lose some optimization opportunities for certain vector operations
if we reduce BUILD_VECTOR to BITCAST early.
On the other hand, if the VT is not legal, reducing BUILD_VECTOR to BITCAST before
LegalizeTypes can be beneficial, because the type legalizer often scalarizes vectors
with illegal types.
Reviewed By: sebastian-ne
Differential Revision: https://reviews.llvm.org/D156645
On the insert_subvector side, we already have the restriction. With D158201, we'd start getting unprofitable splat combines unless we add the same one on the extract_subvector side.
Differential Revision: https://reviews.llvm.org/D158202
Refactor BasicBlockSections to use the target-specific noop insertion
hook from TargetInstrInfo instead of building it ourselves. Using the
TII hook is cleaner and makes it easier to extend BBSections to
non-X86 targets.
Differential Revision: https://reviews.llvm.org/D158303
We have an existing DAG combine for when an insert/extract subvector pair is entirely a nop, but we hadn't handled the case where the net result was either an insert or an extract (but not both). The transform is restricted to index = 0 to avoid having to adjust indices after the transform.
Differential Revision: https://reviews.llvm.org/D158201
DemandedBits is forced to all ones if there are multiple users.
The changed X86 test cases look like they were miscompiles before.
The value of eax/rax from the cmov is returned from the function in
addition to being used by the sar. That usage needs all bits even
though the sar doesn't.
We have an existing DAG combine for when an insert/extract subvector pair is entirely a nop, but we hadn't handled the case where the net result was either an insert or an extract (but not both). The transform is restricted to index = 0 to avoid having to adjust indices after the transform.
Reviewers, a couple of comments on the test changes:
* Mostly RISCV, mostly schedule reordering.
* One real regression in splats-with-mixed-vl.ll due to a different overly aggressive combine; a fix will come in a follow-up patch.
* The test/CodeGen/X86/vector-replicaton-i1-mask.ll diff looked concerning at first, but note that the mask size is at most 4 i1s. I think the type changes on the mask loads are correct, but I would welcome a second opinion from someone more familiar with AVX512 codegen.
Differential Revision: https://reviews.llvm.org/D158201
This reverts commit 54d663d5896008c09c938f80357e2a056454bc65, which breaks the test CodeGen/SystemZ/ctpop-01.ll for stage2-ubsan check (see https://lab.llvm.org/buildbot/#/builders/85/builds/18410)
I manually confirmed that the test had been passing immediately prior to that commit
(BUILDBOT_REVISION=4772c66cfb00d60f8f687930e9dd3aa1b6872228 llvm-zorg/zorg/buildbot/builders/sanitizers/buildbot_bootstrap_ubsan.sh)
This modifies the G_UADDE legalization to a version that looks shorter
on Mips and RISC-V when feeding the equivalent IR to SelectionDAG.
This also removes the boolean select from G_USUBE.
Comments taken from LegalizeDAG and tweaked.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D158232
RISC-V found a case where the CombineTo caused N to be CSEd with
an existing node and then deleted. The top level DAGCombiner loop
was surprised to find a node was deleted, but SDValue() was returned
from the visit function.
We need to return SDValue(N, 0) to tell the top level loop that
a change was made, but the worklist updates were already handled.
Fixes #64772.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D158208
If carryin was 1 and RHS was 0xffffffff, we were not producing a carry
out.
In that case Res would be equal to LHS, so Res <u LHS would be false.
But there should be a carry out, since carryin+RHS wraps around to 0.
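For illustration, a sketch of a correct expansion (plain C++, not the
legalizer code):

```
#include <cstdint>

// Expand a 32-bit add-with-carry into (result, carry-out). Computing the
// carry-out as only "Res < LHS" misses CarryIn == 1 && RHS == 0xffffffff,
// where Res == LHS even though the addition wrapped.
struct AddCarry { uint32_t Res; bool CarryOut; };

static AddCarry uadde(uint32_t LHS, uint32_t RHS, bool CarryIn) {
  uint32_t Sum = LHS + RHS;       // first addition
  bool Carry1 = Sum < LHS;        // LHS + RHS overflowed
  uint32_t Res = Sum + CarryIn;   // add the incoming carry
  bool Carry2 = Res < Sum;        // Sum + CarryIn overflowed
  return {Res, Carry1 || Carry2}; // at most one of the two can be set
}
```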
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D157943
A simple function that scalarizes Ops and then ExtOrTruncs them according to the function parameters.
Differential Revision: https://reviews.llvm.org/D157733