llvm-project

mirror of https://github.com/llvm/llvm-project.git synced 2025-05-02 22:06:05 +00:00

Author	SHA1	Message	Date
Felix (Ting Wang)	5b05870953	[PowerPC] Support local-dynamic TLS relocation on AIX (#66316 ) Supports TLS local-dynamic on AIX, generates below sequence of code: ``` .tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier .tc mh[TC],mh[TC]@ml # Module handle for the caller lwz 3,mh[TC]$2$ $$ For 64-bit: ld 3,mh[TC]$2$ bla .__tls_get_mod # Modifies r0,r3,r4,r5,r11,lr,cr0 #r3 = &TLS for module lwz 4,foo[TC]$2$ $$ For 64-bit: ld 4,foo[TC]$2$ add 5,3,4 # Compute &foo .rename mh[TC], "\_$TLSML" # Symbol for the module handle must have the name "_$TLSML" ``` --------- Co-authored-by: tingwang <tingwang@tingwangs-MBP.lan> Co-authored-by: tingwang <tingwang@tingwangs-MacBook-Pro.local>	2024-03-01 08:09:40 +08:00
Nico Weber	184ca39529	[llvm] Move CodeGenTypes library to its own directory (#79444 ) Finally addresses https://reviews.llvm.org/D148769#4311232 :) No behavior change.	2024-01-25 12:01:31 -05:00
Nick Desaulniers	330fa7d2a4	[TargetLowering] Deduplicate choosing InlineAsm constraint between ISels (#67057 ) Given a list of constraints for InlineAsm (ex. "imr") I'm looking to modify the order in which they are chosen. Before doing so, I noticed a fair amount of logic is duplicated between SelectionDAGISel and GlobalISel for this. That is because SelectionDAGISel is also trying to lower immediates during selection. If we detangle these concerns into: 1. choose the preferred constraint 2. attempt to lower that constraint Then we can slide down the list of constraints until we find one that can be lowered. That allows the implementation to be shared between instruction selection frameworks. This makes it so that later I might only need to adjust the priority of constraints in one place, and have both selectors behave the same.	2023-09-25 08:53:03 -07:00
Nick Desaulniers	86735a4353	reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66264 ) reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) This reverts commit ee643b706be2b6bef9980b25cc9cc988dab94bb5. Fix up build failures in targets I missed in #66003 Kept as 3 commits for reviewers to see better what's changed. Will squash when merging. - reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) - fix all the targets I missed in #66003 - fix off by one found by llvm/test/CodeGen/SystemZ/inline-asm-addr.ll	2023-09-13 13:31:24 -07:00
Ting Wang	71be020dda	[SelectionDAG][PowerPC] Memset reuse vector element for tail store On PPC there are instructions to store element from vector(e.g. stxsdx/stxsiwx), and these instructions can be leveraged to avoid tail constant in memset and constant splat array initialization. This patch tries to explore these opportunities. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D138883	2023-09-06 01:52:38 -04:00
Qiu Chaofan	21bea1a208	[PowerPC] Support initial-exec TLS relocation on AIX Add TLS_IE relocation type to XCOFF writer, and emit code sequence for initial-exec TLS variables. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D156292	2023-08-30 16:22:16 +08:00
Amy Kwan	f5ae075048	[AIX][TLS] Generate 32-bit local-exec access code sequence This patch adds support for the TLS local-exec access model on AIX to allow for the ability to generate the 32-bit (specifically, non-optimized) code sequence. This work is a follow up of D149722. The particular sequence that is generated for this sequence is as follows: ``` .tc var[TC],var[TL]@le. // variable offset, with the le relocation specifier bla .__get_tpointer() // get the thread pointer, modifies r3 lwz reg1, var[TC](2) // load the variable offset add reg2, r3, reg1 // add the variable offset to the retrieved thread pointer ``` Differential Revision: https://reviews.llvm.org/D152669	2023-06-20 11:57:38 -05:00
Amy Kwan	d5659808b2	[AIX][TLS] Generate 64-bit local-exec access code sequence This patch adds support for the TLS local-exec access model on AIX to allow for the ability to generate the 64-bit (specifically, non-optimized) code sequence. For this patch in particular, the sequence that is generated involves a load of the variable offset, followed by an add of the loaded variable offset to r13 (which is thread pointer, respectively). This code sequence looks like the following: ``` ld reg1,var[TC](2) add reg2, reg1, r13 // r13 contains the thread pointer ``` The TOC (.tc pseudo-op) entries generated in the assembly files are also changed where we add the @le relocation for the variable offset. Differential Revision: https://reviews.llvm.org/D149722	2023-06-19 12:17:30 -05:00
Qiu Chaofan	69bc8ff766	Reland "[PowerPC] Simplify fp-to-int store optimization" The build failure should be fixed by de681d53. Follow-up refactor will be done in future patches. This reverts commit e7c5ced0b9f0551ea17e1d2b48be86f03a772c59.	2023-06-05 13:53:08 +08:00
Vitaly Buka	e7c5ced0b9	Revert "[PowerPC] Simplify fp-to-int store optimization" Breaks https://lab.llvm.org/buildbot/#/builders/18/builds/9118 This reverts commit 8064caf83fb166b709bfe0e7641c5181341cb064.	2023-05-24 10:05:28 -07:00
Qiu Chaofan	8064caf83f	[PowerPC] Simplify fp-to-int store optimization On PowerPC VSX targets, fp-to-int will be transformed into xscv with mfvsr. When the result is to be stored, mfvsr can be replaced by a direct store. This change simplifies the optimization by using existing fp-to-int code, which helps CSE and handling strictfp cases. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D141473	2023-05-23 16:40:54 +08:00
NAKAMURA Takumi	c1221251fb	Restore CodeGen/MachineValueType.h from `Support` This is rework of; - rG13e77db2df94 (r328395; MVT) Since `LowLevelType.h` has been restored to `CodeGen`, `MachinveValueType.h` can be restored as well. Depends on D148767 Differential Revision: https://reviews.llvm.org/D149024	2023-05-03 00:13:20 +09:00
Qiu Chaofan	5b8ea2d0e1	[PowerPC] Lower IS_FPCLASS by test data class instruction Power ISA 3.0 introduced new 'test data class' instructions, which accept flags for: NaN/Infinity/Zero/Denormal. This instruction can be used to implement custom lowering for llvm.is.fpclass, but some extra bits provided by the intrinsic are missing (normal and QNaN/SNaN). For those categories not natively supported, this patch uses a two-way or three-way combination to implement correct behavior. Reviewed By: sepavloff, shchenz Differential Revision: https://reviews.llvm.org/D140381	2023-04-03 11:37:17 +08:00
Craig Topper	219ff07f72	[Targets] Rename Flag->Glue. NFC Long long ago Glue was called Flag, and it was never completely renamed.	2023-04-02 19:28:51 -07:00
Simon Pilgrim	da570ef1b4	[DAG] Match select(icmp(x,y),sub(x,y),sub(y,x)) -> abd(x,y) patterns Pulled out of PowerPC, and added ABDS support as well (hence the additional v4i32 PPC matches) Differential Revision: https://reviews.llvm.org/D144789	2023-03-14 15:10:30 +00:00
Ting Wang	bd4562976c	[PowerPC][NFC] cleanup isEligibleForTCO The input parameter IsByValArg to isEligibleForTCO() is false in all cases, so it is considered redundant and should be removed. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D145028	2023-03-02 23:04:19 -05:00
Ting Wang	65f68812d3	[PowerPC] update PPCTTIImpl::supportsTailCallFor() check conditions This patch reuse `PPCTargetLowering::isEligibleForTCO()` to check `PPCTTIImpl::supportsTailCallFor()`. Fixes #59315 Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D140369	2023-02-28 22:29:16 -05:00
Simon Pilgrim	8757ce4901	[PowerPC] Replace PPCISD::VABSD cases with generic ISD::ABDU(X,Y) node A move towards using the generic ISD::ABDU nodes on more backends Also support ISD::ABDS for v4i32 types using the existing signbit flip trick PowerPC has a select(icmp_ugt(x,y),sub(x,y),sub(y,x)) -> abdu(x,y) combine that I intend to move to DAGCombiner in a future patch. The ABS(SUB(X,Y)) -> PPCISD::VABSD(X,Y,1) v4i32 combine wasn't legal (https://alive2.llvm.org/ce/z/jc2hLU) - so I've removed it, having already added the legal sub nsw tests equivalent. Differential Revision: https://reviews.llvm.org/D142313	2023-02-25 20:17:17 +00:00
Ting Wang	d567e06946	[PowerPC][NFC] refactor eligible check for tail call optimization The check logic for TCO is scattered in two functions: IsEligibleForTailCallOptimization_64SVR4() IsEligibleForTailCallOptimization(), and serves instruction selection phase only at this moment. This patch aims to refactor existing logic to export an API for TCO eligible query before instruction selection phase. Reviewed By: shchenz, nemanjai Differential Revision: https://reviews.llvm.org/D141673	2023-02-21 06:14:47 -05:00
Matt Arsenault	09dd4d870e	DAG: Remove hasBitPreservingFPLogic This doesn't make sense as an option. fneg and fabs are bit preserving by definition. If a target has some fneg or fabs instruction that are not bitpreserving it's incorrect to lower fneg/fabs to use it.	2023-02-14 10:25:24 -04:00
Qiu Chaofan	a40ef656d8	[Intrinsic] Rename flt.rounds intrinsic to get.rounding Address the inconsistency between FLT_ROUNDS_ and SET_ROUNDING SDAG node. Rename FLT_ROUNDS_ to GET_ROUNDING and add llvm.get.rounding intrinsic to replace flt.rounds. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D139507	2022-12-19 15:22:39 +08:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Krzysztof Parzyszek	864aaa21b4	TargetLowering: convert Optional to std::optional	2022-12-01 16:19:10 -08:00
Maryam Moghadas	934d5fa2b8	[PowerPC] Exploit xxperm, check for dead vectors and substitute vperm with xxperm vperm instruction requires the data to be in the Altivec registers, if one of the vector operands is not used after this vperm instruction then it can be substituted by xxperm which doubles the number of available registers. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D133700	2022-11-23 13:28:12 -06:00
Stanislav Mekhanoshin	bcaf31ec3f	[AMDGPU] Allow finer grain control of an unaligned access speed A target can return if a misaligned access is 'fast' as defined by the target or not. In reality there can be different levels of 'fast' and 'slow'. This patch changes the boolean 'Fast' argument of the allowsMisalignedMemoryAccesses family of functions to an unsigned representing its speed. A target can still define it as it wants and the direct translation of the current code uses 0 and 1 for current false and true. This makes the change an NFC. Subsequent patch will start using an actual value of speed in the load/store vectorizer to compare if a vectorized access going to be not just fast, but not slower than before. Differential Revision: https://reviews.llvm.org/D124217	2022-11-17 09:23:53 -08:00
Nemanja Ivanovic	4ea121c904	[PowerPC] Fix a number of inefficiencies and issues with atomic code gen There are a few issues with the code we generate for atomic operations and the way we generate it: - Hard coded CR0 for compares - Order of operands for compares not conducive to emitting compare-immediate or for CSE of compares - Missing MachineMemOperand for st[bhwd]cx intrinsics - Missing intrinsic properties for the same - Unnecessary blocks with store conditional instructions to clear reservation (which ends up hindering performance) - Move from CR instructions just to compare the result of a store conditional with zero (even though it is a record-form) This patch aims to resolve all of those issues. Differential revision: https://reviews.llvm.org/D134783	2022-10-03 19:55:29 -05:00
Paul Scoropan	ce004fb4f2	[PowerPC] XCOFF exception section support on the direct assembler path This feature implements support for making entries in the exception section on XCOFF on the direct assembly path using the ".except" pseudo-op. It also provides functionality to lower entries (comprised of language and reason codes) into the exception section through the use of annotation metadata attached to llvm.ppc.trap/trapd/tw/tdw intrinsics. Integrated assembler support will be provided in another review. https://reviews.llvm.org/D133030 needs to merge first for LIT tests Reviewed By: shchenz, RKSimon Differential Revision: https://reviews.llvm.org/D132146	2022-09-26 22:24:20 -04:00
Josh Stone	4dcfb09e40	[NFC][CodeGen] Use const MF in TargetLowering stack probe functions This makes them callable from places like canUseAsPrologue. Differential Revision: https://reviews.llvm.org/D134492	2022-09-23 09:30:32 -07:00
Simon Pilgrim	f9de13232f	[X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis. For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling. Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU. Differential Revision: https://reviews.llvm.org/D132520	2022-08-24 17:28:18 +01:00
Masoud Ataei	96515df816	[PowerPC] Fix the check for scalar MASS conversion Proposing to move the check for scalar MASS conversion from constructor of PPCTargetLowering to the lowerLibCallBase function which decides about the lowering. The Target machine option Options.PPCGenScalarMASSEntries is set in PPCTargetMachine.cpp. But an object of the class PPCTargetLowering is created in one of the included header files. So, the constructor will run before setting PPCGenScalarMASSEntries to correct value. So, we cannot check this option in the constructor. Differential: https://reviews.llvm.org/D128653 Reviewer: @bmahjour	2022-07-06 11:44:00 -07:00
Kai Luo	549e118e93	[PowerPC] Support 16-byte lock free atomics on pwr8 and up Make 16-byte atomic type aligned to 16-byte on PPC64, thus consistent with GCC. Also enable inlining 16-byte atomics on non-AIX targets on PPC64. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D122377	2022-04-08 23:25:56 +00:00
Shao-Ce SUN	21ac474392	[NFC] Correct typo `interger` to `integer`	2022-02-17 21:17:47 +08:00
Amy Kwan	ac5a5a9cfe	[PowerPC] Add default handling for single element vectors, and split/promote vNi1 vectors. This patch updates the handling of vectors in getPreferredVectorAction(): For single-element and scalable vectors, fall back to default vector legalization handling. For vNi1 vectors, add handling to either split or promote them in order to prevent the production of wide v256i1/v512i1 types. The following assertion is fixed by this patch, as we ended up producing the wide vector types (that are used for MMA) in the backend prior to this fix. ``` Assertion failed: VT.getSizeInBits() == Operand.getValueSizeInBits() && "Cannot BITCAST between types of different sizes!" ``` Differential Revision: https://reviews.llvm.org/D119521	2022-02-15 08:44:08 -06:00
Ting Wang	097a95f2df	[PowerPC] Add custom lowering for SELECT_CC fp128 using xsmaxcqp Power ISA 3.1 adds xsmaxcqp/xsmincqp for quad-precision type-c max/min selection, and this opens the opportunity to improve instruction selection on: llvm.maxnum.f128, llvm.minnum.f128, and select_cc ordered gt/lt and (don't care) gt/lt. Reviewed By: nemanjai, shchenz, amyk Differential Revision: https://reviews.llvm.org/D117006	2022-02-09 21:48:28 -05:00
Masoud Ataei	256d253332	[PowerPC] Scalar IBM MASS library conversion pass This patch introduces the conversions from math function calls to MASS library calls. To resolves calls generated with these conversions, one need to link libxlopt.a library. This patch is tested on PowerPC Linux and AIX. Differential: https://reviews.llvm.org/D101759 Reviewer: bmahjour	2022-02-02 07:54:19 -08:00
Qiu Chaofan	c2cc70e4f5	[NFC] Fix endif comments to match with include guard	2022-01-07 15:52:59 +08:00
Nemanja Ivanovic	c9cb8edc51	[PowerPC] Allow scalars for asm constraint "v" with VSX Similarly to what GCC does, we should allow scalars with the "v" constraint rather than introducing unnecessary new constraints for scalars in Altivec registers. Differential revision: https://reviews.llvm.org/D113635	2021-11-23 17:03:04 -06:00
Nemanja Ivanovic	5840f7197d	[PowerPC] Respect rounding mode in the back end Currently, the floating point instructions that depend on rounding mode are correctly marked in the PPC back end with an implicit use of the RM register. Similarly, instructions that explicitly define the register are marked with an implicit def of the same register. So for the most part, RM-using code won't be moved across RM-setting instructions. However, calls are not marked as RM-setting instructions so code can be moved across calls. This is generally desired, but so is the ability to turn off this behaviour with an appropriate option - and -frounding-math really should be that option. This patch provides a set of call instructions (for direct and indirect calls) that are marked with an implicit def of the RM register. These will be used for calls that are marked with the strictfp attribute. Differential revision: https://reviews.llvm.org/D111433	2021-11-10 08:19:58 -06:00
Chen Zheng	5a8b196340	[PowerPC] handle more splat loads without stack operation This mostly improves splat loads code generation on Power7 Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D106555	2021-11-03 05:17:41 +00:00
Arthur Eubanks	a0a4935182	Make more places that use alignment use uint64_t Followup to D110451.	2021-10-08 16:35:19 -07:00
Amy Kwan	5041a485b9	[PowerPC] Exploit Prefixed Load/Stores using the refactored Load/Store Implementation This patch exploits the prefixed load and store instructions utilizing the refactored load/store implementation introduced in D93370. Prefixed load and store instructions are emitted whenever we are loading or storing a value with an offset that fits into a 34-bit signed immediate. Patterns for the prefixed load and stores are added in this patch, as well as the implementation that detects when we are loading and storing a value with an offset that fits in 34-bits. Differential Revision: https://reviews.llvm.org/D96075	2021-09-14 08:39:49 -05:00
Amy Kwan	351a0d8a90	[PowerPC] Update PC-Relative Load/Store Patterns to use the refactored Load/Store Implementation This patch updates the PC-Relative load and store patterns to utilize the refactored load/store implementation introduced in D93370. PC-Relative implementation has been added to PPCISelLowering.cpp, and also the patterns in PPCInstrPrefix.td have been updated and no longer require AddedComplexity. All existing test cases pass with this update. Differential Revision: https://reviews.llvm.org/D95116	2021-09-09 15:38:42 -05:00
Kai Luo	5eaebd5d64	[PowerPC] Implement quadword atomic load/store Add support to load/store i128 atomically. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D105612	2021-09-01 06:55:40 +00:00
Kai Luo	b9c3941cd6	[PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D103614	2021-07-15 01:12:09 +00:00
David Spickett	e4ecd83fe9	[llvm][AArch64] Handle arrays of struct properly (from IR) This only applies to FastIsel. GlobalIsel seems to sidestep the issue. This fixes https://bugs.llvm.org/show_bug.cgi?id=46996 One of the things we do in llvm is decide if a type needs consecutive registers. Previously, we just checked if it was an array or not. (plus an SVE specific check that is not changing here) This causes some confusion when you arbitrary IR like: ``` %T1 = type { double, i1 }; define [ 1 x %T1 ] @foo() { entry: ret [ 1 x %T1 ] zeroinitializer } ``` We see it is an array so we call CC_AArch64_Custom_Block which bails out when it sees the i1, a type we don't want to put into a block. This leaves the location of the double in some kind of intermediate state and leads to odd codegen. Which then crashes the backend because it doesn't know how to implement what it's been asked for. You get this: ``` renamable $d0 = FMOVD0 $w0 = COPY killed renamable $d0 ``` Rather than this: ``` $d0 = FMOVD0 $w0 = COPY $wzr ``` The backend knows how to copy 64 bit to 64 bit registers, but not 64 to 32. It can certainly be taught how but the real issue seems to be us even trying to assign a register block in the first place. This change makes the logic of AArch64TargetLowering::functionArgumentNeedsConsecutiveRegisters a bit more in depth. If we find an array, also check that all the nested aggregates in that array have a single member type. Then CC_AArch64_Custom_Block's assumption of a type that looks like [ N x type ] will be valid and we get the expected codegen. New tests have been added to exercise these situations. Note that some of the output is not ABI compliant. The aim of this change is to simply handle these situations and not to make our processing of arbitrary IR ABI compliant. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D104123	2021-06-16 13:56:01 +00:00
Nikita Popov	1ffa6499ea	[TargetLowering] Use IRBuilderBase instead of IRBuilder<> (NFC) Don't require a specific kind of IRBuilder for TargetLowering hooks. This allows us to drop the IRBuilder.h include from TargetLowering.h. Differential Revision: https://reviews.llvm.org/D103759	2021-06-06 16:29:50 +02:00
Anshil Gandhi	1c5ff0b03f	[PowerPC] [GlobalISel] Implementation of formal arguments lowering in the IRTranslator for the PPC backend Differential Revision: https://reviews.llvm.org/D99812	2021-06-02 16:46:39 -06:00
Anshil Gandhi	3e5ddb83e3	Revert "Differential Revision: https://reviews.llvm.org/D99812 " This reverts commit c729f2a48a6ef6b20554494c5630082c89c3680c.	2021-06-02 16:36:00 -06:00
Anshil Gandhi	c729f2a48a	Differential Revision: https://reviews.llvm.org/D99812	2021-06-02 14:09:52 -06:00
Jinsong Ji	b2581196eb	[AIX] Enable stackprotect feature AIX use `__ssp_canary_word` instead of `__stack_chk_guard`. This patch update the target hook to use correct symbol, so that the basic stackprotect feature can work. The traceback will be handled in follow up patch. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D103100	2021-05-28 02:18:15 +00:00

1 2 3 4 5 ...

522 Commits