224 Commits

Author SHA1 Message Date
Sean Fertile
2d80505401
[AIX] Support per global code model. (#79202)
Exploit the per global code model attribute on AIX. On AIX we need to
update both the code sequence used to access the global (either 1 or 2
instructions for small and large code model respectively) and the
storage mapping class that we emit the toc entry.

---------

Co-authored-by: Amy Kwan <akwan0907@gmail.com>
2024-03-15 12:52:04 -04:00
Zaara Syeda
00ba2a6f1e
[AIX][TOC] Fix buildbot failures from commit b4ae8df (#85303)
The following tests fail when built with Address
and Undefined sanitizers:
CodeGen/PowerPC/basic-toc-data-def.ll
CodeGen/PowerPC/toc-data-large-array2.ll

Subtarget may be null in emitGlobalVariable, for example in the testcase
where we have no functions in the IR. The fix moves this function from
PPCSubtarget to a static helper function. This only fails with
sanitizers because the Subtarget is not used in the member function.
2024-03-14 16:26:58 -04:00
Zaara Syeda
37b5eb0a0a
[AIX][TOC] Add -mtocdata/-mno-tocdata options on AIX (#67999)
This patch enables support that the XL compiler had for AIX under
-qdatalocal/-qdataimported.
2024-03-13 10:26:31 -04:00
Amy Kwan
598cccea80 [AIX][TLS] Generate optimized local-exec access code sequence using X-Form loads/stores
This patch is a follow up to D149722, D152669 and D153645, where a slightly more
optimized code sequence is generated for 64-bit and 32-bit local-exec accesses
when optimizations are turned on.

Handling is added PPCISelDAGToDAG.cpp in order to check if any D-form loads or
stores that follow an PPCISD::ADD_TLS can be optimized to use an X-Form load or
store. In this particular situation, this allows the ADD_TLS node to be removed
completely.

Differential Revision: https://reviews.llvm.org/D150367
2023-07-06 07:57:05 -05:00
Kai Luo
96aaebd12e [MachineCopyPropagation] Eliminate spillage copies that might be caused by eviction chain
Remove spill-reload like copy chains. For example
```
r0 = COPY r1
r1 = COPY r2
r2 = COPY r3
r3 = COPY r4
<def-use r4>
r4 = COPY r3
r3 = COPY r2
r2 = COPY r1
r1 = COPY r0
```
will be folded into
```
r0 = COPY r1
r1 = COPY r4
<def-use r4>
r4 = COPY r1
r1 = COPY r0
```

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D122118
2023-02-08 03:34:25 +00:00
Archibald Elliott
62c7f035b4 [NFC][TargetParser] Remove llvm/ADT/Triple.h
I also ran `git clang-format` to get the headers in the right order for
the new location, which has changed the order of other headers in two
files.
2023-02-07 12:39:46 +00:00
Kai Nacke
70a5d8e4c4 [PPC] Add support for tune-cpu attribute
clang (like gcc) has the -mtune= command line option. This option
adds the "tune-cpu" attribute to a function. The intended functionality
is that the scheduling model of that cpu is used. E.g. -mtune=pwr9 -march=pwr8
generates only instructions supported on pwr8 but uses the scheduling model
of pwr9 for it.
This PR adds the infrastructure to support this in LLVM.
clang support was added in https://reviews.llvm.org/D130526.

Reviewed By: amyk, qiucf

Differential Revision: https://reviews.llvm.org/D138317
2023-01-06 18:01:48 +00:00
Kai Nacke
5ebdd838fb [PowerPC] Simplify PPCSubtarget
The flags, initialization of the flags, and the getter methods for
features defined in PPC.td can be generated by TableGen.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D140028
2022-12-15 15:09:30 +00:00
Chen Zheng
454758ab69 [PowerPC] add a new subtarget feature fastMFLR
Some PowerPC CPU may have slow MFLR instruction, so we need to
schedule the MFLR and its store in function prologue away to
hidden the long latency for slow MFLR instruction.

This patch adds a new feature fastMFLR and the new feature will
be used in https://reviews.llvm.org/D137423.

Reviewed By: RolandF

Differential Revision: https://reviews.llvm.org/D137612
2022-11-10 00:07:47 -05:00
Stefan Pintilie
610eb39c68 [PowerPC][Future] Add an ISA Future to go with mcpu=future.
On Power PC we have ISA3.0 for Power 9, ISA3.1 for Power 10.
This patchs adds an ISA for mcpu=future. The idea is to have a placeholder ISA
for work that is experimental and may not be supported by existing ISAs.

Reviewed By: lei

Differential Revision: https://reviews.llvm.org/D126075
2022-05-26 09:19:58 -05:00
Mircea Trofin
cb2160760e [nfc][codegen] Move RegisterBank[Info].h under CodeGen
This wraps up from D119053. The 2 headers are moved as described,
fixed file headers and include guards, updated all files where the old
paths were detected (simple grep through the repo), and `clang-format`-ed it all.

Differential Revision: https://reviews.llvm.org/D119876
2022-03-01 21:53:25 -08:00
Qiu Chaofan
e3c2694da9 [PowerPC] Implement general back2back fusion
Implement 'back-to-back' FX fusion according to Power10 User Manual
'19.1.5.4 Fusion', not enabled by default.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D114345
2021-12-06 10:15:05 +08:00
Qiu Chaofan
59f4b3d308 [PowerPC] Implement more fusion types for Power10
This implements the rest of Power10 instruction fusion pairs, according
to user manual, including 'wide immediate', 'load compare', 'zero move'
and 'SHA3 assist'.

Only 'SHA3 assist' is enabled by default.

Reviewed By: shchenz

Differential Revision: https://reviews.llvm.org/D112912
2021-11-23 17:21:17 +08:00
Qiu Chaofan
9b5e2b5261 [PowerPC] Implement basic macro fusion in Power10
Including basic fusion types around arithmetic and logical instructions.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D111693
2021-11-08 17:23:56 +08:00
Stefan Pintilie
fb4e44c4e7 [PowerPC] The builtins load8r and store8r are Power 7 plus.
This patch makes sure that the builtins __builtin_ppc_load8r and
__ builtin_ppc_store8r are only available for Power 7 and up.
Currently the builtins seem to produce incorrect code if used for
Power 6 or before.

Reviewed By: nemanjai, #powerpc

Differential Revision: https://reviews.llvm.org/D110653
2021-09-29 14:34:40 -05:00
Kai Luo
b9c3941cd6 [PowerPC] Generate inlined quadword lock free atomic operations via AtomicExpand
This patch uses AtomicExpandPass to implement quadword lock free atomic operations. It adopts the method introduced in https://reviews.llvm.org/D47882, which expand atomic operations post RA to avoid spilling that might prevent LL/SC progress.

Reviewed By: jsji

Differential Revision: https://reviews.llvm.org/D103614
2021-07-15 01:12:09 +00:00
Victor Huang
781929b423 [PowerPC][NFC] Power ISA features for Semachecking
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.

Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>

Differential revision: https://reviews.llvm.org/D105501
2021-07-13 13:13:34 -05:00
Victor Huang
e4585d3f4e Revert "[PowerPC][NFC] Power ISA features for Semachecking"
This reverts commit 10e0cdfc6526578c8892d895c0448e77cb9ba876.
2021-07-13 13:13:34 -05:00
Victor Huang
10e0cdfc65 [PowerPC][NFC] Power ISA features for Semachecking
[NFC] This patch adds features for pwr7, pwr8, and pwr9 that can be
used for semachecking builtin functions that are only valid for certain
versions of ppc.

Reviewed By: nemanjai, #powerpc
Authored By: Quinn Pham <Quinn.Pham@ibm.com>

Differential revision: https://reviews.llvm.org/D105501
2021-07-13 10:51:25 -05:00
Kai Luo
c063946476 [AIX] Adjust CSR order to avoid breaking ABI regarding traceback
Allocate non-volatile registers in order to be compatible with ABI, regarding gpr_save.

Quoted from https://www.ibm.com/docs/en/ssw_aix_72/assembler/assembler_pdf.pdf page55,
> The preferred method of using GPRs is to use the volatile registers first. Next, use the nonvolatile registers
> in descending order, starting with GPR31.

This patch is based on @jsji 's initial draft.

Tested on test-suite and SPEC, found no degradation.

Reviewed By: jsji, ZarkoCA, xingxue

Differential Revision: https://reviews.llvm.org/D100167
2021-07-03 04:45:26 +00:00
Stefan Pintilie
91f4c11133 [PowerPC] Add mprivileged option
Add an option to tell the compiler that it can use privileged instructions.

This patch only adds the option. Backend implementation will be added in a
future patch.

Reviewed By: lei, amyk

Differential Revision: https://reviews.llvm.org/D99193
2021-03-24 08:33:22 -05:00
Stefan Pintilie
0e4f5f3ea6 [PowerPC] Change option to mrop-protect
In order to have the same option on power PC LLVM and power PC gcc
the option will be changed from -mrop-protection to -mrop-protect.

The feature will be off by default and turned on when the option is used.

Reviewed By: lei, amyk

Differential Revision: https://reviews.llvm.org/D99185
2021-03-24 05:51:35 -05:00
Stefan Pintilie
b80357d46e [PowerPC] Add option for ROP Protection
Added -mrop-protection for Power PC to turn on codegen that provides some
protection from ROP attacks.

The option is off by default and can be turned on for Power 8, Power 9 and
Power 10.

This patch is for the option only. The feature will be implemented by a later
patch.

Reviewed By: amyk

Differential Revision: https://reviews.llvm.org/D96512
2021-02-18 12:15:50 +00:00
Jinsong Ji
0f588ac03e [PowerPC] Only use some extend mne if assembler is modern enough
Legacy AIX assembly might not support all extended mnes,
add one feature bit to control the generation in MC,
and avoid generating them by default on AIX.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D94458
2021-01-14 20:36:10 +00:00
Nemanja Ivanovic
3f7b4ce960 [PowerPC] Add support for embedded devices with EFPU2
PowerPC cores like e200z759n3 [1] using an efpu2 only support single precision
hardware floating point instructions. The single precision instructions efs*
and evfs* are identical to the spe float instructions while efd* and evfd*
instructions trigger a not implemented exception.

This patch introduces a new command line option -mefpu2 which leads to
single-hardware / double-software code generation.

[1] Core reference:
  https://www.nxp.com/files-static/32bit/doc/ref_manual/e200z759CRM.pdf

Differential revision: https://reviews.llvm.org/D92935
2021-01-12 09:47:00 -06:00
Fangrui Song
bfa6ca07a8 [PowerPC] Delete remnant Darwin ISelLowering code 2021-01-06 21:40:40 -08:00
Qiu Chaofan
bec81dc67d Reland "[PowerPC] Implement instruction clustering for stores"
Commit 3c0b3250 introduced store fusion for PowerPC target, but it
brought failure under UB sanitizer and was reverted. This patch fixes
them.
2020-09-13 19:51:01 +08:00
Kit Barton
009cd4e491 [PPC][GlobalISel] Add initial GlobalIsel infrastructure
This adds the initial GlobalISel skeleton for PowerPC. It can only run
ir-translator and legalizer for `ret void`.

This is largely based on the initial GlobalISel patch for RISCV
(https://reviews.llvm.org/D65219).

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D83100
2020-09-10 11:58:01 -05:00
Qiu Chaofan
8d9c13f37d Revert "[PowerPC] Implement instruction clustering for stores"
This reverts commit 3c0b3250230b3847a2a47dfeacfdb794c2285f02, (along
with ea795304 and bb39eb9e) since it breaks test with UB sanitizer.
2020-09-08 17:24:08 +08:00
Qiu Chaofan
3c0b325023 [PowerPC] Implement instruction clustering for stores
On Power10, it's profitable to schedule some stores with adjacent target
address together. This patch implements this feature.

Reviewed By: steven.zhang

Differential Revision: https://reviews.llvm.org/D86754
2020-09-08 11:03:09 +08:00
Baptiste Saleil
512e256c0d [PowerPC] Add clang options to control MMA support
This patch adds frontend and backend options to enable and disable
the PowerPC MMA operations added in ISA 3.1. Instructions using these
options will be added in subsequent patches.

Differential Revision: https://reviews.llvm.org/D81442
2020-08-24 09:35:55 -05:00
Craig Topper
c7a0b2684f [X86][MC][Target] Initial backend support a tune CPU to support -mtune
This patch implements initial backend support for a -mtune CPU controlled by a "tune-cpu" function attribute. If the attribute is not present X86 will use the resolved CPU from target-cpu attribute or command line.

This patch adds MC layer support a tune CPU. Each CPU now has two sets of features stored in their GenSubtargetInfo.inc tables . These features lists are passed separately to the Processor and ProcessorModel classes in tablegen. The tune list defaults to an empty list to avoid changes to non-X86. This annoyingly increases the size of static tables on all target as we now store 24 more bytes per CPU. I haven't quantified the overall impact, but I can if we're concerned.

One new test is added to X86 to show a few tuning features with mismatched tune-cpu and target-cpu/target-feature attributes to demonstrate independent control. Another new test is added to demonstrate that the scheduler model follows the tune CPU.

I have not added a -mtune to llc/opt or MC layer command line yet. With no attributes we'll just use the -mcpu for both. MC layer tools will always follow the normal CPU for tuning.

Differential Revision: https://reviews.llvm.org/D85165
2020-08-14 15:31:50 -07:00
Baptiste Saleil
7aaa85627b [PowerPC] Add options to control paired vector memops support
Adds frontend and backend options to enable and disable the
PowerPC paired vector memory operations added in ISA 3.1.
Instructions using these options will be added in subsequent patches.

Differential Revision: https://reviews.llvm.org/D83722
2020-07-29 14:00:53 -05:00
Jinsong Ji
d28f86723f Re-land "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit bf544fa1c3cb80f24d85e84559fb11193846259f.

Fixed the typo in PPCInstrInfo.cpp.
2020-07-28 14:00:11 +00:00
Jinsong Ji
bf544fa1c3 Revert "[PowerPC] Remove QPX/A2Q BGQ/BGP CNK support"
This reverts commit adffce71538e219aab4eeb024819baa7687262ff.

This is breaking test-suite, revert while investigation.
2020-07-27 21:07:00 +00:00
Jinsong Ji
adffce7153 [PowerPC] Remove QPX/A2Q BGQ/BGP CNK support
Per RFC http://lists.llvm.org/pipermail/llvm-dev/2020-April/141295.html
no one is making use of QPX/A2Q/BGQ/BGP CNK anymore.

This patch remove the support of QPX/A2Q in llvm, BGQ/BGP in clang,
CNK support in openmp/polly.

Reviewed By: hfinkel

Differential Revision: https://reviews.llvm.org/D83915
2020-07-27 19:24:39 +00:00
Ahsan Saghir
37e72f47a4 [PowerPC] Add -m[no-]power10-vector clang and llvm option
Summary: This patch adds command line option for enabling power10-vector support.

Reviewers: hfinkel, nemanjai, lei, amyk, #powerpc

Reviewed By: lei, amyk, #powerpc

Subscribers: wuzish, kbarton, hiraditya, shchenz, cfe-commits, llvm-commits

Tags: #llvm, #clang, #powerpc

Differential Revision: https://reviews.llvm.org/D80758
2020-06-16 14:47:35 -05:00
Lei Huang
2368bf52cd [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-27 13:14:25 -05:00
Lei Huang
559845f8fe Revert "[PowerPC] Add support for -mcpu=pwr10 in both clang and llvm"
This reverts commit 7eb666b1556b86503f2f386bf921186cdbb2d22a.
2020-05-27 09:40:21 -05:00
Lei Huang
7eb666b155 [PowerPC] Add support for -mcpu=pwr10 in both clang and llvm
Summary:
This patch simply adds support for the new CPU in anticipation of
Power10. There isn't really any functionality added so there are no
associated test cases at this time.

Reviewers: stefanp, nemanjai, amyk, hfinkel, power-llvm-team, #powerpc

Reviewed By: stefanp, nemanjai, amyk, #powerpc

Subscribers: NeHuang, steven.zhang, hiraditya, llvm-commits, wuzish, shchenz, cfe-commits, kbarton, echristo

Tags: #clang, #powerpc, #llvm

Differential Revision: https://reviews.llvm.org/D80020
2020-05-26 13:48:22 -05:00
Kang Zhang
dcc5ff3bc2 [PowerPC] Use PredictableSelectIsExpensive to enable select to branch in CGP
Summary:
This patch will set the variable PredictableSelectIsExpensive to do the
select to if based on BranchProbability in CodeGenPrepare.

When the BranchProbability more than MinPercentageForPredictableBranch,
PPC will convert SELECT to branch.

Reviewed By: nemanjai

Differential Revision: https://reviews.llvm.org/D71883
2020-05-11 15:02:09 +00:00
Simon Pilgrim
d8a4a99161 [PowerPC] Remove unused forward declarations. NFC. 2020-04-23 15:02:18 +01:00
Stefan Pintilie
6c4b40def7 [PowerPC][Future] Add Support For Functions That Do Not Use A TOC.
On PowerPC most functions require a valid TOC pointer.

This is the case because either the function itself needs to use this
pointer to access the TOC or because other functions that are called
from that function expect a valid TOC pointer in the register R2.
The main exception to this is leaf functions that do not access the TOC
since they are guaranteed not to need a valid TOC pointer.

This patch introduces a feature that will allow more functions to not
require a valid TOC pointer in R2.

Differential Revision: https://reviews.llvm.org/D73664
2020-04-08 08:07:35 -05:00
QingShan Zhang
518292dbdf [PowerPC] Add the MacroFusion support for Power8
This patch is intend to implement the missing P8 MacroFusion for LLVM
according to Power8 User's Manual Section 10.1.12 Instruction Fusion

Differential Revision: https://reviews.llvm.org/D70651
2020-03-12 05:15:41 +00:00
Xiangling Liao
660b0d7f7b [AIX] Enable frame pointer for AIX and add related test suite
This patch:
   - enable frame pointer for AIX;
   - update some of red zone comments;
   - add/update testcases;

Differential Revision: https://reviews.llvm.org/D72454
2020-02-10 15:43:41 -05:00
Victor Huang
4b414d9ade [PowerPC][Future] Add pld and pstd to future CPU
Add the prefixed instructions pld and pstd to future CPU. These are load and
store instructions that require new operand types that are 34 bits. This patch
adds the two instructions as well as the operand types required.

Note that this patch also makes a minor change to tablegen to account for the
fact that some instructions are going to require shifts greater than 31 bits
for the new 34 bit instructions.

Differential Revision: https://reviews.llvm.org/D72574
2020-01-28 08:23:29 -06:00
Victor Huang
5cee34013c [PowerPC][Future] Add prefixed instruction paddi to future CPU
Future CPU will include support for prefixed instructions.
These prefixed instructions are formed by a 4 byte prefix
immediately followed by a 4 byte instruction effectively
making an 8 byte instruction. The new instruction paddi
is a prefixed form of addi.

This patch adds paddi and all of the support required
for that instruction. The majority of the patch deals with
supporting the new prefixed instructions. The addition of
paddi is mainly to allow for testing.

Differential Revision: https://reviews.llvm.org/D72569
2020-01-24 07:27:25 -06:00
Fangrui Song
8e1f0974c2 [PowerPC] Delete PPCSubtarget::isDarwin and isDarwinABI
http://lists.llvm.org/pipermail/llvm-dev/2018-August/125614.html developers have agreed to remove Darwin support from POWER backends.

Reviewed By: sfertile

Differential Revision: https://reviews.llvm.org/D72067
2020-01-21 09:54:44 -08:00
Nemanja Ivanovic
a9ad65a2b3 [PowerPC] Change default for unaligned FP access for older subtargets
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=40554

Some CPU's trap to the kernel on unaligned floating point access and there are
kernels that do not handle the interrupt. The program then fails with a SIGBUS
according to the PR. This just switches the default for unaligned access to only
allow it on recent server CPUs that are known to allow this.

Differential revision: https://reviews.llvm.org/D71954
2019-12-28 11:20:52 -06:00
Sean Fertile
93faa237da [PowerPC] Add Support for indirect calls on AIX.
Extends the desciptor-based indirect call support for 32-bit codegen,
and enables indirect calls for AIX.

In-depth Description:
In a function descriptor based ABI, a function pointer points at a
descriptor structure as opposed to the function's entry point. The
descriptor takes the form of 3 pointers: 1 for the function's entry
point, 1 for the TOC anchor of the module containing the function
definition, and 1 for the environment pointer:

struct FunctionDescriptor {
  void *EntryPoint;
  void *TOCAnchor;
  void *EnvironmentPointer;
};

An indirect call has several steps of loading the the information from
the descriptor into the proper registers for setting up the call. Namely
it has to:

1) Save the caller's TOC pointer into the TOC save slot in the linkage
   area, and then load the callee's TOC pointer into the TOC register
   (GPR 2 on AIX).

2) Load the function descriptor's entry point into the count register.

3) Load the environment pointer into the environment pointer register
   (GPR 11 on AIX).

4) Perform the call by branching on count register.

5) Restore the caller's TOC pointer after returning from the indirect call.

A couple important caveats to the above:

- There is no way to directly load a value from memory into the count register.
  Instead we populate the count register by loading the entry point address into
  a gpr and then moving the gpr to the count register.

- The TOC restore has to come immediately after the branch on count register
  instruction (i.e., the 1st instruction executed after we return from the
  call). This is an implementation limitation. We could, in theory, schedule
  the restore elsewhere as long as no uses of the TOC pointer fall in between
  the call and the restore; however, to keep it simple, we insert a pseudo
  instruction that represents both the indirect branch instruction and the
  load instruction that restores the caller's TOC from the linkage area. As
  they flow through the compiler as a single pseudo instruction, nothing can be
  inserted between them and the caller's TOC is then valid at any use.

Differtential Revision: https://reviews.llvm.org/D70724
2019-12-13 20:07:00 -05:00