673 Commits

Author SHA1 Message Date
Michael Kruse
a22236120f [OpenMP] Implement '#pragma omp unroll'.
Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations).

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99459
2021-06-10 14:30:17 -05:00
Ten Tzen
797ad70152 [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1
This patch is the Part-1 (FE Clang) implementation of HW Exception handling.

This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).
This is the first step of this project; only X86_64 target is enabled in this patch.

Compiler options:
For clang-cl.exe, the option is -EHa, the same as MSVC.
For clang.exe, the extra option is -fasync-exceptions,
plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow
three rules:
* First, no exception can move in or out of _try region., i.e., no "potential
  faulty instruction can be moved across _try boundary.
* Second, the order of exceptions for instructions 'directly' under a _try
  must be preserved (not applied to those in callees).
* Finally, global states (local/global/heap variables) that can be read
  outside of _try region must be updated in memory (not just in register)
  before the subsequent exception occurs.

The impact to C++ code:
Although SEH is a feature for C code, -EHa does have a profound effect on C++
side. When a C++ function (in the same compilation unit with option -EHa ) is
called by a SEH C function, a hardware exception occurs in C++ code can also
be handled properly by an upstream SEH _try-handler or a C++ catch(...).
As such, when that happens in the middle of an object's life scope, the dtor
must be invoked the same way as C++ Synchronous Exception during unwinding
process.

Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.

This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.

One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:

A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing
SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others.  If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:

Warning: jump bypasses variable with a non-trivial destructor

In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.

Implementation:
Part-1: Clang implementation described below.

Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end().
_scope_begin() is immediately added after ctor() is called and EHStack is pushed.
So it must be an invoke, not a call. With that it's also guaranteed an
EH-cleanup-pad is created regardless whether there exists a call in this scope.
_scope_end is added before dtor(). These two intrinsics make the computation of
Block-State possible in downstream code gen pass, even in the presence of
ctor/dtor inlining.

Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark
_try boundary and to prevent from exceptions being moved across _try boundary.
All memory instructions inside a _try are considered as 'volatile' to assure
2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's
acceptable as the amount of code directly under _try is very small.

Part-2 (will be in Part-2 patch): LLVM implementation described below.

For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).

For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.

The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Differential Revision: https://reviews.llvm.org/D80344/new/
2021-05-17 22:42:17 -07:00
cynecx
8ec9fd4839 Support unwinding from inline assembly
I've taken the following steps to add unwinding support from inline assembly:

1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax:

```
invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"()
    to label %exit unwind label %uexit
```

2.) Add Bitcode writing/reading support + LLVM-IR parsing.

3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled.

4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind.

5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled.

6.) Don't allow unwinding callbr.

Reviewed By: Amanieu

Differential Revision: https://reviews.llvm.org/D95745
2021-05-13 19:13:03 +01:00
Florian Hahn
6c31295493
[clang] Refactor mustprogress handling, add it to all loops in c++11+.
Currently Clang does not add mustprogress to inifinite loops with a
known constant condition, matching C11 behavior. The forward progress
guarantee in C++11 and later should allow us to add mustprogress to any
loop (http://eel.is/c++draft/intro.progress#1).

This allows us to simplify the code dealing with adding mustprogress a
bit.

Reviewed By: aaron.ballman, lebedev.ri

Differential Revision: https://reviews.llvm.org/D96418
2021-04-30 14:13:47 +01:00
Joshua Haberman
8344675908 Implemented [[clang::musttail]] attribute for guaranteed tail calls.
This is a Clang-only change and depends on the existing "musttail"
support already implemented in LLVM.

The [[clang::musttail]] attribute goes on a return statement, not
a function definition. There are several constraints that the user
must follow when using [[clang::musttail]], and these constraints
are verified by Sema.

Tail calls are supported on regular function calls, calls through a
function pointer, member function calls, and even pointer to member.

Future work would be to throw a warning if a users tries to pass
a pointer or reference to a local variable through a musttail call.

Reviewed By: rsmith

Differential Revision: https://reviews.llvm.org/D99517
2021-04-15 17:12:21 -07:00
cchen
e0c2125d1d [OpenMP] Added codegen for masked directive
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D100514
2021-04-15 12:55:07 -05:00
cchen
1a43fd2769 [OpenMP51] Initial support for masked directive and filter clause
Adds basic parsing/sema/serialization support for the #pragma omp masked
directive.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D99995
2021-04-09 14:00:36 -05:00
Mike Rice
b7899ba0e8 [OPENMP51]Initial support for the dispatch directive.
Added basic parsing/sema/serialization support for dispatch directive.

Differential Revision: https://reviews.llvm.org/D99537
2021-03-30 14:12:53 -07:00
Roman Lebedev
e3a4701627
[clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of branch weights
08196e0b2e1f8aaa8a854585335c17ba479114df exposed LowerExpectIntrinsic's
internal implementation detail in the form of
LikelyBranchWeight/UnlikelyBranchWeight options to the outside.

While this isn't incorrect from the results viewpoint,
this is suboptimal from the layering viewpoint,
and causes confusion - should transforms also use those weights,
or should they use something else, D98898?

So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight
internal again, and fixing all the code that used it directly,
which currently is only clang codegen, thankfully,
to emit proper @llvm.expect intrinsics instead.
2021-03-21 22:50:21 +03:00
Benjamin Kramer
19d2c65ddd [CodeGen] Don't crash on for loops with cond variables and no increment
This looks like an oversight from a875721d8a2d, creating IR that refers
to `for.inc` even if it doesn't exist.

Differential Revision: https://reviews.llvm.org/D98980
2021-03-19 20:43:52 +01:00
Richard Smith
a875721d8a PR49585: Emit the jump destination for a for loop 'continue' from within the scope of the condition variable.
The condition variable is in scope in the loop increment, so we need to
emit the jump destination from wthin the scope of the condition
variable.

For GCC compatibility (and compatibility with real-world 'FOR_EACH'
macros), 'continue' is permitted in a statement expression within the
condition of a for loop, though, so there are two cases here:

* If the for loop has no condition variable, we can emit the jump
  destination before emitting the condition.

* If the for loop has a condition variable, we must defer emitting the
  jump destination until after emitting the variable. We diagnose a
  'continue' appearing in the initializer of the condition variable,
  because it would jump past the initializer into the scope of that
  variable.

Reviewed By: rjmccall

Differential Revision: https://reviews.llvm.org/D98816
2021-03-17 16:24:04 -07:00
Mike Rice
410f09af09 [OPENMP51]Initial support for the interop directive.
Added basic parsing/sema/serialization support for interop directive.
Support for the 'init' clause.

Differential Revision: https://reviews.llvm.org/D98558
2021-03-17 09:42:07 -07:00
Michael Kruse
b119120673 [clang][OpenMP] Use OpenMPIRBuilder for workshare loops.
Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to:
 * Only the worksharing-loop directive.
 * Recognizes only the nowait clause.
 * No loop nests with more than one loop.
 * Untested with templates, exceptions.
 * Semantic checking left to the existing infrastructure.

This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics:
 * The distance function: How many loop iterations there will be before entering the loop nest.
 * The loop variable function: Conversion from a logical iteration number to the loop variable.

These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang.

The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does.

For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting.

Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset.

Reviewed By: jdenny

Differential Revision: https://reviews.llvm.org/D94973
2021-03-04 22:52:59 -06:00
Michael Kruse
6c05005238 [OpenMP] Implement '#pragma omp tile', by Michael Kruse (@Meinersbur).
The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard.

This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult.

A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once.

I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest).

Differential Revision: https://reviews.llvm.org/D76342
2021-02-16 09:45:07 -08:00
Freddy Ye
1edb76cc91 [X86] merge "={eax}" and "~{eax}" into "=&eax" for MSInlineASM
Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D94466
2021-01-27 22:54:17 +08:00
Alan Phipps
9f2967bcfe [Coverage] Add support for Branch Coverage in LLVM Source-Based Code Coverage
This is an enhancement to LLVM Source-Based Code Coverage in clang to track how
many times individual branch-generating conditions are taken (evaluate to TRUE)
and not taken (evaluate to FALSE).  Individual conditions may comprise larger
boolean expressions using boolean logical operators.  This functionality is
very similar to what is supported by GCOV except that it is very closely
anchored to the ASTs.

Differential Revision: https://reviews.llvm.org/D84467
2021-01-05 09:51:51 -06:00
Thorsten Schütt
2fd11e0b1e Revert "[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)"
This reverts commit efc82c4ad2bcb256a4f4c20238d08cd3afba4d2d.
2021-01-04 23:17:45 +01:00
Thorsten Schütt
efc82c4ad2 [NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)
Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D93765
2021-01-04 22:58:26 +01:00
Florian Hahn
680931af27
[Matrix] Adjust matrix pointer type for inline asm arguments.
Matrix types in memory are represented as arrays, but accessed through
vector pointers, with the alignment specified on the access operation.

For inline assembly, update pointer arguments to use vector pointers.
Otherwise there will be a mis-match if the matrix is also an
input-argument which is represented as vector.

Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D91631
2020-11-18 11:44:11 +00:00
Atmn Patel
fd3cad7a60 [clang] Fix ForStmt mustprogress handling
D86841 had an error where for statements with no conditional were
required to make progress. This is not true, this patch removes that
line, and adds regression tests.

Differential Revision: https://reviews.llvm.org/D91075
2020-11-09 11:38:06 -05:00
Atmn Patel
ac73b73c16 [clang] Add mustprogress and llvm.loop.mustprogress attribute deduction
Since C++11, the C++ standard has a forward progress guarantee
[intro.progress], so all such functions must have the `mustprogress`
requirement. In addition, from C11 and onwards, loops without a non-zero
constant conditional or no conditional are also required to make
progress (C11 6.8.5p6). This patch implements these attribute deductions
so they can be used by the optimization passes.

Differential Revision: https://reviews.llvm.org/D86841
2020-11-04 22:03:14 -05:00
Mark de Wever
b46fddf75f [CodeGen] Implement [[likely]] and [[unlikely]] for while and for loop.
The attribute has no effect on a do statement since the path of execution
will always include its substatement.

It adds a diagnostic when the attribute is used on an infinite while loop
since the codegen omits the branch here. Since the likelihood attributes
have no effect on a do statement no diagnostic will be issued for
do [[unlikely]] {...} while(0);

Differential Revision: https://reviews.llvm.org/D89899
2020-10-31 17:51:29 +01:00
Jonas Paulsson
42a82862b6 Reapply "[clang] Improve handling of physical registers in inline
assembly operands."

Earlyclobbers are now excepted from this change (original commit: c78da03).

Review: Ulrich Weigand, Nick Desaulniers

Differential Revision: https://reviews.llvm.org/D87279
2020-10-21 10:53:40 +02:00
Mark de Wever
2bcda6bb28 [Sema, CodeGen] Implement [[likely]] and [[unlikely]] in SwitchStmt
This implements the likelihood attribute for the switch statement. Based on the
discussion in D85091 and D86559 it only handles the attribute when placed on
the case labels or the default labels.

It also marks the likelihood attribute as feature complete. There are more QoI
patches in the pipeline.

Differential Revision: https://reviews.llvm.org/D89210
2020-10-18 13:48:42 +02:00
Jonas Paulsson
625fa47617 Revert "[clang] Improve handling of physical registers in inline assembly operands."
This reverts commit c78da037783bda0f27f4d82060149166e6f0c796.

Temporarily reverted due to https://bugs.llvm.org/show_bug.cgi?id=47837.
2020-10-14 08:42:51 +02:00
Jonas Paulsson
c78da03778 [clang] Improve handling of physical registers in inline assembly operands.
Change EmitAsmStmt() to

- Not tie physregs with the "+r" constraint, but instead add the hard
  register as an input constraint. This makes "+r" and "=r":"r" look the same
  in the output.

  Background: Macro intensive user code may contain inline assembly
  statements with multiple operands constrained to the same physreg. Such a
  case (with the operand constraints "+r" : "r") currently triggers the
  TwoAddressInstructionPass assertion against any extra use of a tied
  register. Furthermore, TwoAddress will insert a COPY to that physreg even
  though isel has already done so (for the non-tied use), which may lead to a
  second redundant instruction currently. A simple fix for this is to not
  emit tied physreg uses in the first place for the "+r" constraint, which is
  what this patch does.

- Give an error on multiple outputs to the same physical register.

  This should be reported and this is also what GCC does.

Review: Ulrich Weigand, Aaron Ballman, Jennifer Yu, Craig Topper

Differential Revision: https://reviews.llvm.org/D87279
2020-10-13 15:09:52 +02:00
Mark de Wever
1113fbf44c [CodeGen] Improve likelihood branch weights
Bruno De Fraine discovered some issues with D85091. The branch weights
generated for `logical not` and `ternary conditional` were wrong. The
`logical and` and `logical or` differed from the code generated of
`__builtin_predict`.

Adjusted the generated code for the likelihood to match
`__builtin_predict`. The patch is based on Bruno's suggestions.

Differential Revision: https://reviews.llvm.org/D88363
2020-10-04 14:24:27 +02:00
Mark de Wever
08196e0b2e Implements [[likely]] and [[unlikely]] in IfStmt.
This is the initial part of the implementation of the C++20 likelihood
attributes. It handles the attributes in an if statement.

Differential Revision: https://reviews.llvm.org/D85091
2020-09-09 20:48:37 +02:00
Aaron Puchert
916b750a8d [CodeGen] Use existing EmitLambdaVLACapture (NFC) 2020-08-19 15:20:05 +02:00
Wang, Pengfei
18581fd2c4 [CFE] Add nomerge function attribute to inline assembly.
Sometimes we also want to avoid merging inline assembly. This patch add
the nomerge function attribute to inline assembly.

Reviewed By: zequanwu

Differential Revision: https://reviews.llvm.org/D84225
2020-07-22 08:22:58 +08:00
Tyker
51e4aa87e0 attempt to fix failing buildbots after 3bab88b7baa20b276faaee0aa7ca87f636c91877
Prevent IR-gen from emitting consteval declarations

Summary: with this patch instead of emitting calls to consteval function. the IR-gen will emit a store of the already computed result.
2020-06-15 12:58:37 +02:00
Kirill Bobyrev
550c4562d1 Revert "Prevent IR-gen from emitting consteval declarations"
This reverts commit 3bab88b7baa20b276faaee0aa7ca87f636c91877.

This patch causes test failures:
http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/17260
2020-06-15 12:14:15 +02:00
Tyker
3bab88b7ba Prevent IR-gen from emitting consteval declarations
Summary: with this patch instead of emitting calls to consteval function. the IR-gen will emit a store of the already computed result.

Reviewers: rsmith

Reviewed By: rsmith

Subscribers: cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D76420
2020-06-15 10:47:14 +02:00
Akira Hatanaka
c9a52de002 [CodeGen] Simplify the way lifetime of block captures is extended
Rather than pushing inactive cleanups for the block captures at the
entry of a full expression and activating them during the creation of
the block literal, just call pushLifetimeExtendedDestroy to ensure the
cleanups are popped at the end of the scope enclosing the block
expression.

rdar://problem/63996471

Differential Revision: https://reviews.llvm.org/D81624
2020-06-11 16:06:22 -07:00
Alexey Bataev
bd1c03d7b7 [OPENMP50]Codegen for inscan reductions in worksharing directives.
Summary:
Implemented codegen for reduction clauses with inscan modifiers in
worksharing constructs.

Emits the code for the directive with inscan reductions.
The code is the following:
```
size num_iters = <num_iters>;
<type> buffer[num_iters];
for (i: 0..<num_iters>) {
  <input phase>;
  buffer[i] = red;
}
for (int k = 0; k != ceil(log2(num_iters)); ++k)
for (size cnt = last_iter; cnt >= pow(2, k); --k)
  buffer[i] op= buffer[i-pow(2,k)];
for (0..<num_iters>) {
  red = InclusiveScan ? buffer[i] : buffer[i-1];
  <scan phase>;
}
```

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, arphaman, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D79948
2020-06-04 16:29:33 -04:00
Alexey Bataev
9ca5a6d3b5 [OPENMP]Fix PR46146: Do not consider globalized variables as NRVO candidates.
Summary:
If the variables must be globalized in OpenMP mode (local automatic
variable, GPU compilation mode, the variable may escape its declaration
context by the reference or by the pointer), it should not be considered
as the NRVO candidate. Otherwise, incorrect the return value of the
function might not be updated.

Reviewers: jdoerfert

Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin

Tags: #clang

Differential Revision: https://reviews.llvm.org/D80936
2020-06-04 12:33:25 -04:00
Zequan Wu
e36076ee3a [clang] Add nomerge function attribute to clang
Differential Revision: https://reviews.llvm.org/D79121
2020-05-21 17:07:39 -07:00
Zequan Wu
b0a0f01bc1 Revert "Add nomerge function attribute to clang"
This reverts commit 307e85395485e1eff9533b2d7952b16f33ceae38.
2020-05-21 16:13:18 -07:00
Zequan Wu
307e853954 Add nomerge function attribute to clang 2020-05-21 15:28:27 -07:00
Florian Hahn
338be9c595 [Clang] Add llvm.loop.unroll.disable to loops with -fno-unroll-loops.
Currently Clang does not respect -fno-unroll-loops during LTO. During
D76916 it was suggested to respect -fno-unroll-loops on a TU basis.

This patch uses the existing llvm.loop.unroll.disable metadata to
disable loop unrolling explicitly for each loop in the TU if
unrolling is disabled. This should ensure that loops from TUs compiled
with -fno-unroll-loops are skipped by the unroller during LTO.

This also means that if a loop from a TU with -fno-unroll-loops
gets inlined into a TU without this option, the loop won't be
unrolled.

Due to the fact that some transforms might drop loop metadata, there
potentially are cases in which we still unroll loops from TUs with
-fno-unroll-loops. I think we should fix those issues rather than
introducing a function attribute to disable loop unrolling during LTO.
Improving the metadata handling will benefit other use cases, like
various loop pragmas, too. And it is an improvement to clang completely
ignoring -fno-unroll-loops during LTO.

If that direction looks good, we can use a similar approach to also
respect -fno-vectorize during LTO, at least for LoopVectorize.

In the future, this might also allow us to remove the UnrollLoops option
LLVM's PassManagerBuilder.

Reviewers: Meinersbur, hfinkel, dexonsmith, tejohnson

Reviewed By: Meinersbur, tejohnson

Differential Revision: https://reviews.llvm.org/D77058
2020-04-07 14:01:55 +01:00
Alexey Bataev
fcba7c3534 [OPENMP50]Initial support for scan directive.
Addedi basic parsing/sema/serialization support for scan directive.
2020-03-20 07:58:15 -04:00
Kerry McLaughlin
af64948e2a [SVE][Inline-Asm] Add constraints for SVE ACLE types
Summary:
Adds the constraints described below to ensure that we
can tie variables of SVE ACLE types to operands in inline-asm:
 - y: SVE registers Z0-Z7
 - Upl: One of the low eight SVE predicate registers (P0-P7)
 - Upa: Full range of SVE predicate registers (P0-P15)

Reviewers: sdesmalen, huntergr, rovka, cameron.mcinally, efriedma, rengolin

Reviewed By: efriedma

Subscribers: miyuki, tschuett, rkruppe, psnobl, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D75690
2020-03-17 11:04:19 +00:00
Alexey Bataev
c112e941a0 [OPENMP50]Add basic support for depobj construct.
Added basic parsing/sema/serialization support for depobj directive.
2020-03-02 13:10:32 -05:00
Reid Kleckner
86565c1309 Avoid SourceManager.h include in RawCommentList.h, add missing incs
SourceManager.h includes FileManager.h, which is expensive due to
dependencies on LLVM FS headers.

Remove dead BeforeThanCompare specialization.

Sink ASTContext::addComment to cpp file.

This reduces the time to compile a file that does nothing but include
ASTContext.h from ~3.4s to ~2.8s for me.

Saves these includes:
    219 -    ../clang/include/clang/Basic/SourceManager.h
    204 -    ../clang/include/clang/Basic/FileSystemOptions.h
    204 -    ../clang/include/clang/Basic/FileManager.h
    165 -    ../llvm/include/llvm/Support/VirtualFileSystem.h
    164 -    ../llvm/include/llvm/Support/SourceMgr.h
    164 -    ../llvm/include/llvm/Support/SMLoc.h
    161 -    ../llvm/include/llvm/Support/Path.h
    141 -    ../llvm/include/llvm/ADT/BitVector.h
    128 -    ../llvm/include/llvm/Support/MemoryBuffer.h
    124 -    ../llvm/include/llvm/Support/FileSystem.h
    124 -    ../llvm/include/llvm/Support/Chrono.h
    124 -    .../MSVCSTL/include/stack
    122 -    ../llvm/include/llvm-c/Types.h
    122 -    ../llvm/include/llvm/Support/NativeFormatting.h
    122 -    ../llvm/include/llvm/Support/FormatProviders.h
    122 -    ../llvm/include/llvm/Support/CBindingWrapping.h
    122 -    .../MSVCSTL/include/xtimec.h
    122 -    .../MSVCSTL/include/ratio
    122 -    .../MSVCSTL/include/chrono
    121 -    ../llvm/include/llvm/Support/FormatVariadicDetails.h
    118 -    ../llvm/include/llvm/Support/MD5.h
    109 -    .../MSVCSTL/include/deque
    105 -    ../llvm/include/llvm/Support/Host.h
    105 -    ../llvm/include/llvm/Support/Endian.h

Reviewed By: aaron.ballman, hans

Differential Revision: https://reviews.llvm.org/D75279
2020-02-27 13:49:40 -08:00
Bill Wendling
50cac24877 Support output constraints on "asm goto"
Summary:
Clang's "asm goto" feature didn't initially support outputs constraints. That
was the same behavior as gcc's implementation. The decision by gcc not to
support outputs was based on a restriction in their IR regarding terminators.
LLVM doesn't restrict terminators from returning values (e.g. 'invoke'), so
it made sense to support this feature.

Output values are valid only on the 'fallthrough' path. If an output value's used
on an indirect branch, then it's 'poisoned'.

In theory, outputs *could* be valid on the 'indirect' paths, but it's very
difficult to guarantee that the original semantics would be retained. E.g.
because indirect labels could be used as data, we wouldn't be able to split
critical edges in situations where two 'callbr' instructions have the same
indirect label, because the indirect branch's destination would no longer be
the same.

Reviewers: jyknight, nickdesaulniers, hfinkel

Reviewed By: jyknight, nickdesaulniers

Subscribers: MaskRay, rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D69876
2020-02-24 18:51:29 -08:00
serge_sans_paille
e67cbac812 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 10:42:45 +01:00
serge-sans-paille
4546211600 Revert "Support -fstack-clash-protection for x86"
This reverts commit 0fd51a4554f5f4f90342f40afd35b077f6d88213.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-win-x-armv7l/builds/4354
2020-02-09 10:06:31 +01:00
serge_sans_paille
0fd51a4554 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with proper LiveIn
declaration, better option handling and more portable testing.

Differential Revision: https://reviews.llvm.org/D68720
2020-02-09 09:35:42 +01:00
serge-sans-paille
658495e6ec Revert "Support -fstack-clash-protection for x86"
This reverts commit e229017732bcf1911210903ee9811033d5588e0d.

Failures:

http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/2604
http://lab.llvm.org:8011/builders/llvm-clang-win-x-aarch64/builds/4308
2020-02-08 14:26:22 +01:00
serge_sans_paille
e229017732 Support -fstack-clash-protection for x86
Implement protection against the stack clash attack [0] through inline stack
probing.

Probe stack allocation every PAGE_SIZE during frame lowering or dynamic
allocation to make sure the page guard, if any, is touched when touching the
stack, in a similar manner to GCC[1].

This extends the existing `probe-stack' mechanism with a special value `inline-asm'.
Technically the former uses function call before stack allocation while this
patch provides inlined stack probes and chunk allocation.

Only implemented for x86.

[0] https://www.qualys.com/2017/06/19/stack-clash/stack-clash.txt
[1] https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00556.html

This a recommit of 39f50da2a357a8f685b3540246c5d762734e035f with better option
handling and more portable testing

Differential Revision: https://reviews.llvm.org/D68720
2020-02-08 13:31:52 +01:00