Add FIR generation and LLVMIR translation support for mergeable clause
on task construct. If mergeable clause is present on a task, the
relevant flag in `ompt_task_flag_t` is set and passed to
`__kmpc_omp_task_alloc`.
fir.call side effects are hard to describe in a useful way using
`MemoryEffectOpInterface` because it is impossible to list which memory
location a user procedure read/write without doing a data flow analysis
of its body (even PURE procedures may read from any module variable,
Fortran SIMPLE procedure from F2023 will allow that, but they are far
from common at that point).
Fortran language specifications allow the compiler to deduce
that a procedure call cannot access a variable in many cases
This patch leverages this to extend `fir::AliasAnalysis::getModRef` to
deal with fir.call.
This will allow implementing "array = array_function()" optimization in
a future patch.
This patch adds a flag to mark hlfir.declare of host variables that are
captured in some internal procedure.
It enables implementing a simple fir.call handling in
fir::AliasAnalysis::getModRef leveraging Fortran language specifications
and without a data flow analysis.
This will allow implementing an optimization for "array =
array_function()" where array storage is passed directly into the hidden
result argument to "array_function" when it can be proven that
arraY_function does not reference "array".
Captured host variables are very tricky because they may be accessed
indirectly in any calls if the internal procedure address was captured
via some global procedure pointer. Without flagging them, there is no
way around doing a complex inter procedural data flow analysis:
- checking that the call is not made to an internal procedure is not
enough because of the possibility of indirect calls made to internal
procedures inside the callee.
- checking that the current func.func has no internal procedure is not
enough because this would be invalid with inlining when an procedure
with internal procedures is inlined inside a procedure without internal
procedure.
- Update the runtime entry points to accept a stream information
- Update the conversion of `cuf.allocate` to pass correctly the stream
information when present.
Note that the stream is not currently used in the runtime. This will be
done in a separate patch as a design/solution needs to be down together
with the allocators.
The conditions in a test did not match the target that was being
requested. This resulted in a test failure when building with
-DTARGETS_TO_BUILD=X86. This is now fixed.
This patch adds support for BIND(C) derived types as return values
matching the AArch64 Procedure Call Standard for C.
Support for BIND(C) derived types as value parameters will be in a
separate patch.
This removes the specialized parsers and helper classes for these
clauses, namely ConcatSeparated, MapModifiers, and MotionModifiers. Map
and the motion clauses are now handled in the same way as all other
clauses with modifiers, with one exception: the commas separating their
modifiers are optional. This syntax is deprecated in OpenMP 5.2.
Implement version checks for modifiers: for a given modifier on a given
clause, check if that modifier is allowed on this clause in the
specified OpenMP version. This replaced several individual checks.
Add a testcase for handling map modifiers in a different order, and for
diagnosing an ultimate modifier out of position.
This PR adds support seq_cst (sequential consistency) clause for the
flush directive in OpenMP. The seq_cst clause enforces a stricter memory
ordering, ensuring that all threads observe the memory effects of the
flush in the same order, improving consistency in memory operations
across threads.
---------
Co-authored-by: Shashwathi N <nshashwa@pe28vega.hpc.amslabs.hpecorp.net>
Co-authored-by: CHANDRA GHALE <chandra.nitdgp@gmail.com>
Two PRs were merged at the same time: one that modified `maybeApplyToV`
function, and shortly afterwards, this (the reverted) one that had the
old definition.
During the merge both definitions were retained leading to compilation
errors.
Reapply the reverted PR (1a08b15589) with the duplicate removed.
A recent commit (23d7a6cedb519853508) introduced a dependency on
libLLVMMC.so. This is to handle the `-print-supported-cpus` option which
uses `llvm/MC/SubtargetInfo`. It requires libLLVMMC to be linked into
the flang-driver which the previous commit did not do. This fixes that
issue.
The aliases are -mcpu=help and -mtune=help. There is still an issue with
the output which prints an example line that references clang. That is
not fixed here because it is printed in llvm/MC/SubtargetInfo.cpp. Some
more thought is needed to determine how best to handle this.
Fixes#117010
Issue semantic warning for any combination of nested OMP TARGET
directives inside another OMP TARGET region.
This change would not affect OMP TARGET inside an OMP TARGET DATA.
However, it issues warning for OMP TARGET DATA inside an OMP TARGET
region.
This actually simplifies the AST node for the schedule clause: the two
allowed modifiers can be easily classified as the ordering-modifier and
the chunk-modifier during parsing without the need to create additional
classes.
Also, define helper macros in parse-tree.h.
Apply the new modifier representation to the DEFAULTMAP and REDUCTION
clauses, with testcases utilizing the new modifier validation.
OpenMP modifier overhaul: #3/3
The character 'e' or 'd' (either case) shouldn't be tokenized as part of
a real literal during preprocessing if it is not followed by an
optionally-signed digit string. Doing so prevents it from being
recognized as a macro name, or as the start of one.
Fixes https://github.com/llvm/llvm-project/issues/115676.
When a fixed form source line begins with the name of a macro, don't
emit the usual warning message about a non-decimal character in the
label field. (The check for a macro was only being applied to free form
source lines, and the label field checking was unconditional).
Fixes https://github.com/llvm/llvm-project/issues/116914.
When there's an error in a DO statement loop control, error recovery
isn't great. A bare "DO" is a valid statement, so a failure to parse its
loop control doesn't fail on the whole statement. Its partial parse ends
after the keyword, and as some other statement parsers can get further
into the input before failing, errors in the loop control can lead to
confusing error messages about bad pointer assignment statements and
others. So just check that a bare "DO" is followed by the end of the
statement.
In the runtime's implementation of floating-point SUM, the
implementation of Kahan's algorithm for increased precision is
incorrect. The running correction factor should be subtracted from each
new data item, not added to it. This fix ensures that the sum of 100M
random default real values between 0. and 1. is close to 5.E7.
See https://en.wikipedia.org/wiki/Kahan_summation_algorithm.
This patch adds parallelization support for the following expression in OpenMP
workshare constructs:
* Elemental procedures in array expressions
(reapplied with linking fix)
This patch fixes:
flang/include/flang/Semantics/openmp-modifiers.h:45:69: error: extra
';' outside of a function is incompatible with C++98
[-Werror,-Wc++98-compat-extra-semi]
The main issue to solve is that OpenMP modifiers can be specified in any
order, so the parser cannot expect any specific modifier at a given
position. To solve that, define modifier to be a union of all allowable
specific modifiers for a given clause.
Additionally, implement modifier descriptors: for each modifier the
corresponding descriptor contains a set of properties of the modifier
that allow a common set of semantic checks. Start with the syntactic
properties defined in the spec: Required, Unique, Exclusive, Ultimate,
and implement common checks to verify each of them.
OpenMP modifier overhaul: #2/3
This is the first part of the effort to make parsing of clause modifiers
more uniform and robust. Currently, when multiple modifiers are allowed,
the parser will expect them to appear in a hard-coded order.
Additionally, modifier properties (such as "ultimate") are checked
separately for each case.
The overall plan is
1. Extract all modifiers into their own top-level classes, and then
equip them with sets of common properties that will allow performing the
property checks generically, without refering to the specific kind of
the modifier.
2. Define a parser (as a separate class) for each modifier.
3. For each clause define a union (std::variant) of all allowable
modifiers, and parse the modifiers as a list of these unions.
The intent is also to isolate parts of the code that could eventually be
auto-generated.
OpenMP modifier overhaul: #1/3
For the situation where scope is implemented to 5.1 standard, check that
the 5.2 are still "not yet implemented" (or some other partial
implementation).
This prepares for using the DECLARE MAPPER construct.
A check in lowering will say "Not implemented" when trying to use a
mapper as some code is required to tie the mapper to the declared one.
Senantics check for the symbol generated.
The interfaces of separate module procedures are sufficiently well
defined in a submodule to be used in a local generic interface; the
compiler just needed to work a little harder to find them.
Fixes https://github.com/llvm/llvm-project/issues/116567.
A procedure pointer is allowed to be initialized with the subprogram in
which it is local, assuming that other requirements are satisfied.
Add a good test for local procedure pointer initialization, as no test
existed for the error message in question.
Fixes https://github.com/llvm/llvm-project/issues/116566.
When the src of the data transfer is a constant, it needs to be
materialized in memory to be able to perform a data transfer.
```
subroutine sub1()
real, device :: a(10)
integer :: I
do i = 5, 10
a(i) = -4.0
end do
end
```
Update the implicit global detection by looking for them in the CUF
kernel and also update to a walk so nested `fir.address_of` in nested
statement are also accounted for.
In loongarch64 LP64D ABI, `unsigned 32-bit` types, such as unsigned int,
are stored in general-purpose registers as proper sign extensions of
their 32-bit values. Therefore, Flang also follows it if a function
needs to be interoperable with C.
Reference:
https://github.com/loongson/la-abi-specs/blob/release/lapcs.adoc#Fundamental-types
Add a new pass that lowers an `omp.workshare` with its binding `omp.workshare.loop_wrapper` loop nests into other OpenMP constructs that can be lowered to LLVM.
More specifically, in order to preserve the sequential execution semantics of the code contained, it wraps portions that needs to be executed on a single thread in `omp.single` blocks, converts code that must be parallelized into `omp.wsloop` nests and inserts the appropriate synchronization.
This alternative loop nest generation is used to generate an OpenMP loop nest instead of fir loops to facilitate parallelizing statements in an OpenMP `workshare` construct.