For btf_type_tag implementation, in order to have the same results with
clang (__attribute__((btf_type_tag("...")))), gcc intends to use c2x
syntax '[[...]]'. Clang also supports similar c2x syntax. Currently, the
clang selftest contains the following five tests:
```
attr-btf_type_tag-func.c
attr-btf_type_tag-similar-type.c
attr-btf_type_tag-var.c
attr-btf_type_tag-func-ptr.c
attr-btf_type_tag-typedef-field.c
```
Tests attr-btf_type_tag-func.c and attr-btf_type_tag-var.c already have
c2x syntax test.
Test attr-btf_type_tag-func-ptr.c does not support c2x syntax when
'__attribute__((...))' is replaced with with '[[...]]'. This should not
be an issue since we do not have use cases for function pointer yet.
This patch added '[[...]]' syntax for
```
attr-btf_type_tag-similar-type.c
attr-btf_type_tag-typedef-field.c
```
Also enable half-precision variants of tgamma, which were previously
missing.
Note that unlike recent work, these builtins are not vectorized as part
of this commit. Ultimately all three call into lgamma_r, which has heavy
control flow (including switch statements) that would be difficult to
vectorize. Additionally the lgamma_r algorithm is copyrighted to SunPro
so may need a rewrite in the future anyway.
There are no codegen changes (to non-SPIR-V targets) with this commit,
aside from the new half builtins.
Like other target statements, the statement associated with the label in
a legacy ASSIGN statement could be inside a construct. Constructs
containing such a target must therefore be marked as unstructured,
fairly similar to how targets are processed in `markBranchTarget`.
We only use `_LIBCPP_DISABLE_EXTENSION_WARNINGS` in a single place while
we use extensions all over the place. The warnings are already disabled,
since libc++'s headers are system headers, so this shouldn't be in any
way observable by users.
There are cases in SPIR-V shaders where values need to be yielded from
the selection region to make valid MLIR. For example (part of the SPIR-V
shader decompiled to GLSL):
```
bool _115
if (_107)
{
// ...
float _200 = fma(...);
// ...
_115 = _200 < _174;
}
else
{
_115 = _107;
}
bool _123;
if (_115)
{
// ...
float _213 = fma(...);
// ...
_123 = _213 < _174;
}
else
{
_123 = _115;
}
````
This patch extends `mlir.selection` so it can return values.
`mlir.merge` is used as a "yield" operation. This allows to maintain a
compatibility with code that does not yield any values, as well as, to
maintain an assumption that `mlir.merge` is the only operation in the
merge block of the selection region.
When creating the name of an outlined region, Clang uses the file's
inode ID to generate a unique name. When the file does not exist, this
causes a fatal abort of the compiler. This PR switches to a has value
that is used instead.
---------
Co-authored-by: Michael Kruse <github@meinersbur.de>
Hi,
This patch implements support for the following directives :
- `!DIR$ NOUNROLL_AND_JAM` to disable unrolling and jamming on a DO
LOOP.
- `!DIR$ NOUNROLL` to disable unrolling on a DO LOOP.
- `!DIR$ NOVECTOR` to disable vectorization on a DO LOOP.
There is a problem with the current profitability check for
vectorization in LoopInterchange. There are both false positives and
false negatives. The former means that the heuristic may say that "an
exchange is necessary to vectorize the innermost loop" even though it's
already possible. The latter means that the heuristic may miss a case
where an exchange is necessary to vectorize the innermost loop. Note
that this is not a dependency analysis problem. This is caused by
incorrect handling of the dependency matrix in the profitability check,
so these problems can occur even if the analysis is accurate (no
overestimation).
This patch adds tests to clarify the cases that should be fixed. The
root cause of these cases is that the heuristic doesn't handle the
direction of a dependency correctly.
We had a weird, incorrect, "ConstraintEvaluator" object that was not
useful for anything, so I removed that.
I also changed the CheckConstraintSatisfaction overload that just took
an Expr* as this did not make much sense at all.
Satisfaction checking is still fairly wrong,
we do not follow the standard that requires we only substitute into the
mapping of the normal form, so we produce errors for incorrect
substitution into concepts id, even though we should not.
Split diagnosing base class qualifiers from the ``-Wignored-Qualifiers``
diagnostic group into a new ``-Wignored-base-class-qualifiers``
diagnostic group (which is grouped under ``-Wignored-qualifiers``).
Fixes#131935
C99 introduced macros of the form `INTn_C(v)` which expand to a signed
or unsigned integer constant with the specified value `v` and the type
`int_leastN_t`. Clang's initial implementation of these macros used
token pasting to form the integer constants, but this means that users
cannot define a macro named `L`, `U`, `UL`, etc before including
`<stdint.h>` (in freestanding mode, where Clang's header is being used)
because that could form invalid token pasting results. The new
definitions now use the predefined `__INTn_C` macros instead of using
token pasting. This matches the behavior of GCC.
Fixes#85995
This wasn't checking the output for all functions.
--match-full-lines is also particularly hazardous for the
interestingness checks for avoiding asserts and broken IR.
Also add tests for some of the filtered function user types.
This wasn't covered, and is overly conservative.
As noted in 1a9358c090d0507be21c5e9b2d97a23ef1de8ab0, some
simplifications can produce a redundant select where the true and false
operands are the same, which this patch removes.
The is_fpclass test was changed so the condition wasn't made dead.
The llvm.metadata section is not emitted and has special semantics. We
should not merge globals in it, similarly to how we already skip merging
of `llvm.xyz` globals.
Fixes https://github.com/llvm/llvm-project/issues/131394.
There is special logic to detect the hip runtime when llvm is installed
with Spack. It works by matching the install prefix of llvm against
`llvm-amdgpu-*` followed by effectively globbing for
```
<llvm dir>/../hip-x.y.z-*/
```
and checking there is exactly one such directory.
I would suggest to remove autodetection for the following reasons:
1. In the Spack ecosystem it's by design that every package lives in
its own prefix, and can only know where its dependencies are
installed, it has no clue what its dependents are and where they are
installed. This heuristic detection breaks that invariant, since `hip`
is a dependent of `llvm`, and can be surprising to Spack users.
2. The detection can lead to false positives, since users can be using
an llvm installed "upstream" with their own build of hip locally, and
they may not realize that clang is picking up upstream hip instead of
their local copy.
3. It only works if the directory name is `llvm-amdgpu-*` which happens
to be the name of AMD's fork of `llvm`, so it makes no sense that
this code lives in the main LLVM repo for which the Spack package
name is `llvm`. Feels wrong that LLVM knows about Spack package
names, which can change over time.
4. Users can change the install directory structure, meaning that this
detection is not robust under config changes in Spack.
During additional testing I spotted that vector deleting dtor calls
operator delete, not operator delete[] when performing array deletion.
This patch fixes that.
This PR starts the effort to upstream AMD's internal implementation of
`do concurrent` to OpenMP mapping. This replaces #77285 since we
extended this WIP quite a bit on our fork over the past year.
An important part of this PR is a document that describes the current
status downstream, the upstreaming status, and next steps to make this
pass much more useful.
In addition to this document, this PR also contains the skeleton of the
pass (no useful transformations are done yet) and some testing for the
added command line options.
This looks like a huge PR but a lot of the added stuff is documentation.
It is also worth noting that the downstream pass has been validated on
https://github.com/BerkeleyLab/fiats. For the CPU mapping, this achived
performance speed-ups that match pure OpenMP, for GPU mapping we are
still working on extending our support for implicit memory mapping and
locality specifiers.
PR stack:
- https://github.com/llvm/llvm-project/pull/126026 (this PR)
- https://github.com/llvm/llvm-project/pull/127595
- https://github.com/llvm/llvm-project/pull/127633
- https://github.com/llvm/llvm-project/pull/127634
- https://github.com/llvm/llvm-project/pull/127635
This patch adds support for memory allocation using hwloc. To enable
memory allocation using hwloc, env KMP_TOPOLOGY_METHOD=hwloc needs to be
used. If hwloc is not supported/available, allocation will fallback to
default path.
LoopInterchange has several heuristic functions to determine if
exchanging two loops is profitable or not. Whether or not to use each
heuristic and the order in which to use them were fixed, but #125830
allows them to be changed internally at will. This patch adds a new
option to control them via the compiler option.
The previous patch also added an option to prioritize the vectorization
heuristic. This patch also removes it to avoid conflicts between it and
the newly introduced one, e.g., both
`-loop-interchange-prioritize-vectorization=1` and
`-loop-interchange-profitabilities='cache,vectorization'` are specified.