See test comment for possible future improvement.
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.
The Key Instructions project is introduced, including a "quick summary" section
at the top which adds context for this PR, here:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
The Clang-side work is demoed here:
https://github.com/llvm/llvm-project/pull/130943
See test comment for possible future improvement.
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.
The Key Instructions project is introduced, including a "quick summary" section
at the top which adds context for this PR, here:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
The Clang-side work is demoed here:
https://github.com/llvm/llvm-project/pull/130943
This needs to be driver level to pass an -mllvm flag to LLVM.
Keep the flag help-hidden as the feature is under development.
---
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.
The Key Instructions project is introduced, including a "quick summary" section
at the top which adds context for this PR, here:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668
The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
The Clang-side work is demoed here:
https://github.com/llvm/llvm-project/pull/130943
In the wake of discussion in PR #131200 and internal discussion after,
we will add support for `LLVM_ENABLE_PER_TARGET_RUNTIME=ON` for AIX
instead of disable it. I already reverted the change in PR #131200.
The default value of the option is still OFF on AIX.
Summary:
When we were first porting to COV5, this lead to some ABI issues due to
a change in how we looked up the work group size. Bitcode libraries
relied on the builtins to emit code, but this was changed between
versions. This prevented the bitcode libraries, like OpenMP or libc,
from being used for both COV4 and COV5. The solution was to have this
'none' functionality which effectively emitted code that branched off of
a global to resolve to either version.
This isn't a great solution because it forced every TU to have this
variable in it. The patch in
https://github.com/llvm/llvm-project/pull/131033 removed support for
COV4 from OpenMP, which was the only consumer of this functionality.
Other users like HIP and OpenCL did not use this because they linked the
ROCm Device Library directly which has its own handling (The name was
borrowed from it after all).
So, now that we don't need to worry about backward compatibility with
COV4, we can remove this special handling. Users can still emit COV4
code, this simply removes the special handling used to make the OpenMP
device runtime bitcode version agnostic.
When retrieving the ExplicitSpecifier from a CXXConstructorDecl, one of
its canonical declarations is returned. To correctly write the
declaration record the ExplicitSpecifier of the current declaration must
be used.
Failing to do so results in a crash during deserialization.
Previously `optin.taint.GenericTaint` misinterpreted the parameter
indices and produced false positives in situations when a [format
attribute](https://clang.llvm.org/docs/AttributeReference.html#format)
is applied on a non-static method. This commit fixes this bug
Relands #132907 with a fix in the testcase:
clang/test/CodeGen/Mips/subtarget-feature-test.c
enable this test for only mips64 target
PR #130587 defined same SubTargetFeature for CPUs i6400 and i6500 which
resulted into following warning when -mcpu=i6500 was used:
+i6500' is not a recognized feature for this target (ignoring feature)
This PR fixes above issue by defining separate SubTargetFeature for
i6500.
Fixes a regression introduced in
https://github.com/llvm/llvm-project/pull/130537 and reported here
https://github.com/llvm/llvm-project/issues/133144
This fixes a crash in ASTStructuralEquivalence where the non-null
precondition for IsStructurallyEquivalent would be violated, when
comparing member pointers with a dependent class.
This also drive-by fixes the ast node traverser for member pointers so
it doesn't traverse into the qualifier in case it's not a type, or the
class declaration in case it would be equivalent to what the qualifier
refers.
This avoids printing of `<<<NULL>>>` on the text node dumper, which is
redundant.
No release notes since the regression was never released.
Fixes https://github.com/llvm/llvm-project/issues/133144
Summary:
The https://github.com/llvm/llvm-project/pull/128509 patch introduced
`--flto-partitions`. This was marked as a HIP only argument, and was
also spelled and handled incorrectly for an `-f` option. This patch
makes the handling generic for `ld.lld` consumers.
This also fixes some issues with emitting the flags being put after the
default arguments, preventing users from overriding them. Also, forwards
things properly for the new driver so we can test this.
Commit 20b7f5982622f includes a case that checks diagnostics for for
loops using thread locals.
This fails on platforms which do not support TLS.
This change adds guards to run this part of the test iff the feature is
supported.
The main goal of this patch is to improve the
performance of concept subsumption by
- Making sure literal (atomic) clauses are de-duplicated (Whether 2
atomic constraint is established during the initial normal form
production).
- Eagerly removing duplicated clauses.
This should minimize the risks of exponentially large formulas that can
be produced by a naive {C,D}NF transformation.
While at it, I restructured that part of the code to be a bit clearer.
Subsumption of fold expanded constraint is also cached.
---
Note that removing duplicated clauses seems to be necessary and
sufficient to have acceptable performance on anything that could be
construed as reasonable code.
Ultimately, the number of clauses is always going to be fairly small
(but $2^{fairly\ small}$ is quickly *fairly large*..).
I went too far in the rabbit hole of Tseitin transformations etc, which
was much faster but would then require to check satisfiabiliy to
establish subsumption between some constraints (although it was good
enough to pass all but ones of our tests...).
It doesn't help that the C++ standard has a very specific definition of
subsumption that is really more of an implication...
While that sort of musing is fascinating, it was ultimately a fool's
errand, at least until such time that there is more motivation for a SAT
solver in clang (clang-tidy can after all use z3!).
Here be dragons.
Fixes#122581
C2y adds the `_Countof` operator which returns the number of elements in
an array. As with `sizeof`, `_Countof` either accepts a parenthesized
type name or an expression. Its operand must be (of) an array type. When
passed a constant-size array operand, the operator is a constant
expression which is valid for use as an integer constant expression.
This is being exposed as an extension in earlier C language modes, but
not in C++. C++ already has `std::extent` and `std::size` to cover these
needs, so the operator doesn't seem to get the user enough benefit to
warrant carrying this as an extension.
Fixes#102836
Add int overloads which cast the various ints to a float and call the
float builtin.
These overloads are conditional on hlsl version 202x or earlier.
Add tests and puts tests in own files, including some of the tests added
for double overloads.
Closes#128229
Summary:
The default behavior for LTO on other targets does not specify the
number of LTO partitions. Recent changes made this default to 8 on
AMDGPU which had some issues with the `libc` project. The option to
disable this is HIP only so I think for now we should restrict this just
to HIP.
I'm definitely on board with getting some more parallelism here, but I
think it should probably be restricted to just offloading languages. The
new driver goes through the `--target=amdgcn-amd-amdhsa` for its output,
which means we'd need to forward the default somehow.
The instantiation of a VarDecl's initializer might be deferred until the
variable is actually used. However, we were still building the
DeclRefExpr with a type that could later be changed by the initializer's
instantiation, which is incorrect when incomplete arrays are involved.
Fixes#79750Fixes#113936Fixes#133047
When pragma of loop transformations is specified, follow-up metadata for
loops is generated after each transformation. On the LLVM side,
follow-up metadata is expected to be a list of properties, such as the
following:
```
!followup = !{!"llvm.loop.vectorize.followup_all", !mp, !isvectorized}
!mp = !{!"llvm.loop.mustprogress"}
!isvectorized = !{"llvm.loop.isvectorized"}
```
However, on the clang side, the generated metadata contains an MDNode
that has those properties, as shown below:
```
!followup = !{!"llvm.loop.vectorize.followup_all", !loop_id}
!loop_id = distinct !{!loop_id, !mp, !isvectorized}
!mp = !{!"llvm.loop.mustprogress"}
!isvectorized = !{"llvm.loop.isvectorized"}
```
According to the
[LangRef](https://llvm.org/docs/TransformMetadata.html#transformation-metadata-structure),
the LLVM side is correct. Due to this inconsistency, follow-up metadata
was not interpreted correctly, e.g., only one transformation is applied
when multiple pragmas are used.
This patch fixes clang side to emit followup metadata in correct format.
PR #130587 defined same SubTargetFeature for CPUs i6400 and i6500 which
resulted into following warning when -mcpu=i6500 was used:
+i6500' is not a recognized feature for this target (ignoring feature)
This PR fixes above issue by defining separate SubTargetFeature for
i6500.
This PR fixes two crashes in ExtractAPI that occur when decls are
requested via libclang:
- A null-dereference would sometimes happen in
`DeclarationFragmentsBuilder::getFragmentsForClassTemplateSpecialization`
when the template being processed was loaded indirectly via a typedef,
with parameters filled in. The first commit loads the template parameter
locations ahead of time to perform a null check before dereferencing.
- An assertion (or another null-dereference) was happening in
`CXXRecordDecl::bases` when processing a forward-declaration (i.e. a
record without a definition). The second commit guards the use of
`bases` in `ExtractAPIVisitorBase::getBases` by first checking that the
decl in question has a complete definition.
The added test `extract-api-cursor-cpp` adds tests for these two
scenarios to protect against the crash in the future.
Fixes rdar://140592475, fixes rdar://123430367
Lowering of int-to-bool casts had been left as NYI because the incubator
implemented it by lowering to cir.cmp, which hasn't been upstreamed yet,
but there is no reason this cast can't be lowered directly to LLVM's
compare operation when we're lowering directly to the LLVM dialect.
This change lowers the cast directly to an LLVM compare to zero.