To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit
broke msan bots because LValue::IsKnownNonNull was uninitialized.
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.
This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.
In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
The old logic expects the call to be the last thing we emitted, and
since it kicks in before we emit cleanups, and since `swiftasynccall`
functions always return void, that's likely to be true. "Likely" isn't
very reassuring when we're talking about slapping attributes on random
calls, though. And indeed, while I can't find any way to break the logic
directly in current main, our previous (ongoing?) experiments with
shortening argument temporary lifetimes definitely broke it wide open.
So while this commit is prophylactic for now, it's clearly the right
thing to do, and it can cherry-picked to other branches to fix problems.
This implements the C++23 `[[assume]]` attribute.
Assumption information is lowered to a call to `@llvm.assume`, unless the expression has side-effects, in which case it is discarded and a warning is issued to tell the user that the assumption doesn’t do anything. A failed assumption at compile time is an error (unless we are in `MSVCCompat` mode, in which case we don’t check assumptions at compile time).
Due to performance regressions in LLVM, assumptions can be disabled with the `-fno-assumptions` flag. With it, assumptions will still be parsed and checked, but no calls to `@llvm.assume` will be emitted and assumptions will not be checked at compile time.
This patch inserts 1-byte counters instead of an 8-byte counters into
llvm profiles for source-based code coverage. The origial idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4
The current 8-byte counters mechanism add counters to minimal regions,
and infer the counters in the remaining regions via adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.
RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
'serial', 'parallel', and 'kernel' constructs are all considered
'Compute' constructs. This patch creates the AST type, plus the required
infrastructure for such a type, plus some base types that will be useful
in the future for breaking this up.
The only difference between the three is the 'kind'( plus some minor
clause legalization rules, but those can be differentiated easily
enough), so rather than representing them as separate AST nodes, it
seems
to make sense to make them the same.
Additionally, no clause AST functionality is being implemented yet, as
that fits better in a separate patch, and this is enough to get the
'naked' constructs implemented.
This is otherwise an 'NFC' patch, as it doesn't alter execution at all,
so there aren't any tests. I did this to break up the review workload
and to get feedback on the layout.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
Hello!
This PR fixes#63871. Clang should no longer crash and instead emits an
error message.
Below is an example of the new error message:
```
~/dev/fork-llvm-project omp_dispatch_unimpl
❯ ./install/bin/clang -fopenmp -c -emit-llvm -Xclang -disable-llvm-passes test.c
test.c:6:5: error: cannot compile this OpenMP dispatch directive yet
6 | #pragma omp dispatch
| ^~~~~~~~~~~~~~~~~~~~
1 error generated.
```
This patch adds the CodeGen changes needed for enabling HIP parallel algorithm offload on AMDGPU targets. This change relaxes restrictions on what gets emitted on the device path, when compiling in `hipstdpar` mode:
1. Unless a function is explicitly marked `__host__`, it will get emitted, whereas before only `__device__` and `__global__` functions would be emitted;
2. Unsupported builtins are ignored as opposed to being marked as an error, as the decision on their validity is deferred to the `hipstdpar` specific code selection pass;
3. We add a `hipstdpar` specific pass to the opt pipeline, independent of optimisation level:
- When compiling for the host, iff the user requested it via the `--hipstdpar-interpose-alloc` flag, we add a pass which replaces canonical allocation / deallocation functions with accelerator aware equivalents.
A test to validate that unannotated functions get correctly emitted is added as well.
Reviewed by: yaxunl, efriedma
Differential Revision: https://reviews.llvm.org/D155850
structured-block
where clause is one of the following:
private(list)
reduction([reduction-modifier ,] reduction-identifier : list)
nowait
Differential Revision: https://reviews.llvm.org/D157933
The loop directive is a descriptive construct which allows the compiler
flexibility in how it generates code for the directive's associated
loop(s). See OpenMP specification 5.2 [257:8-9].
Codegen added in this patch for the combined 'loop' directives are:
'target teams loop' -> 'target teams distribute parallel for'
'teams loop' -> 'teams distribute parallel for'
'target parallel loop' -> 'target parallel for'
'parallel loop' -> 'parallel for'
NOTE: The implementation of the 'loop' directive itself is unchanged.
Differential Revision: https://reviews.llvm.org/D145823
* Add `Address::withElementType()` as a replacement for
`CGBuilderTy::CreateElementBitCast`.
* Partial progress towards replacing `CreateElementBitCast`, as it no
longer does what its name suggests. Either replace its uses with
`Address::withElementType()`, or remove them if no longer needed.
* Remove unused parameter 'Name' of `CreateElementBitCast`
Reviewed By: barannikov88, nikic
Differential Revision: https://reviews.llvm.org/D153196
Change the return type of `getClobbers` function from `const char*`
to `std::string_view`. Update the function usages in CodeGen module.
The reasoning of these changes is to remove unsafe `const char*`
strings and prevent unnecessary allocations for constructing the
`std::string` in usages of `getClobbers()` function.
Differential Revision: https://reviews.llvm.org/D148799
Code generation support for 'parallel masked' directive.
The `EmitOMPParallelMaskedDirective` was implemented.
In addition, the appropiate device functions were added.
Fix#59939.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D143527
Initial support for asm goto w/ outputs (D69876) only supported outputs
along the "default" (aka "fallthrough") edge.
We can support outputs along all edges by repeating the same pattern of
stores along the indirect edges that we allready do for the default
edge. One complication is that these indirect edges may be critical
edges which would need to be split. Another issue is that mid-codgen of
LLVM IR, the control flow graph might not reflect the control flow of
the final function.
To avoid this "chicken and the egg" problem assume that any given
indirect edge may become a critical edge, and pro-actively split it.
This is unnecessary if the edge does not become critical, but LLVM will
optimize such cases via tail duplication.
Fixes: https://github.com/llvm/llvm-project/issues/53562
Reviewed By: void
Differential Revision: https://reviews.llvm.org/D136497
Prerequisite to further modifications in D136497.
Basically, there is a large body of code in CodeGenFunction::EmitAsmStmt
for emitting stores of outputs. We want to be able to repeat this logic,
for each destination of a callbr (rather than just the default
destination which is what the code currently does).
Also does some smaller cleanups like whitespace cleanups, and removing
pointless casts.
Reviewed By: void, jyknight
Differential Revision: https://reviews.llvm.org/D137113
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
Generally, with PGO enabled the C++20 likelyhood attributes shall be
dropped assuming the profile has a good coverage. However, currently
this is not the case for the following code:
if (always_false()) [[likely]] {
...
}
The patch fixes this and drops the attribute, if the parent context was
executed in the profile. The patch still preserves the attribute, if the
parent context was not executed, e.g. to support the cases when the
profile has insufficient coverage.
Differential Revision: https://reviews.llvm.org/D134456
Currently, clang does not emit debuginfo for the switch stmt
case value if it is an enum value. For example,
$ cat test.c
enum { AA = 1, BB = 2 };
int func1(int a) {
switch(a) {
case AA: return 10;
case BB: return 11;
default: break;
}
return 0;
}
$ llvm-dwarfdump test.o | grep AA
$
Note that gcc does emit debuginfo for the same test case.
This patch added such a support with similar implementation
to CodeGenFunction::EmitDeclRefExprDbgValue(). With this patch,
$ clang -g -c test.c
$ llvm-dwarfdump test.o | grep AA
DW_AT_name ("AA")
$
Differential Revision: https://reviews.llvm.org/D134705
GCC inline asm document says that
"... the general rule is that the output variable must be a scalar
integer, and the value is boolean."
Commit e5c37958f901cc9bec50624dbee85d40143e4bca lowers flag output of
inline asm on X86 with setcc, hence it is guaranteed that the flag
is of boolean value. Clang does not support ARM inline asm flag output
yet so nothing need to be worried about ARM.
See "Flag Output" section at
https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#OutputOperands
Fixes https://github.com/llvm/llvm-project/issues/56568
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D129954
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:
; Before:
%res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
to label %asm.fallthrough [label %foo]
; After:
%res = callbr i8* asm "", "=r,r,!i"(i8* %x)
to label %asm.fallthrough [label %foo]
The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:
* Allow unrolling/peeling/rotation of callbr, or any other
clone-based optimizations
(https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
(https://github.com/llvm/llvm-project/issues/45248)
This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.
Differential Revision: https://reviews.llvm.org/D129288
This patch gives basic parsing and semantic support for
"parallel masked taskloop simd" construct introduced in
OpenMP 5.1 (section 2.16.10)
Differential Revision: https://reviews.llvm.org/D128946
This patch gives basic parsing and semantic support for
"parallel masked taskloop" construct introduced in
OpenMP 5.1 (section 2.16.9)
Differential Revision: https://reviews.llvm.org/D128834
This patch gives basic parsing and semantic support for
"masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.8)
Differential Revision: https://reviews.llvm.org/D128693
This patch gives basic parsing and semantic support for "masked taskloop"
construct introduced in OpenMP 5.1 (section 2.16.7)
Differential Revision: https://reviews.llvm.org/D128478
Instead, just pop the cleanups at the end of the asm statement.
This fixes an assertion failure in BuildStmtExpr. It also fixes a bug
where blocks and C compound literals were destructed at the end of the
asm statement instead of at the end of the enclosing scope.
Differential Revision: https://reviews.llvm.org/D125936
Adds basic parsing/sema/serialization support for the
#pragma omp target parallel loop directive.
Differential Revision: https://reviews.llvm.org/D122359
GCC supports power-of-2 size structures for the arguments. Clang supports fewer than GCC. But Clang always crashes for the unsupported cases.
This patch adds sema checks to do the diagnosts to solve these crashes.
Reviewed By: jyu2
Differential Revision: https://reviews.llvm.org/D107141
Motivation:
```
int test(int x, int y) {
int r = 0;
[[clang::always_inline]] r += foo(x, y); // force compiler to inline this function here
return r;
}
```
In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others).
This patch solves this problem with call site attribute. "noinline" statement attribute already landed in D119061. Also, some LLVM Inliner fixes landed so call site attribute is stronger than function attribute.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D120717
Clang is crashing on the following statement
char var[9];
__asm__ ("" : "=r" (var) : "0" (var));
This is similar to existing test: crbug_999160_regtest
The issue happens when EmitAsmStmt is trying to convert input to match
output type length. However, that is not guaranteed to be successful all the
time and if the statement itself is invalid like having an array type in
the example, we should give a regular error message here instead of
using assert().
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D120596
Motivation:
```
int foo(int x, int y) { // any compiler will happily inline this function
return x / y;
}
int test(int x, int y) {
int r = 0;
[[clang::noinline]] r += foo(x, y); // for some reason we don't want any inlining here
return r;
}
```
In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others).
This patch solves this problem with call site attribute. The implementation is "smaller" wrt approach which uses new intrinsics and thanks to https://reviews.llvm.org/D79121 (Add nomerge statement attribute to clang), we have got some basic infrastructure to deal with attrs on statements with call expressions.
GCC devs are more inclined to call attribute solution as well, as builtins are problematic for them - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104187. But they have no patch proposal yet so.. We have free hands here.
If this approach makes sense, next future steps would be support for call site attributes for always_inline / flatten.
Reviewed By: aaron.ballman, kuhar
Differential Revision: https://reviews.llvm.org/D119061
I noticed that the following case would compile in Clang but not GCC:
void *x(void) {
void *p = &&foo;
asm goto ("# %0\n\t# %l1":"+r"(p):::foo);
foo:;
return p;
}
Changing the output template above from %l2 would compile in GCC but not
Clang.
This demonstrates that when using tied outputs (say via the "+r" output
constraint), the hidden inputs occur or are numbered BEFORE the labels,
at least with GCC.
In fact, GCC does denote this in its documentation:
https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Extended-Asm.html#Goto-Labels
> Output operand with constraint modifier ‘+’ is counted as two operands
> because it is considered as one output and one input operand.
For the sake of compatibility, I think it's worthwhile to just make this
change.
It's better to use symbolic names for compatibility (especially now
between released version of Clang that support asm goto with outputs).
ie. %l1 from the above would be %l[foo]. The GCC docs also make this
recommendation.
Also, I cleaned up some cruft in GCCAsmStmt::getNamedOperand. AFAICT,
NumPlusOperands was no longer used, though I couldn't find which commit
didn't clean that up correctly.
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98096
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103640
Link: https://gcc.gnu.org/onlinedocs/gcc-11.2.0/gcc/Extended-Asm.html#Goto-Labels
Reviewed By: void
Differential Revision: https://reviews.llvm.org/D115471