The Pointer class already has the capability to be a function pointer,
but we still classifed function pointers as PT_FnPtr/FunctionPointer.
This means when converting from a Pointer to a FunctionPointer, we lost
the information of what the original Pointer pointed to.
When copying unions, we need to only copy the active field of the source
union, which we were already doing. However, we also need to zero out
the (now) inactive fields, so we don't end up with dangling pointers in
those inactive fields.
This patch adds `memory(argmem: read, inaccessiblemem: readwrite)
mustprogress` to **recoverable** ubsan handlers in order to unblock some
memory/loop optimizations. It provides an average of 3% performance
improvement on llvm-test-suite (except for 49 test failures due to ubsan
diagnostics).
Closes https://github.com/llvm/llvm-project/issues/130093.
https://github.com/llvm/llvm-project/pull/120464 (and earlier CLs) added -fsanitize-merge functionality, which is intended to work for all "sanitizers". It is nearly correct for CFI.
This patch precommits some tests for CFI, to track the progress of future -fsanitize-merge fixes for CFI.
Fixes#112267
Implement the shader flag analysis to set the UseNativeLowPrecision DXIL
module flag.
The flag is only able to be set when the command-line flag
`-enable-16bit-types` is passed to clang-dxc, or equivalently
`-fnative-half-type` is passed to clang.
When the command-line flag is passed, a module metadata flag called
"dx.nativelowprec" is set to 1.
The DXILShaderFlags shader flags analysis checks that the module
metadata flag "dx.nativelowprec" is set to 1 and the DXIL Version is 1.2
or greater before setting the UseNativeLowPrecision DXIL module flag.
This allows simplification of code that checks if a token is an
Objective-C keyword.
Also, delete the following in
UnwrappedLineParser::parseStructuralElement():
- an else-after-break in the tok::at case
- the copypasted code in the tok::objc_autoreleasepool case
A NestedNameSpecifier of TypeSpec kind can be non-dependent even if its
prefix is dependent, when for example the prefix is an injected class
type but the type itself is a simple alias to a non-dependent type.
This issue was a bit hard to observe because if it is an alias to a
class type, then we (for some unknown reason) ignored that the NNS was
dependent in the first place, which wouldn't happen with an enum type.
This could have been a workaround for previous dependency bugs, and is
not relevant anymore for any of the test cases in the tree, so this
patch also removes that.
The other kinds of dependencies are still relevant. If the prefix
contains an unexpanded pack, then this NNS is still unexpanded, and
likewise for errors.
This fixes a regression reported here:
https://github.com/llvm/llvm-project/pull/133610#issuecomment-2787909829
which was introduced by https://github.com/llvm/llvm-project/pull/133610
There are no release notes since the regression was never released.
Previous implementations that used the cir::LValue class omitted hanling
of the LValueBaseInfo class, which tracks information about the basis
for the LValue's alignment. As more code was upstreamed from the
incubator, we were accumulating technical debt by adding more places
where this wasn't handled correctly. This change puts the interfaces in
place to track this information.
The information being tracked isn't used yet, so no functional change is
intended. The tracking is being added now because it will become more
difficult to add it as more features are implemented.
Researching in prep of doing the implementation for lowering, I found
that the source of the valid identifiers list from flang is in the
frontend. This patch adds the same list to the frontend, but does it as
a sema diagnostic, so we still parse it as an identifier/identifier-like
thing, but then diagnose it as invalid later.
Static analysis flagged BindingDecl Decomp member as not being
initialized during construction or by any other member function.
Fix is add a default initializer for it like the other member Binding.
The single file parse mode is supposed to enter both branches of an
`#if` directive whenever the condition contains undefined identifiers.
This patch adds support for undefined function-like macros, where we
would previously emit an error that doesn't make sense from end-user
perspective.
(I discovered this while working on a very similar feature that parses
single module only and doesn't enter either `#if` branch when the
condition contains undefined identifiers.)
Closes#99135.
Tasks completed:
- Wrote implementation in `hlsl_intrinsics.h`/`hlsl_intrinsic_helpers.h`
- Added codegen tests to `clang/test/CodeGenHLSL/builtins/lit.hlsl`
This PR adds the support for treating capturing of "self" as safe if the
lambda simultaneously captures "protectedSelf", which is a RetainPtr of
"self".
This PR also fixes a bug that the checker wasn't generating a warning
when "self" is implicitly captured. Note when "self" is implicitly
captured, we use the lambda's getBeginLoc as a fallback source location.
fixes: https://github.com/llvm/llvm-project/issues/99108
Implement dst algorithm in the hlsl_intrinsics.h and added test cases
for HLSL codegen and sema
- [x] implement dst algorithm in the hlsl_intrinsics.h
- [x] Add HLSL codegen tests to clang/test/CodeGenHLSL/builtins/dst.hlsl
- [x] Add sema tests to clang/test/SemaHLSL/BuiltIns/dst-errors.hlsl
These two are very simple. they don't require any clauses and don't
have an associated statement, so they have very simple output. This
patch implements them, but none of the associated clauses.
This patch does the lowering of the OpenACC 'data' construct, which
requires getting the `default` clause (as `data` requires at least 1 of
a list of clauses, and this is the easiest one). The lowering of the
clauses appears to happen in 1 of 2 ways: a- as an operand. or b- as an
attribute.
This patch adds infrastructure to lower as an attribute, as that is how
'data' works.
In addition to that, it changes the OpenACCClauseVisitor a bit, which
previously just required that each of the derived classes have all of
the clauses covered. This patch modifies it so that the visitor directly
calls the derived class from its visitor function, which leaves the
base-class ones the ability to defer to a generic function. This was
previously like this because I had some use cases that I didn't end up
using, and the 'generic' function here seems much more useful.
This PR adds the support for recognizing the return value of copy /
mutableCopy as +1. isAllocInit and isOwned now traverses
PseudoObjectExpr to its last semantic expression.
We were passing the address of a local variable to a call to Diag()
which then tried to use the object after its lifetime ended, resulting
in crashes. We no longer pass the temporary object any longer.
Fixes#26612
SPIR-V has strict address space rules, constant globals cannot be in the
default address space.
The OMPIRBuilder change was required for lit tests to pass, we were
missing an addrspacecast.
---------
Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
Currently HIP still uses offload bundler for non-rdc mode for the new
offload driver.
This patch switches to use offload wrapper for non-device-only non-rdc
mode when new offload driver is enabled.
This makes the rdc and non-rdc compilation more consistent and speeds up
compilation since the offload wrapper supports parallel compilation for
different GPU arch's.
It is implemented by adding a linker wrapper action for each assemble
action of input file. Linker wrapper action differentiates this special
type of work vs normal linker wrapper work by the fle type. This type of
work results in object instead of image. The linker wrapper adds "-r"
for it and only includes the object file as input, not the host
libraries.
For device-only non-RDC mode, the new driver keeps the original
behavior.
This patch adds some lowering code for Compute Constructs, plus the
infrastructure to someday do clauses. Doing this requires adding the
dialect to the CIRGenerator.
This patch does not however implement/correctly initialize lowering from
OpenACC-Dialect to anything lower however.
When a module is being scanned, it can depend on modules that have
already been built from a pch dependency. When this happens, the pcm
files are reused for the module dependencies. When this is the case,
check if input files recorded from the PCMs come from the provided
stable directories transitively since the scanner will not have access
to the full set of file dependencies from prebuilt modules.
The handling for NullStmt was going to an error saying the statement
handling wasn't implemented. It doesn't need any implementation. It is
sufficient for emitSimpleStmt to just return success for that statement
class. This change does that.
Fixes#21650
---
Clang currently inserts an implicit `return 0;` in `main()` when
compiling in `C89` mode, even though the `C89` standard doesn't require
this behavior. This patch changes that behavior by emitting a warning
instead of silently inserting the implicit return under `-pedantic`.
Attr.td names the first alloc_size argument "ElemSizeParam" and the
second optional argument "NumElemsParam"; but the semantics of how the
two-argument version is used in practice is the opposite of that.
glibc declares calloc() like this, so the second alloc_size argument is
the element size:
```
extern void *calloc (size_t __nmemb, size_t __size)
__THROW __attribute_malloc__ __attribute_alloc_size__ ((1, 2)) __wur;
```
The Linux kernel declares array allocation functions like
`kmalloc_array_noprof()` the same way.
Add a comment explaining that the names used inside clang are
misleading.
Clang was previously accepting invalid code like:
```
#if 1 ? 1 : 999999999999999999999
#endif
```
because the integer constant (which is too large to fit into any
standard or extended integer type) was in an unevaluated branch of the
conditional operator. Similar invalid code involving || or && was also
accepted and is now rejected.
Fixes#134658
This adds support for handling the address of and dereference unary
operations in ClangIR code generation. This also adds handling for
nullptr and proper initialization via the NullToPointer cast.
This is the first of a few patches that will do infrastructure work to
enable the OpenACC lowering via the OpenACC dialect.
At the moment this just gets the various function calls that will end up
generating OpenACC, plus some tests to validate that we're doing the
diagnostics in OpenACC specific locations.
Additionally, this adds Stmt and Decl files for CIRGen.
This patch adds support for comparison operators with ClangIR, both
integral and floating point.
---------
Co-authored-by: Morris Hafner <mhafner@nvidia.com>
Co-authored-by: Henrich Lauko <xlauko@mail.muni.cz>
Co-authored-by: Andy Kaylor <akaylor@nvidia.com>
The component diagnostic headers (i.e. `DiagnosticAST.h` and friends)
all follow the same format, and there’s enough of them (and in them) to
where updating all of them has become rather tedious (at least it was
for me while working on #132348), so this patch instead generates all of
them (or rather their contents) via Tablegen.
Also, it seems that `%enum_select` currently wouldn’t work in
`DiagnosticCommonKinds.td` because the infrastructure for that was
missing from `DiagnosticIDs.h`; this patch should fix that as well.
#130963 switches the default to COV6, which requires ROCm 6.3.
Currently, if the
device libraries for COV6 are not found, the error message is not very
helpful.
This PR provides a more informative error message in such cases.
Summary:
These two tools do the same thing, we should unify them into a single
tool. We create symlinks for backward compatiblity and provide a way to
get the old vendor specific behavior with `--amdgpu-only` and
`--nvptx-only`.
Patches in the Key Instructions (KeyInstr) stack need to access CGF in these
functions. 2 CGF fields are passed to these functions already; at this point it
felt natural to promote them to CGF methods.
This feature is currently not supported in the compiler.
To facilitate this we emit a stub version of each kernel
function body with different name mangling scheme, and
replaces the respective kernel call-sites appropriately.
Fixes https://github.com/llvm/llvm-project/issues/60313
D120566 was an earlier attempt made to upstream a solution
for this issue.
---------
Co-authored-by: anikelal <anikelal@amd.com>