This PR exposes `cloneForModuleCompile()` as a public `CompilerInstance`
member function. This will eventually be used in the dependency scanner
to customize implicit module builds.
Fixes what seems to be a buggy warning in MSVC:
```
[1/37] Building CXX object tools\clang\lib\Sema\CMakeFiles\obj.clangSema.dir\SemaConcept.cpp.obj
C:\git\llvm-project\clang\lib\Sema\SemaConcept.cpp(1933): warning C4101: '$S26': unreferenced local variable
```
Fixes:
```
[113/324] Building CXX object tools\clang\unittests\AST\ByteCode\CMakeFiles\InterpTests.dir\BitcastBuffer.cpp.obj
C:\git\llvm-project\clang\unittests\AST\ByteCode\BitcastBuffer.cpp(52): warning C4309: 'initializing': truncation of constant value
C:\git\llvm-project\clang\unittests\AST\ByteCode\BitcastBuffer.cpp(53): warning C4309: 'initializing': truncation of constant value
```
This fixes a PCM non-determinism regression reported here:
https://github.com/llvm/llvm-project/pull/134560#issuecomment-2797744370
There was a bit in `SubstNonTypeTemplateParmPackExpr` which we failed to
serialize, and that bit eventually propagates to
`SubstNonTypeTemplateParmExpr`.
As a drive by, improve serialization for PackIndex on
SubstNonTypeTemplateParmExpr by using the newly introduced
UnsignedOrNone helpers.
There are no release notes since this regression was never released.
The calleeDecl var will be used in the near future, so I left it. At
least for clang, the [[maybe_unused]] attribute takes care of the
warnings related to that variable. The other warning was a simple lack
of return after errorNYI.
This patch adds `VisitBinAssign` and `VisitBinComma` to the ClangIR
`ScalarExprEmitter` to enable assignments and the comma operator.
---------
Co-authored-by: Morris Hafner <mhafner@nvidia.com>
`Sema::getCurFunctionDecl(AllowLambda = false)` returns a nullptr when
the lambda declaration is outside a function (for example, when
assigning a lambda to a static constexpr variable).
This triggered an assertion in
`SemaAMDGPU::CheckAMDGCNBuiltinFunctionCall`.
Using `Sema::getCurFunctionDecl(AllowLambda = true)` returns the
declaration of the enclosing lambda.
Stumbled upon this issue when refactoring some code in CK.
This PR extracts the creation of `CompilerInstance` for compiling an
implicitly-discovered module out of `compileModuleImpl()` into its own
separate function and passes it into `compileModuleImpl()` from the
outside. This makes the instance creation logic reusable (useful for my
experiments) and also simplifies the API, removing the `PreBuildStep`
and `PostBuildStep` hooks from `compileModuleImpl()`.
Static analysis flagged `1 - ArgIdx` in `Sema::AddOverloadCandidate` for
its potential to overflow.
This turns out to be intentional: when `PO ==
OverloadCandidateParamOrder::Reversed`, `Args.size()` is always two, so
this will never overflow.
We document this using an assert.
Fixes: https://github.com/llvm/llvm-project/issues/135086
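A minimal sketch of the reasoning, not the actual Sema code: the enum `ParamOrder` and the helper `paramIndexFor` are hypothetical stand-ins chosen for illustration. It shows why `1 - ArgIdx` cannot wrap around once the two-argument precondition is asserted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for OverloadCandidateParamOrder.
enum class ParamOrder { Normal, Reversed };

// Maps an argument index to a parameter index. With reversed order the
// expression `1 - ArgIdx` is only meaningful for exactly two arguments,
// so the precondition is documented with an assert.
inline std::size_t paramIndexFor(std::size_t ArgIdx, std::size_t NumArgs,
                                 ParamOrder PO) {
  if (PO == ParamOrder::Reversed) {
    assert(NumArgs == 2 && ArgIdx < 2 &&
           "reversed candidates always have exactly two arguments");
    return 1 - ArgIdx; // 0 -> 1 and 1 -> 0; the unsigned value cannot wrap.
  }
  return ArgIdx;
}
```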
With implicitly-built modules, seeing something like:
```
fatal error: module 'X' is defined in both '<cache>/HASH1/X-HASH2.pcm' and '<cache>/HASH1/X-HASH3.pcm'
```
is super confusing and not actionable, because the module cache tends to
be hidden from the developer.
This PR adds a note to that diagnostic that names the module map files
the PCM files were compiled from, hopefully giving a good enough hint
for further investigation:
```
note: compiled from '<build>/X.framework/Modules/module.modulemap' and '<SDK>/X.framework/Modules/module.modulemap'
```
(I had to replace the mechanism used to convert `DiagnosticError` into
something `DiagnosticsEngine` can understand, because it seemingly did
not support notes.)
Currently rocm-device-lib-path is not enabled for Flang. When the
compiler cannot find ROCm, it warns and asks the user to provide this
option, but the user cannot actually set the device libraries with
rocm-device-lib-path. The alternative rocm_path, which the warning also
mentions, can be used, but we should enable both options to avoid
confusing users (and myself).
One was "unsafe use of bool" and the other was "sign comparison
mismatch", and both were because we're treating a bool object as if it
were an unsigned int. Add a cast to make that more explicit.
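As a rough illustration only (the function and parameter names below are hypothetical, not taken from the patch), an explicit cast makes it clear that a `bool` contributes 0 or 1 to unsigned arithmetic:

```cpp
#include <cstddef>

// Hypothetical example: a bool is added to a count and the sum is compared
// against another unsigned value.
inline bool hasEnoughParams(std::size_t NumParams, bool HasImplicitObject,
                            std::size_t NumArgs) {
  // The cast states explicitly that the bool is used as the integer 0 or 1,
  // so both operands of the comparison are plainly unsigned.
  return NumParams + static_cast<unsigned>(HasImplicitObject) >= NumArgs;
}
```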
This PR implements a CC1 flag `-dump-minimization-hints`.
The flag takes a file path to which the ranges of deserialized
declarations in `ASTReader` are dumped. Example usage:
```
clang -Xclang=-dump-minimization-hints=/tmp/decls -c file.cc -o file.o
```
Example output:
```
// /tmp/decls
{
  "required_ranges": [
    {
      "file": "foo.h",
      "range": [
        {
          "from": {
            "line": 26,
            "column": 1
          },
          "to": {
            "line": 27,
            "column": 77
          }
        }
      ]
    },
    {
      "file": "bar.h",
      "range": [
        {
          "from": {
            "line": 30,
            "column": 1
          },
          "to": {
            "line": 35,
            "column": 1
          }
        },
        {
          "from": {
            "line": 92,
            "column": 1
          },
          "to": {
            "line": 95,
            "column": 1
          }
        }
      ]
    }
  ]
}
```
Specifying the flag creates an instance of
`DeserializedDeclsSourceRangePrinter`, which dumps ranges of deserialized
declarations to aid debugging and bug minimization (we use it as input to [C-Vise](https://github.com/emaxx-google/cvise/tree/multifile-hints)).
Required ranges are computed from source ranges of Decls.
`TranslationUnitDecl`, `LinkageSpecDecl` and `NamespaceDecl` are ignored
for the sake of this PR.
Technical details:
* `DeserializedDeclsSourceRangePrinter` implements `ASTConsumer` and
`ASTDeserializationListener`, so that an object of
`DeserializedDeclsSourceRangePrinter` registers as its own listener.
* `ASTDeserializationListener` interface provides the `DeclRead`
callback that we use to collect the deserialized Decls.
Printing or otherwise processing them at this point is dangerous, since
that could trigger additional deserialization and crash compilation.
* The collected Decls are processed in `HandleTranslationUnit` method of
`ASTConsumer`. This is a safe point, since we know that by this point
all the Decls needed by the compiler frontend have been deserialized.
* In case our processing causes further deserialization, `DeclRead` from
the listener might be called again. However, at that point we don't
accept any more Decls for processing.
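The collect-then-process pattern from the list above can be sketched roughly as follows. This is a standalone toy, not the Clang implementation: `FakeDecl`, `DeferredDeclCollector`, and its methods are hypothetical names that only mirror the roles of `DeclRead` and `HandleTranslationUnit`.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a deserialized declaration.
struct FakeDecl {
  std::string Name;
};

// Toy listener: records declarations as they arrive and processes them
// only once a safe point has been reached.
class DeferredDeclCollector {
public:
  // Plays the role of DeclRead: only remembers the pointer, never inspects
  // the declaration, so it cannot trigger further deserialization.
  void declRead(const FakeDecl *D) {
    if (Processing)
      return; // Declarations that arrive during processing are not accepted.
    Pending.push_back(D);
  }

  // Plays the role of HandleTranslationUnit: the safe point at which the
  // collected declarations are finally examined.
  std::vector<std::string> handleTranslationUnit() {
    Processing = true;
    std::vector<std::string> Names;
    for (const FakeDecl *D : Pending)
      Names.push_back(D->Name);
    return Names;
  }

  std::size_t pendingCount() const { return Pending.size(); }

private:
  std::vector<const FakeDecl *> Pending;
  bool Processing = false;
};
```

Dropping late arrivals keeps the collected list stable while it is being walked, which is the property the last bullet relies on.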
Fix comparing type id pointers, add more info when print()ing them, use
the most derived type in GetTypeidPtr() and the canonically unqualified
type when we know the type statically.
This is a follow-up to https://github.com/llvm/llvm-project/pull/131074.
After moving the default argument heuristic to `simplifyType` in that
patch, the heuristic no longer applied to the
`DependentScopeDeclRefExpr` case, because that wasn't using
`simplifyType`.
This patch fixes that, with an added testcase.
... to clarify ownership, aligning with other parameters. Using
`std::unique_ptr` encourages users to manage `createMCInstPrinter` with
a unique_ptr instead of a raw pointer, reducing the risk of memory
leaks.
* llvm-mc: fix a leak and update llvm/test/tools/llvm-mc/disassembler-options.test
* #121078 copied the llvm-mc code to CodeGenTargetMachineImpl and made
the same mistake. Fixed by 2b8cc651dca0c000ee18ec79bd5de4826156c9d6
Using unique_ptr requires #include MCInstPrinter.h in a few translation
units.
* Delete a createAsmStreamer overload I deprecated in 2024
* SystemZMCTargetDesc.cpp: rename to `createSystemZAsmStreamer` to fix
an overload conflict.
Pull Request: https://github.com/llvm/llvm-project/pull/135128
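To illustrate the ownership argument, here is a small sketch using toy types. `ToyPrinter`, `ToyStreamer`, and `createToyStreamer` are hypothetical and are not the LLVM MC API; the point is only that a `std::unique_ptr` parameter makes the transfer of ownership visible at the call site.

```cpp
#include <memory>
#include <utility>

// Counts live printers so that a leak would be observable.
inline int LivePrinters = 0;

struct ToyPrinter {
  ToyPrinter() { ++LivePrinters; }
  ~ToyPrinter() { --LivePrinters; }
  ToyPrinter(const ToyPrinter &) = delete;
  ToyPrinter &operator=(const ToyPrinter &) = delete;
};

// The streamer owns its printer; the signature says so.
class ToyStreamer {
public:
  explicit ToyStreamer(std::unique_ptr<ToyPrinter> P)
      : Printer(std::move(P)) {}
  bool hasPrinter() const { return Printer != nullptr; }

private:
  std::unique_ptr<ToyPrinter> Printer;
};

// Taking the printer by unique_ptr means the caller cannot forget to free
// it: ownership moves into the streamer and ends with it.
inline std::unique_ptr<ToyStreamer>
createToyStreamer(std::unique_ptr<ToyPrinter> P) {
  return std::make_unique<ToyStreamer>(std::move(P));
}
```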
The NVVMReflect pass simply replaces calls to nvvm-reflect functions
with the appropriate constant, either the architecture number, or
nvvm-reflect-ftz, found in the module's metadata.
The implementation is inefficient and does this by traversing through
all instructions to find calls. The common case is that you never call
nvvm-reflect, so this traversal is costly.
This PR:
- Updates the pass so that it finds the reflect functions by name, and
then traverses through their uses to find the calls directly.
- Adds a line (245) to make sure the dead nvvm-reflect definitions are
erased.
- Adds the ability to set reflect values via command line
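The lookup-by-name approach can be sketched with a toy data model. `ToyModule`, `ToyCallSite`, and `foldReflectCalls` are hypothetical and do not use the LLVM API; the sketch only shows why visiting the uses of one named function avoids scanning every instruction.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Toy call site: after folding it holds the constant that replaced the call.
struct ToyCallSite {
  int Value = 0;
  bool Folded = false;
};

// Toy module: maps each function name to the call sites that use it.
struct ToyModule {
  std::map<std::string, std::vector<ToyCallSite *>> Users;
};

// Looks the reflect function up by name and visits only its uses. When the
// function is absent (the common case) no traversal happens at all.
inline std::size_t foldReflectCalls(ToyModule &M, const std::string &Name,
                                    int Constant) {
  auto It = M.Users.find(Name);
  if (It == M.Users.end())
    return 0;
  std::size_t NumFolded = 0;
  for (ToyCallSite *CS : It->second) {
    CS->Value = Constant;
    CS->Folded = true;
    ++NumFolded;
  }
  M.Users.erase(It); // The now-dead definition is erased as well.
  return NumFolded;
}
```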
In the LLVM middle-end we want to fold `gep inbounds null, idx -> null`:
https://alive2.llvm.org/ce/z/5ZkPx-
This pattern is common in real-world programs
(https://github.com/dtcxzyw/llvm-opt-benchmark/pull/55#issuecomment-1870963906).
Generally, it exists in some (actually) unreachable blocks, which are
introduced by JumpThreading.
However, some old-style offsetof macros are still widely used in
real-world C/C++ code (e.g., hwloc/slurm/luajit). To avoid breaking
existing code and inconvenience to downstream users, this patch removes
the inbounds flag from the struct gep if the base pointer is null.
The 'set' lowering is pretty trivial. 'device_type' is a little more
restricted since both the MLIR-Dialect and language limit it to only 1
value (as confirmed by standards-discussion).
This patch implements 'set', with 'device_type', since 'set' requires at
least 1 clause, and this is the least difficult to implement at the
moment.
This is a basic implementation of P2719: "Type-aware allocation and
deallocation functions" described at http://wg21.link/P2719
The proposal includes some more details but the basic change in
functionality is the addition of support for an additional implicit
parameter in operators `new` and `delete` to act as a type tag. Tag is
of type `std::type_identity<T>` where T is the concrete type being
allocated. So for example, a custom type specific allocator for `int`
say can be provided by the declaration of
void *operator new(std::type_identity<int>, size_t, std::align_val_t);
void operator delete(std::type_identity<int>, void*, size_t, std::align_val_t);
However this becomes more powerful by specifying templated declarations,
for example
template <typename T> void *operator new(std::type_identity<T>, size_t, std::align_val_t);
template <typename T> void operator delete(std::type_identity<T>, void*, size_t, std::align_val_t);
Where the operators being resolved will be the concrete type being
operated over (NB. A completely unconstrained global definition as above
is not recommended as it triggers many problems similar to a general
override of the global operators).
These type aware operators can be declared as either free functions or
in class, and can be specified with or without the other implicit
parameters, with overload resolution performed according to the existing
standard parameter prioritisation, only with type parameterised
operators having higher precedence than non-type aware operators. The
only exception is destroying_delete, for which, for reasons discussed in
the paper, we do not support type-aware variants by default.
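The tag-based selection can be approximated with ordinary functions on a compiler without P2719 support. The sketch below is only an analogy: `TypeTag`, `pickAllocator`, and `allocatorFor` are hypothetical names, plain `std::size_t` stands in for `std::align_val_t`, and no real `operator new` is declared. It shows how a tag parameter lets a type-specific overload win over a generic template.

```cpp
#include <cstddef>

// Hypothetical stand-in for std::type_identity<T>.
template <typename T> struct TypeTag {
  using type = T;
};

enum class Picked { Generic, IntSpecific };

// Generic overload: matches any tagged type.
template <typename T>
Picked pickAllocator(TypeTag<T>, std::size_t /*Size*/,
                     std::size_t /*Align*/) {
  return Picked::Generic;
}

// Type-specific overload for int: a non-template function is preferred
// over the template when both match equally well.
inline Picked pickAllocator(TypeTag<int>, std::size_t /*Size*/,
                            std::size_t /*Align*/) {
  return Picked::IntSpecific;
}

// Mimics what the compiler would do implicitly: pass the tag for the
// concrete type being allocated, followed by its size and alignment.
template <typename T> Picked allocatorFor() {
  return pickAllocator(TypeTag<T>{}, sizeof(T), alignof(T));
}
```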
Fixes #112270
Completed ACs:
- `-res-may-alias` clang-dxc command-line option added
- It inserts and sets a module metadata flag `dx.resmayalias` to 1
- Shader flag set appropriately:
- The flag IS NOT set if DXIL Version <= 1.6 OR the command-line option
`-res-may-alias` is specified
- Otherwise the flag IS set when:
- DXIL Version > 1.7 AND function uses UAVs, OR
- DXIL Version <= 1.7 AND UAVs present globally
- Add tests
- Tests for Shader Models 6.6, 6.7, and 6.8 corresponding to DXIL
Versions 1.6, 1.7, and 1.8
- Tests (`res-may-alias-0.ll`/`res-may-alias-1.ll`) for when the module
metadata flag `dx.resmayalias` is set to 0 or 1 respectively
- A frontend test (`res-may-alias.hlsl`) for testing that the
command-line option `-res-may-alias` inserts `dx.resmayalias` module
metadata correctly
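The flag conditions listed above can be written down as a small decision function. This is a literal transcription of the bullets into a sketch, not the DirectX backend code; `DxilVersion`, `atMost`, and `shouldSetShaderFlag` are hypothetical names.

```cpp
// Hypothetical representation of a DXIL version such as 1.7.
struct DxilVersion {
  unsigned Major;
  unsigned Minor;
};

// True when V <= Major.Minor.
inline bool atMost(DxilVersion V, unsigned Major, unsigned Minor) {
  return V.Major < Major || (V.Major == Major && V.Minor <= Minor);
}

// Transcribes the conditions from the list above, checked in that order.
inline bool shouldSetShaderFlag(DxilVersion V, bool ResMayAliasOption,
                                bool FunctionUsesUAVs, bool ModuleHasUAVs) {
  // Never set for DXIL <= 1.6 or when -res-may-alias was given.
  if (atMost(V, 1, 6) || ResMayAliasOption)
    return false;
  // DXIL > 1.7: decided per function.
  if (!atMost(V, 1, 7))
    return FunctionUsesUAVs;
  // Remaining case (DXIL <= 1.7): decided by UAVs present globally.
  return ModuleHasUAVs;
}
```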
This PR fixes a bug where, when a template specialization is declared
with a forward declaration of a template, the checker fails to find its
definition in the same translation unit and erroneously emits an unsafe
forward declaration warning.
This PR adds the support for recognizing calling adoptCF/adoptNS on the
result of a cast operation on the return value of a function which
creates NS or CF types. It also fixes a bug that we weren't reporting
memory leaks when CF types are created without ever calling RetainPtr's
constructor, adoptCF, or adoptNS.
To do this, this PR adds a new mechanism to report a memory leak
whenever create or copy CF functions are invoked unless this CallExpr
has already been visited while validating a call to adoptCF. Also added
an early exit when isOwned returns IsOwnedResult::Skip due to an
unresolved template argument.
The recently announced IBM z17 processor implements the architecture
already supported as "arch15" in LLVM. This patch adds support for "z17"
as an alternate architecture name for arch15.
This patch also adds the scheduler description for the z17 processor,
provided by Jonas Paulsson.
Discussions with the OpenACC Standard folks and the dialect folks showed
that the ability to have 'set' have a 'device_type' with more than one
architecture was a mistake, and one that will be fixed in future
revisions of the standard. Since the dialect requires this anyway,
we'll implement this in advance of standardization.
We execute tests in a read-only environment, which leads to test failures
when tests try to write to the current directory. They should either
write to a temporary directory or not write at all if output is not needed.
Fallback from #134717
…utdown'
This patch emits the lowering for 'device_type' on an 'init' or
'shutdown'. This one is fairly unique, as these directives have it as an
attribute, rather than as a component of the individual operands, like
the rest of the constructs.
So this patch implements the lowering as an attribute.
In order to do this, a few refactorings had to happen: First, the
'emitOpenACCOp' functions needed to pick up the directive kind/location
so that the NYI diagnostic could be reasonable.
Second, and most impactful, the `applyAttributes` function ends up
needing to encode some of the appertainment rules, thanks to the way the
OpenACC-MLIR operands get their attributes attached. Since they each use
a special function (rather than something that can be legalized at
runtime), the forms of 'setDefaultAttr' are only valid for some ops. So
this patch uses some `if constexpr` and a small type-trait to help
legalize these.
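A rough sketch of the `if constexpr` plus type-trait technique, using toy types rather than the OpenACC MLIR operations: `InitLikeOp`, `OtherOp`, `HasDeviceTypeAttr`, and `applyDeviceType` are all hypothetical names for illustration.

```cpp
#include <string>
#include <type_traits>

// Hypothetical op that carries device_type as an attribute.
struct InitLikeOp {
  std::string DeviceTypeAttr;
  void setDeviceTypeAttr(const std::string &V) { DeviceTypeAttr = V; }
};

// Hypothetical op without such a setter.
struct OtherOp {};

// Small trait encoding which ops accept the attribute form.
template <typename OpTy> struct HasDeviceTypeAttr : std::false_type {};
template <> struct HasDeviceTypeAttr<InitLikeOp> : std::true_type {};

// The setter call is only instantiated for ops where it is valid, so the
// template still compiles for ops that lack the member function.
template <typename OpTy>
bool applyDeviceType(OpTy &Op, const std::string &Value) {
  if constexpr (HasDeviceTypeAttr<OpTy>::value) {
    Op.setDeviceTypeAttr(Value);
    return true;
  } else {
    (void)Op;
    (void)Value;
    return false; // A real implementation might emit an NYI diagnostic here.
  }
}
```

Without `if constexpr`, the call to `setDeviceTypeAttr` would fail to compile for `OtherOp` even though that branch is never taken at run time.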
In the 'single-file-parse' mode, seeing `#include UNDEFINED_IDENTIFIER`
should not be treated as an error. The identifier might be defined in a
header that we decided to skip, resulting in a nonsensical diagnostic
from the user's point of view.
Fixes #135122
SemaExpr.cpp - Make all doubles fail. Add sema support for float scalars
and vectors when language mode is HLSL.
CGExprScalar.cpp - Allow emitting frem when language mode is HLSL.
Update the documentation for the unsafe_buffer_usage attribute to
capture the new behavior introduced by
https://github.com/llvm/llvm-project/pull/125671
Co-authored-by: MalavikaSamak <malavika2@apple.com>