3658 Commits

Author SHA1 Message Date
antangelo
f94c481543
[clang] Track source deduction guide for alias template deduction guides (#123875)
For deduction guides generated from alias template CTAD, store the
deduction guide they were originated from. The source kind is also
maintained for future expansion in CTAD from inherited constructors.

This tracking is required to determine whether an alias template already
has a deduction guide corresponding to some deduction guide on the
original template, in order to support deduction guides for the alias
from deduction guides declared after the initial usage.
2025-01-27 18:59:12 -05:00
Dmitry Polukhin
cad6bbade0
[C++20][Modules] Fix crash/compiler error due broken AST links (#123648)
Summary:
This PR fixes bugreport
https://github.com/llvm/llvm-project/issues/122493 The root problem is
the same as before lambda function and DeclRefExpr references a variable
that does not belong to the same module as the enclosing function body.
Therefore iteration over the function body doesn’t visit the VarDecl.
Before this change RelatedDeclsMap was created only for canonical decl
but in reality it has to be done for the definition of the function that
does not always match the canonical decl.

Test Plan: check-clang
2025-01-23 10:35:58 +00:00
Tom Honermann
8fb42300a0
[SYCL] AST support for SYCL kernel entry point functions. (#122379)
A SYCL kernel entry point function is a non-member function or a static
member function declared with the `sycl_kernel_entry_point` attribute.
Such functions define a pattern for an offload kernel entry point
function to be generated to enable execution of a SYCL kernel on a
device. A SYCL library implementation orchestrates the invocation of
these functions with corresponding SYCL kernel arguments in response to
calls to SYCL kernel invocation functions specified by the SYCL 2020
specification.

The offload kernel entry point function (sometimes referred to as the
SYCL kernel caller function) is generated from the SYCL kernel entry
point function by a transformation of the function parameters followed
by a transformation of the function body to replace references to the
original parameters with references to the transformed ones. Exactly how
parameters are transformed will be explained in a future change that
implements non-trivial transformations. For now, it suffices to state
that a given parameter of the SYCL kernel entry point function may be
transformed to multiple parameters of the offload kernel entry point as
needed to satisfy offload kernel argument passing requirements.
Parameters that are decomposed in this way are reconstituted as local
variables in the body of the generated offload kernel entry point
function.

For example, given the following SYCL kernel entry point function
definition:
```
template<typename KernelNameType, typename KernelType>
[[clang::sycl_kernel_entry_point(KernelNameType)]]
void sycl_kernel_entry_point(KernelType kernel) {
  kernel();
}
```

and the following call:
```
struct Kernel {
  int dm1;
  int dm2;
  void operator()() const;
};
Kernel k;
sycl_kernel_entry_point<class kernel_name>(k);
```

the corresponding offload kernel entry point function that is generated
might look as follows (assuming `Kernel` is a type that requires
decomposition):
```
void offload_kernel_entry_point_for_kernel_name(int dm1, int dm2) {
  Kernel kernel{dm1, dm2};
  kernel();
}
```

Other details of the generated offload kernel entry point function, such
as its name and calling convention, are implementation details that need
not be reflected in the AST and may differ across target devices. For
that reason, only the transformation described above is represented in
the AST; other details will be filled in during code generation.

These transformations are represented using new AST nodes introduced
with this change. `OutlinedFunctionDecl` holds a sequence of
`ImplicitParamDecl` nodes and a sequence of statement nodes that
correspond to the transformed parameters and function body.
`SYCLKernelCallStmt` wraps the original function body and associates it
with an `OutlinedFunctionDecl` instance. For the example above, the AST
generated for the `sycl_kernel_entry_point<kernel_name>` specialization
would look as follows:
```
FunctionDecl 'sycl_kernel_entry_point<kernel_name>(Kernel)'
  TemplateArgument type 'kernel_name'
  TemplateArgument type 'Kernel'
  ParmVarDecl kernel 'Kernel'
  SYCLKernelCallStmt
    CompoundStmt
      <original statements>
    OutlinedFunctionDecl
      ImplicitParamDecl 'dm1' 'int'
      ImplicitParamDecl 'dm2' 'int'
      CompoundStmt
        VarDecl 'kernel' 'Kernel'
          <initialization of 'kernel' with 'dm1' and 'dm2'>
        <transformed statements with redirected references of 'kernel'>
```

Any ODR-use of the SYCL kernel entry point function will (with future
changes) suffice for the offload kernel entry point to be emitted. An
actual call to the SYCL kernel entry point function will result in a
call to the function. However, evaluation of a `SYCLKernelCallStmt`
statement is a no-op, so such calls will have no effect other than to
trigger emission of the offload kernel entry point.

Additionally, as a related change inspired by code review feedback,
these changes disallow use of the `sycl_kernel_entry_point` attribute
with functions defined with a _function-try-block_. The SYCL 2020
specification prohibits the use of C++ exceptions in device functions.
Even if exceptions were not prohibited, it is unclear what the semantics
would be for an exception that escapes the SYCL kernel entry point
function; the boundary between host and device code could be an implicit
noexcept boundary that results in program termination if violated, or
the exception could perhaps be propagated to host code via the SYCL
library. Pending support for C++ exceptions in device code and clear
semantics for handling them at the host-device boundary, this change
makes use of the `sycl_kernel_entry_point` attribute with a function
defined with a _function-try-block_ an error.
2025-01-22 16:39:08 -05:00
Ilya Biryukov
f63e8ed16e Revert "[Modules] Delay deserialization of preferred_name attribute at r… (#122726)"
This reverts commit c3ba6f378ef80d750e2278560c6f95a300114412.

We are seeing performance regressions of up to 40% on some compilations
with this patch, we will investigate and reland after fixing performance
issues.
2025-01-22 18:17:37 +01:00
Chuanqi Xu
05861b39ba
[C++20] [Modules] Make sure vtable are generated for explicit template instantiation definition (#123871)
Close https://github.com/llvm/llvm-project/issues/123719

The reason is, we thought the external explicit template instantiation
declaration as the external definition incorrectly.
2025-01-22 12:30:31 +08:00
Chuanqi Xu
fb2c9d940a
[C++20] [Modules] Makes sure internal declaration won't be found by other TU (#123059)
Close https://github.com/llvm/llvm-project/issues/61427

And this is also helpful to implement
https://github.com/llvm/llvm-project/issues/112294 partially.

The implementation strategy mimics
https://github.com/llvm/llvm-project/pull/122887. This patch split the
internal declarations from the general lookup table so that other TU
can't find the internal declarations.
2025-01-17 21:03:53 +08:00
Viktoriia Bakalova
c3ba6f378e
[Modules] Delay deserialization of preferred_name attribute at r… (#122726)
…ecord level.

This fixes the incorrect diagnostic emitted when compiling the following
snippet

```
// string_view.h
template<class _CharT>
class basic_string_view;

typedef basic_string_view<char> string_view;

template<class _CharT>
class
__attribute__((__preferred_name__(string_view)))
basic_string_view {
public:
    basic_string_view() 
    {
    }
};

inline basic_string_view<char> foo()
{
  return basic_string_view<char>();
}
// A.cppm
module;
#include "string_view.h"
export module A;

// Use.cppm
module;
#include "string_view.h"
export module Use;
import A;
```

The diagnostic is 
```
string_view.h:11:5: error: 'basic_string_view<char>::basic_string_view' from module 'A.<global>' is not present in definition of 'string_view' provided earlier
```

The underlying issue is that deserialization of the `preferred_name`
attribute triggers deserialization of `basic_string_view<char>`, which
triggers the deserialization of the `preferred_name` attribute again
(since it's attached to the `basic_string_view` template).
The deserialization logic is implemented in a way that prevents it from
going on a loop in a literal sense (it detects early on that it has
already seen the `string_view` typedef when trying to start its
deserialization for the second time), but leaves the typedef
deserialization in an unfinished state. Subsequently, the `string_view`
typedef from the deserialized module cannot be merged with the same
typedef from `string_view.h`, resulting in the above diagnostic.

This PR resolves the problem by delaying the deserialization of the
`preferred_name` attribute until the deserialization of the
`basic_string_view` template is completed. As a result of deferring, the
deserialization of the `preferred_name` attribute doesn't need to go on
a loop since the type of the `string_view` typedef is already known when
it's deserialized.
2025-01-17 09:10:58 +01:00
Chuanqi Xu
c5e4afe673
[C++20] [Modules] Support module level lookup (#122887) (#123281)
Close https://github.com/llvm/llvm-project/issues/90154

This patch is also an optimization to the lookup process to utilize the
information provided by `export` keyword.

Previously, in the lookup process, the `export` keyword only takes part
in the check part, it doesn't get involved in the lookup process. That
said, previously, in a name lookup for 'name', we would load all of
declarations with the name 'name' and check if these declarations are
valid or not. It works well. But it is inefficient since it may load
declarations that may not be wanted.

Note that this patch actually did a trick in the lookup process instead
of bring module information to DeclarationName or considering module
information when deciding if two declarations are the same. So it may
not be a surprise to me if there are missing cases. But it is not a
regression. It should be already the case. Issue reports are welcomed.

In this patch, I tried to split the big lookup table into a lookup table
as before and a module local lookup table, which takes a combination of
the ID of the DeclContext and hash value of the primary module name as
the key. And refactored `DeclContext::lookup()` method to take the
module information. So that a lookup in a DeclContext won't load
declarations that are local to **other** modules.

And also I think it is already beneficial to split the big lookup table
since it may reduce the conflicts during lookups in the hash table.

BTW, this patch introduced a **regression** for a reachability rule in
C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.

This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup
table should be empty for them.
2025-01-17 13:41:44 +08:00
Chuanqi Xu
263fed7ce9
[AST] Add OriginalDC argument to ExternalASTSource::FindExternalVisibleDeclsByName (#123152)
Part for relanding https://github.com/llvm/llvm-project/pull/122887.

I split this to test where the performance regession comes from if
modules are not used.
2025-01-17 12:46:00 +08:00
Jorge Gorbe Moya
d2d531e097
[clang][Serialization] Stop including Frontend headers from Serialization (NFC) (#123140)
The Frontend library depends on Serialization. This is an explicit
dependency encoded in the CMake target. However, Serialization currently
has an implicit dependency on Frontend, as it includes one of its
headers. This is not reflected in the CMake build rules, but Bazel is
stricter so, in order to avoid a dependency cycle, it hackily declares
the Frontend headers as source files for Serialization.

Fortunately, the only Frontend header used by Serialization is
clang/Frontend/FrontendDiagnostic.h, which is a legacy header that just
includes clang/Basic/DiagnosticFrontend since
d076608d58d1ec55016eb747a995511e3a3f72aa, back in 2018.

This commit changes Serialization to use the underlying header from
Basic instead. Both Serialization and Frontend depend on Basic, so this
breaks the dependency cycle.
2025-01-16 10:12:04 -08:00
Chuanqi Xu
731db2a03e Revert "[C++20] [Modules] Support module level lookup (#122887)"
This reverts commit 7201cae106260aeb3e9bbbb7d5291ff30f05076a.
2025-01-16 10:23:11 +08:00
Steven Wu
18650480cb
[clang][Serialization] Add the missing block info (#122976)
HEADER_SEARCH_ENTRY_USAGE and VFS_USAGE were missing from the block info
block. Add the missing info so `llvm-bcanalyzer` can read them
correctly.
2025-01-15 09:58:23 -08:00
Chuanqi Xu
7201cae106
[C++20] [Modules] Support module level lookup (#122887)
Close https://github.com/llvm/llvm-project/issues/90154

This patch is also an optimization to the lookup process to utilize the
information provided by `export` keyword.

Previously, in the lookup process, the `export` keyword only takes part
in the check part, it doesn't get involved in the lookup process. That
said, previously, in a name lookup for 'name', we would load all of
declarations with the name 'name' and check if these declarations are
valid or not. It works well. But it is inefficient since it may load
declarations that may not be wanted.

Note that this patch actually did a trick in the lookup process instead
of bring module information to DeclarationName or considering module
information when deciding if two declarations are the same. So it may
not be a surprise to me if there are missing cases. But it is not a
regression. It should be already the case. Issue reports are welcomed.

In this patch, I tried to split the big lookup table into a lookup table
as before and a module local lookup table, which takes a combination of
the ID of the DeclContext and hash value of the primary module name as
the key. And refactored `DeclContext::lookup()` method to take the
module information. So that a lookup in a DeclContext won't load
declarations that are local to **other** modules.

And also I think it is already beneficial to split the big lookup table
since it may reduce the conflicts during lookups in the hash table.

BTW, this patch introduced a **regression** for a reachability rule in
C++20 but it was false-negative. See
'clang/test/CXX/module/module.interface/p7.cpp' for details.

This patch is not expected to introduce any other
regressions for non-c++20-modules users since the module local lookup
table should be empty for them.

---

On the API side, this patch unfortunately add a maybe-confusing argument
`Module *NamedModule` to
`ExternalASTSource::FindExternalVisibleDeclsByName()`. People may think
we can get the information from the first argument `const DeclContext
*DC`. But sadly there are declarations (e.g., namespace) can appear in
multiple different modules as a single declaration. So we have to add
additional information to indicate this.
2025-01-15 15:15:35 +08:00
higher-performance
1594413d5e
Add Clang attribute to ensure that fields are initialized explicitly (#102040)
This is a new Clang-specific attribute to ensure that field
initializations are performed explicitly.

For example, if we have
```
struct B {
  [[clang::explicit]] int f1;
};
```
then the diagnostic would trigger if we do `B b{};`:
```
field 'f1' is left uninitialized, but was marked as requiring initialization
```

This prevents callers from accidentally forgetting to initialize fields,
particularly when new fields are added to the class.
2025-01-14 13:31:12 -05:00
Tom Honermann
1907a29ded
[Clang][NFC] Indentation fixes and unneeded semicolon removal. (#122794) 2025-01-13 16:24:46 -05:00
David Pagan
ad38e24eb7
[clang][OpenMP] Add 'align' modifier for 'allocate' clause (#121814)
The 'align' modifier is now accepted in the 'allocate' clause. Added LIT
tests covering codegen, PCH, template handling, and serialization for
'align' modifier.

Added support for align-modifier to release notes.

Testing
- New allocate modifier LIT tests.
- OpenMP LIT tests.
- check-all
2025-01-13 05:44:48 -08:00
Tom Honermann
8ea8e7f529
[SYCL] Basic diagnostics for the sycl_kernel_entry_point attribute. (#120327)
The `sycl_kernel_entry_point` attribute is used to declare a function that
defines a pattern for an offload kernel entry point. The attribute requires
a single type argument that specifies a class type that meets the requirements
for a SYCL kernel name as described in section 5.2, "Naming of kernels", of
the SYCL 2020 specification. A unique kernel name type is required for each
function declared with the attribute. The attribute may not first appear on a
declaration that follows a definition of the function. The function is
required to have a non-deduced `void` return type. The function must not be
a non-static member function, be deleted or defaulted, be declared with the
`constexpr` or `consteval` specifiers, be declared with the `[[noreturn]]`
attribute, be a coroutine, or accept variadic arguments.

Diagnostics are not yet provided for the following:
- Use of a type as a kernel name that does not satisfy the forward
  declarability requirements specified in section 5.2, "Naming of kernels",
  of the SYCL 2020 specification.
- Use of a type as a parameter of the attributed function that does not
  satisfy the kernel parameter requirements specified in section 4.12.4,
  "Rules for parameter passing to kernels", of the SYCL 2020 specification
  (each such function parameter constitutes a kernel parameter).
- Use of language features that are not permitted in device functions as
  specified in section 5.4, "Language restrictions for device functions",
  of the SYCL 2020 specification.

There are several issues noted by various FIXME comments.
- The diagnostic generated for kernel name conflicts needs additional work
  to better detail the relevant source locations; such as the location of
  each declaration as well as the original source of each kernel name.
- A number of the tests illustrate spurious errors being produced due to
  attributes that appertain to function templates being instantiated too
  early (during overload resolution as opposed to after an overload is
  selected).

Included changes allow the `SYCLKernelEntryPointAttr` attribute to be
marked as invalid if a `sycl_kernel_entry_point` attribute is used incorrectly.
This is intended to prevent trying to emit an offload kernel entry point
without having to mark the associated function as invalid since doing so
would affect overload resolution; which this attribute should not do.
Unfortunately, Clang eagerly instantiates attributes that appertain to
functions with the result that errors might be issued for function
declarations that are never selected by overload resolution. Tests have
been added to demonstrate this. Further work will be needed to address
these issues (for this and other attributes).
2025-01-09 15:42:29 -05:00
erichkeane
be32621ce8 [OpenACC] Implement 'device' and 'host' clauses for 'update'
These two clauses just take a 'var-list' and specify where the variables
should be copied from/to.  This patch implements the AST nodes for them
and ensures they properly take a var-list.
2025-01-09 09:28:58 -08:00
erichkeane
2c2accbcc6 [OpenACC] Enable 'self' sema for 'update' construct
The 'self' clause is an unfortunately difficult one, as it has a
significantly different meaning between 'update' and the other
constructs.  This patch introduces a way for the 'self' clause to work
as both.  I considered making this two separate AST nodes (one for
'self' on 'update' and one for the others), however this makes the
automated macros/etc for supporting a clause break.

Instead, 'self' has the ability to act as either a condition or as a
var-list clause.  As this is the only one of its kind, it is implemented
all within it.  If in the future we have more that work like this, we
should consider rewriting a lot of the macros that we use to make
clauses work, and make them separate ast nodes.
2025-01-08 13:19:33 -08:00
Younan Zhang
edf14ed6b1
[Clang] Don't form a type constraint if the concept is invalid (#122065)
After 0dedd6fe1 and 03229e7c0, invalid concept declarations might lack
expressions for evaluation and normalization. This could make it crash
in certain scenarios, apart from the one of evaluation concepts showed
in 03229e7c0, there's also an issue when checking specializations where
the normalization also relies on a non-null expression.

This patch prevents that by avoiding building up a type constraint in
such situations, thereafter the template parameter wouldn't have a
concept specialization of a null expression.

With this patch, the assumption in ASTWriterDecl is no longer valid.
Namely, HasConstraint and TypeConstraintInitialized must now represent
different meanings for both source fidelity and semantic requirements.

Fixes https://github.com/llvm/llvm-project/issues/115004
Fixes https://github.com/llvm/llvm-project/issues/121980
2025-01-08 19:40:16 +08:00
erichkeane
db81e8c42e [OpenACC] Initial sema implementation of 'update' construct
This executable construct has a larger list of clauses than some of the
others, plus has some additional restrictions.  This patch implements
the AST node, plus the 'cannot be the body of a if, while, do, switch,
    or label' statement restriction.  Future patches will handle the
    rest of the restrictions, which are based on clauses.
2025-01-07 08:20:20 -08:00
erichkeane
ff24e9a19e [OpenACC] Implement 'default_async' sema
A fairly simple one, only valid on the 'set' construct, this clause
takes an int expression.  Most of the work was already done as a part of
parsing, so this patch ends up being a lot of infrastructure.
2025-01-06 11:03:18 -08:00
erichkeane
21c785d7bd [OpenACC] Implement 'set' construct sema
The 'set' construct is another fairly simple one, it doesn't have an
associated statement and only a handful of allowed clauses. This patch
implements it and all the rules for it, allowing 3 of its for clauses.
The only exception is default_async, which will be implemented in a
future patch, because it isn't just being enabled, it needs a complete
new implementation.
2025-01-06 11:03:18 -08:00
Alejandro Álvarez Ayllón
a13bcf3ced
[clang] Do not serialize function definitions without a body (#121550)
An instantiated templated function definition may not have a body due to
parsing errors inside the templated function. When serializing, an
assert is triggered inside `ASTRecordWriter::AddFunctionDefinition`.

The instantiation may happen on an intermediate module.

The test case was reduced from `mp-units`.
2025-01-06 18:52:11 +08:00
Alejandro Álvarez Ayllón
c1ecc0d168
[clang] Allow generating module interfaces with parsing errors (#121485)
Fixes a regression introduced in commit
da00c60dae0040185dc45039c4397f6e746548e9

This functionality was originally added in commit
5834996fefc937d6211dc8c8a5b200068753391a

Co-authored-by: Tomasz Kaminski <tomasz.kaminski@sonarsource.com>
2025-01-03 09:43:53 +08:00
Chuanqi Xu
4b35dd57b8 [Serialization] Try to clean up PendingUndeducedFunctionDecls when
PendingUndeducedFunctionDecls is not empty

Close https://github.com/llvm/llvm-project/issues/120277

This turns out to be a simple oversight initially. See the analysis in
ba1e84fb8f
for the wider background.
2024-12-23 15:14:38 +08:00
Kazu Hirata
c9df8da659
[Serialization] Migrate away from PointerUnion::get (NFC) (#120844)
Note that PointerUnion::get has been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2024-12-21 17:40:39 -08:00
erichkeane
bdf2555308 [OpenACC] Implement 'device_num' clause sema for 'init'/'shutdown'
This is a very simple sema implementation, and just required AST node
plus the existing diagnostics.  This patch adds tests and adds the AST
node required, plus enables it for 'init' and 'shutdown' (only!)
2024-12-19 12:21:51 -08:00
erichkeane
4bbdb018a6 [OpenACC] Implement 'init' and 'shutdown' constructs
These two constructs are very simple and similar, and only support 3
different clauses, two of which are already implemented.  This patch
adds AST nodes for both constructs, and leaves the device_num clause
unimplemented, but enables the other two.
2024-12-19 12:21:50 -08:00
erichkeane
e34cc7c993 [OpenACC] Implement 'wait' construct
The arguments to this are the same as for the 'wait' clause, so this
reuses all of that infrastructure. So all this has to do is support a
pair of clauses that are already implemented (if and async), plus create
an AST node.  This patch does so, and adds proper testing.
2024-12-18 15:06:01 -08:00
erichkeane
fbb14dd977 [OpenACC] Implement 'use_device' clause AST/Sema
This is a clause that is only valid on 'host_data' constructs, and
identifies variables which it should use the current device address.
From a Sema perspective, the only thing novel here is mild changes to
how ActOnVar works for this clause, else this is very much like the rest
of the 'var-list' clauses.
2024-12-16 09:35:57 -08:00
erichkeane
1ab81f8e7f [OpenACC] Implement 'delete' AST/Sema for 'exit data' construct
'delete' is another clause that has very little compile-time
implication, but needs a full AST that takes a var list.  This patch
ipmlements it fully, plus adds sufficient test coverage.
2024-12-16 06:44:53 -08:00
Dmitry Polukhin
38b3d87bd3
[C++20][Modules] Load function body from the module that gives canonical decl (#111992)
Summary:
Fix crash from reproducer provided in
https://github.com/llvm/llvm-project/pull/109167#issuecomment-2405289565
Also fix issues with merged inline friend functions merged during deserialization.

Test Plan: check-clang
2024-12-16 12:22:43 +00:00
erichkeane
3351b3bf8d [OpenACC] implement 'detach' clause sema
This is another new clause specific to 'exit data' that takes a pointer
argument. This patch implements this the same way we do a few other
clauses (like attach) that have the same restrictions.
2024-12-13 13:51:41 -08:00
erichkeane
2244d2e75c [OpenACC] Implement 'if_present' clause sema
The 'if_present' clause controls the replacement of addresses in the
var-list in current device memory.  This clause can only go on
'host_device'.  From a Sema perspective, there isn't anything to do
beyond add this to AST and pass it on.
2024-12-13 13:04:57 -08:00
erichkeane
003eb5e80d [OpenACC] Implement 'finalize' clause sema
This is a very simple clause as far as sema is concerned.  It is only
valid on 'exit data', and doesn't have any rules involving it, so it is
simply applied and passed onto the MLIR.
2024-12-13 10:41:02 -08:00
erichkeane
010d0115fc [OpenACC] Create AST nodes for 'data' constructs
These constructs are all very similar and closely related, so this patch
creates the AST nodes for them, serialization, printing/etc.
Additionally the restrictions are all added as tests/todos in the tests,
as those will have to be implemented once we get those clauses implemented.
2024-12-12 07:28:30 -08:00
Chuanqi Xu
30ea0f0ce4 [NFC] [clang] [Serialization] Fix warning for narrowing cast 2024-12-11 11:28:04 +08:00
Chuanqi Xu
20e9049509
[Serialization] Support loading template specializations lazily (#119333)
Reland https://github.com/llvm/llvm-project/pull/83237

---

(Original comments)

Currently all the specializations of a template (including
instantiation, specialization and partial specializations) will be
loaded at once if we want to instantiate another instance for the
template, or find instantiation for the template, or just want to
complete the redecl chain.

This means basically we need to load every specializations for the
template once the template declaration got loaded. This is bad since
when we load a specialization, we need to load all of its template
arguments. Then we have to deserialize a lot of unnecessary
declarations.

For example,

```
// M.cppm
export module M;
export template <class T>
class A {};

export class ShouldNotBeLoaded {};

export class Temp {
   A<ShouldNotBeLoaded> AS;
};

// use.cpp
import M;
A<int> a;
```

We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we
instantiate the template `A` in `use.cpp`. Then we will deserialize
`ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this
patch tries to avoid that.

Given that the templates are heavily used in C++, this is a pain point
for the performance.

This patch adds MultiOnDiskHashTable for specializations in the
ASTReader. Then we will only deserialize the specializations with the
same template arguments. We made that by using ODRHash for the template
arguments as the key of the hash table.

To review this patch, I think `ASTReaderDecl::AddLazySpecializations`
may be a good entry point.
2024-12-11 09:40:47 +08:00
Kazu Hirata
83cb3dbc0c
[Serialization] Migrate away from PointerUnion::{is,get} (NFC) (#118948)
Note that PointerUnion::{is,get} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>

I'm not touching PointerUnion::dyn_cast for now because it's a bit
complicated; we could blindly migrate it to dyn_cast_if_present, but
we should probably use dyn_cast when the operand is known to be
non-null.
2024-12-09 09:47:38 -08:00
Haowei Wu
12bdeba76e Revert "[Serialization] Support load lazy specialization lazily"
This reverts commit b5bd19211118c6d43bc525a4e3fb65d2c750d61e.
It brokes multiple llvm bots including clang-x64-windows-msvc
2024-12-06 10:33:57 -08:00
Chuanqi Xu
b5bd192111 [Serialization] Support load lazy specialization lazily
Currently all the specializations of a template (including
instantiation, specialization and partial specializations)  will be
loaded at once if we want to instantiate another instance for the
template, or find instantiation for the template, or just want to
complete the redecl chain.

This means basically we need to load every specializations for the
template once the template declaration got loaded. This is bad since
when we load a specialization, we need to load all of its template
arguments. Then we have to deserialize a lot of unnecessary
declarations.

For example,

```
// M.cppm
export module M;
export template <class T>
class A {};

export class ShouldNotBeLoaded {};

export class Temp {
   A<ShouldNotBeLoaded> AS;
};

// use.cpp
import M;
A<int> a;
```

We should a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we
instantiate the template `A` in `use.cpp`. Then we will deserialize
`ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this
patch tries to avoid that.

Given that the templates are heavily used in C++, this is a pain point
for the performance.

This patch adds MultiOnDiskHashTable for specializations in the
ASTReader. Then we will only deserialize the specializations with the
same template arguments. We made that by using ODRHash for the template
arguments as the key of the hash table.

To review this patch, I think `ASTReaderDecl::AddLazySpecializations`
may be a good entry point.

The patch was reviewed in
https://github.com/llvm/llvm-project/pull/83237 but that PR is a stacked
PR. But I feel the intention of the stacked PRs get lost during the
review process. So I feel it is better to merge the commits into a
single commit instead of merging them in the PR page. It is better for
us to cherry-pick and revert.
2024-12-06 10:52:35 +08:00
Ilya Biryukov
f1d81dbd05
[ASTWriter] Do not allocate source location space for module maps used only for textual headers (#116374)
This is a follow up to #112015 and it reduces the unnecessary
duplication of source locations further.

We do not need to allocate source location space in the serialized PCMs
for module maps used only to find textual headers. Those module maps are
never referenced from anywhere in the serialized ASTs and are re-read in
other compilations.
This change should not affect correctness of Clang compilations or
clang-scan-deps in any way.

We do need the InputFile entry in the serialized AST because
clang-scan-deps relies on it. The previous patch introduced a mechanism
to do exactly that.

We have found that to finally remove any duplication of module maps we
use internally in our build system.
2024-12-05 15:08:38 +01:00
Chuanqi Xu
99de065b85 Revert "[Serialization] Downgrade inconsistent flags from erros to warnings (#115416)"
This reverts commit 74449ab86b8bc8d7388ede0cc7fc3a679da0c567.

See the post commit message in
https://github.com/llvm/llvm-project/pull/115416
2024-11-27 11:35:49 +08:00
Chuanqi Xu
74449ab86b
[Serialization] Downgrade inconsistent flags from erros to warnings (#115416)
There were many many "voices" about the too strict flags checking in
modules. Although they rarely challenge this, maybe due to they respect
to the compiler implementation details. But from my point of view, there
are cases it is "fine" to have different flags. Especially we're too
conservative to mark almost language options in
`clang/include/clang/Basic/LangOptions.def` as incompatible options (see
the comments in the front of the file).

In my understanding, this should come from PCH initially since it is
natural to ask your headers to be compiled with the same flags with your
TU. And then, when Apple and Google goes to implement clang module, they
don't challenge it too since they have a closed world where they have a
strong control over the ecosystem so that they can make it consistent.

Yes, consistency is great and ODR violation are awful. But this is the
world we're living today. This is the C++'s ecosystem in the open ended
world. Image a situation that we're using a third party module and we
add a new option to our library, then the build bails out! THIS IS SUPER
ANNOYING. And makes it non practical to make a modular C++ ecosystem.

(
This was discussed many times in SG15. And the consensus is, the build
systems should generate different BMI based on different flags. But this
manner can't avoid ODR violation completely and it would add the times
of module files that need to be built, which may kill the benefit of
faster compilation of modules.

However, I think the build systems may need to do the similar things in
the end of the day. Considering libc++'s hardening mechanism
(https://libcxx.llvm.org/Hardening.html). So the conclusion of the
paragraph is, although this seems related to build systems, I think they
are actually unrelated story.
)

I think we should give our users a chance to disable such checks. It is
theoretically unsafe. But we've done our job to tell the users that it
**MAY** be bad. Then I feel it is C++-ish to give users more freedom
even if they may shoot their foot.

This shouldn't change any thing. Users who want previous behavior can
get it easily by `-Werror=`.
2024-11-27 10:53:03 +08:00
Younan Zhang
df335b09ea
[Clang] Preserve partially substituted pack indexing type/expressions (#116782)
Substituting into pack indexing types/expressions can still result in
unexpanded types/expressions, such as `PackIndexingType` or
`PackIndexingExpr`. To handle these cases correctly, we should defer the
pack size checks to the next round of transformation, when the patterns
can be fully expanded.

To that end, the `FullySubstituted` flag is now necessary for computing
the dependencies of `PackIndexingExprs`. Conveniently, this flag can
also represent the prior `ExpandsToEmpty` status with an additional
emptiness check. Therefore, I converted all stored flags to use
`FullySubstituted`.

Fixes https://github.com/llvm/llvm-project/issues/116105
2024-11-25 16:16:39 +08:00
Jan Svoboda
b769e3544a
[clang][serialization] Blobify IMPORTS strings and signatures (#116095)
This PR changes a part of the PCM format to store string-like things in
the blob attached to a record instead of VBR6-encoding them into the
record itself. Applied to the `IMPORTS` section (which is very hot),
this speeds up dependency scanning by 2.8%.
2024-11-18 11:45:41 -08:00
Kadir Cetinkaya
5845688e91
Reapply "[clang] Introduce diagnostics suppression mappings (#112517)"
This reverts commit 5f140ba54794fe6ca379362b133eb27780e363d7.
2024-11-13 10:35:22 +01:00
Balázs Kéri
7a1fdbb9c0
[clang][AST] Add 'IgnoreTemplateParmDepth' to structural equivalence cache (#115518)
Structural equivalence check uses a cache to store already found
non-equivalent values. This cache can be reused for calls (ASTImporter
does this). Value of "IgnoreTemplateParmDepth" can have an effect on the
structural equivalence therefore it is wrong to reuse the same cache for
checks with different values of 'IgnoreTemplateParmDepth'. The current
change adds the 'IgnoreTemplateParmDepth' to the cache key to fix the
problem.
2024-11-13 09:25:22 +01:00
Kadir Cetinkaya
5f140ba547
Revert "[clang] Introduce diagnostics suppression mappings (#112517)"
This reverts commit 12e3ed8de8c6063b15916b3faf67c8c9cd17df1f.
This reverts commit 41e3919ded78d8870f7c95e9181c7f7e29aa3cc4.

There are some buildbot breakages in
https://lab.llvm.org/buildbot/#/builders/18/builds/6832.
2024-11-12 18:30:42 +01:00