llvm-project/clang/lib/Serialization/ASTWriterDecl.cpp

Ignoring revisions in .git-blame-ignore-revs. Click here to bypass and see the normal blame view.

2977 lines
117 KiB
C++
Raw Normal View History

//===--- ASTWriterDecl.cpp - Declaration Serialization --------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements serialization for Declarations.
//
//===----------------------------------------------------------------------===//
#include "ASTCommon.h"
#include "clang/AST/Attr.h"
#include "clang/AST/DeclCXX.h"
#include "clang/AST/DeclTemplate.h"
#include "clang/AST/DeclVisitor.h"
#include "clang/AST/Expr.h"
#include "clang/AST/OpenMPClause.h"
#include "clang/AST/PrettyDeclStackTrace.h"
#include "clang/Basic/SourceManager.h"
#include "clang/Serialization/ASTReader.h"
#include "clang/Serialization/ASTRecordWriter.h"
#include "llvm/Bitstream/BitstreamWriter.h"
#include "llvm/Support/ErrorHandling.h"
#include <optional>
using namespace clang;
using namespace serialization;
//===----------------------------------------------------------------------===//
// Declaration serialization
//===----------------------------------------------------------------------===//
namespace clang {
class ASTDeclWriter : public DeclVisitor<ASTDeclWriter, void> {
ASTWriter &Writer;
ASTRecordWriter Record;
serialization::DeclCode Code;
unsigned AbbrevToUse;
bool GeneratingReducedBMI = false;
public:
ASTDeclWriter(ASTWriter &Writer, ASTContext &Context,
ASTWriter::RecordDataImpl &Record, bool GeneratingReducedBMI)
: Writer(Writer), Record(Context, Writer, Record),
Code((serialization::DeclCode)0), AbbrevToUse(0),
GeneratingReducedBMI(GeneratingReducedBMI) {}
uint64_t Emit(Decl *D) {
if (!Code)
llvm::report_fatal_error(StringRef("unexpected declaration kind '") +
D->getDeclKindName() + "'");
return Record.Emit(Code, AbbrevToUse);
}
void Visit(Decl *D);
void VisitDecl(Decl *D);
void VisitPragmaCommentDecl(PragmaCommentDecl *D);
void VisitPragmaDetectMismatchDecl(PragmaDetectMismatchDecl *D);
void VisitTranslationUnitDecl(TranslationUnitDecl *D);
void VisitNamedDecl(NamedDecl *D);
void VisitLabelDecl(LabelDecl *LD);
void VisitNamespaceDecl(NamespaceDecl *D);
void VisitUsingDirectiveDecl(UsingDirectiveDecl *D);
void VisitNamespaceAliasDecl(NamespaceAliasDecl *D);
void VisitTypeDecl(TypeDecl *D);
void VisitTypedefNameDecl(TypedefNameDecl *D);
void VisitTypedefDecl(TypedefDecl *D);
void VisitTypeAliasDecl(TypeAliasDecl *D);
void VisitUnresolvedUsingTypenameDecl(UnresolvedUsingTypenameDecl *D);
void VisitUnresolvedUsingIfExistsDecl(UnresolvedUsingIfExistsDecl *D);
void VisitTagDecl(TagDecl *D);
void VisitEnumDecl(EnumDecl *D);
void VisitRecordDecl(RecordDecl *D);
void VisitCXXRecordDecl(CXXRecordDecl *D);
void VisitClassTemplateSpecializationDecl(
ClassTemplateSpecializationDecl *D);
void VisitClassTemplatePartialSpecializationDecl(
ClassTemplatePartialSpecializationDecl *D);
void VisitVarTemplateSpecializationDecl(VarTemplateSpecializationDecl *D);
void VisitVarTemplatePartialSpecializationDecl(
VarTemplatePartialSpecializationDecl *D);
void VisitTemplateTypeParmDecl(TemplateTypeParmDecl *D);
void VisitValueDecl(ValueDecl *D);
void VisitEnumConstantDecl(EnumConstantDecl *D);
void VisitUnresolvedUsingValueDecl(UnresolvedUsingValueDecl *D);
void VisitDeclaratorDecl(DeclaratorDecl *D);
void VisitFunctionDecl(FunctionDecl *D);
void VisitCXXDeductionGuideDecl(CXXDeductionGuideDecl *D);
void VisitCXXMethodDecl(CXXMethodDecl *D);
void VisitCXXConstructorDecl(CXXConstructorDecl *D);
void VisitCXXDestructorDecl(CXXDestructorDecl *D);
void VisitCXXConversionDecl(CXXConversionDecl *D);
void VisitFieldDecl(FieldDecl *D);
void VisitMSPropertyDecl(MSPropertyDecl *D);
Rework how UuidAttr, CXXUuidofExpr, and GUID template arguments and constants are represented. Summary: Previously, we treated CXXUuidofExpr as quite a special case: it was the only kind of expression that could be a canonical template argument, it could be a constant lvalue base object, and so on. In addition, we represented the UUID value as a string, whose source form we did not preserve faithfully, and that we partially parsed in multiple different places. With this patch, we create an MSGuidDecl object to represent the implicit object of type 'struct _GUID' created by a UuidAttr. Each UuidAttr holds a pointer to its 'struct _GUID' and its original (as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue representation of the GUID on the MSGuidDecl and use it from constant evaluation where needed. This allows removing a lot of the special-case logic to handle these expressions. Unfortunately, many parts of Clang assume there are only a couple of interesting kinds of ValueDecl, so the total amount of special-case logic is not really reduced very much. This fixes a few bugs and issues: * PR38490: we now support reading from GUID objects returned from __uuidof during constant evaluation. * Our Itanium mangling for a non-instantiation-dependent template argument involving __uuidof no longer depends on which CXXUuidofExpr template argument we happened to see first. * We now predeclare ::_GUID, and permit use of __uuidof without any header inclusion, better matching MSVC's behavior. We do not predefine ::__s_GUID, though; that seems like a step too far. * Our IR representation for GUID constants now uses the correct IR type wherever possible. We will still fall back to using the {i32, i16, i16, [8 x i8]} layout if a definition of struct _GUID is not available. This is not ideal: in principle the two layouts could have different padding. Reviewers: rnk, jdoerfert Subscribers: arphaman, cfe-commits, aeubanks Tags: #clang Differential Revision: https://reviews.llvm.org/D78171
2020-04-11 22:15:29 -07:00
void VisitMSGuidDecl(MSGuidDecl *D);
void VisitUnnamedGlobalConstantDecl(UnnamedGlobalConstantDecl *D);
void VisitTemplateParamObjectDecl(TemplateParamObjectDecl *D);
void VisitIndirectFieldDecl(IndirectFieldDecl *D);
void VisitVarDecl(VarDecl *D);
void VisitImplicitParamDecl(ImplicitParamDecl *D);
void VisitParmVarDecl(ParmVarDecl *D);
void VisitDecompositionDecl(DecompositionDecl *D);
void VisitBindingDecl(BindingDecl *D);
void VisitNonTypeTemplateParmDecl(NonTypeTemplateParmDecl *D);
void VisitTemplateDecl(TemplateDecl *D);
void VisitConceptDecl(ConceptDecl *D);
void VisitImplicitConceptSpecializationDecl(
ImplicitConceptSpecializationDecl *D);
void VisitRequiresExprBodyDecl(RequiresExprBodyDecl *D);
void VisitRedeclarableTemplateDecl(RedeclarableTemplateDecl *D);
void VisitClassTemplateDecl(ClassTemplateDecl *D);
void VisitVarTemplateDecl(VarTemplateDecl *D);
void VisitFunctionTemplateDecl(FunctionTemplateDecl *D);
void VisitTemplateTemplateParmDecl(TemplateTemplateParmDecl *D);
void VisitTypeAliasTemplateDecl(TypeAliasTemplateDecl *D);
void VisitUsingDecl(UsingDecl *D);
void VisitUsingEnumDecl(UsingEnumDecl *D);
void VisitUsingPackDecl(UsingPackDecl *D);
void VisitUsingShadowDecl(UsingShadowDecl *D);
P0136R1, DR1573, DR1645, DR1715, DR1736, DR1903, DR1941, DR1959, DR1991: Replace inheriting constructors implementation with new approach, voted into C++ last year as a DR against C++11. Instead of synthesizing a set of derived class constructors for each inherited base class constructor, we make the constructors of the base class visible to constructor lookup in the derived class, using the normal rules for using-declarations. For constructors, UsingShadowDecl now has a ConstructorUsingShadowDecl derived class that tracks the requisite additional information. We create shadow constructors (not found by name lookup) in the derived class to model the actual initialization, and have a new expression node, CXXInheritedCtorInitExpr, to model the initialization of a base class from such a constructor. (This initialization is special because it performs real perfect forwarding of arguments.) In cases where argument forwarding is not possible (for inalloca calls, variadic calls, and calls with callee parameter cleanup), the shadow inheriting constructor is not emitted and instead we directly emit the initialization code into the caller of the inherited constructor. Note that this new model is not perfectly compatible with the old model in some corner cases. In particular: * if B inherits a private constructor from A, and C uses that constructor to construct a B, then we previously required that A befriends B and B befriends C, but the new rules require A to befriend C directly, and * if a derived class has its own constructors (and so its implicit default constructor is suppressed), it may still inherit a default constructor from a base class llvm-svn: 274049
2016-06-28 19:03:57 +00:00
void VisitConstructorUsingShadowDecl(ConstructorUsingShadowDecl *D);
void VisitLinkageSpecDecl(LinkageSpecDecl *D);
void VisitExportDecl(ExportDecl *D);
void VisitFileScopeAsmDecl(FileScopeAsmDecl *D);
[clang-repl] Support statements on global scope in incremental mode. This patch teaches clang to parse statements on the global scope to allow: ``` ./bin/clang-repl clang-repl> int i = 12; clang-repl> ++i; clang-repl> extern "C" int printf(const char*,...); clang-repl> printf("%d\n", i); 13 clang-repl> %quit ``` Generally, disambiguating between statements and declarations is a non-trivial task for a C++ parser. The challenge is to allow both standard C++ to be translated as if this patch does not exist and in the cases where the user typed a statement to be executed as if it were in a function body. Clang's Parser does pretty well in disambiguating between declarations and expressions. We have added DisambiguatingWithExpression flag which allows us to preserve the existing and optimized behavior where needed and implement the extra rules for disambiguating. Only few cases require additional attention: * Constructors/destructors -- Parser::isConstructorDeclarator was used in to disambiguate between ctor-looking declarations and statements on the global scope(eg. `Ns::f()`). * The template keyword -- the template keyword can appear in both declarations and statements. This patch considers the template keyword to be a declaration starter which breaks a few cases in incremental mode which will be tackled later. * The inline (and similar) keyword -- looking at the first token in many cases allows us to classify what is a declaration. * Other language keywords and specifiers -- ObjC/ObjC++/OpenCL/OpenMP rely on pragmas or special tokens which will be handled in subsequent patches. The patch conceptually models a "top-level" statement into a TopLevelStmtDecl. The TopLevelStmtDecl is lowered into a void function with no arguments. We attach this function to the global initializer list to execute the statement blocks in the correct order. Differential revision: https://reviews.llvm.org/D127284
2022-06-08 09:59:40 +00:00
void VisitTopLevelStmtDecl(TopLevelStmtDecl *D);
void VisitImportDecl(ImportDecl *D);
void VisitAccessSpecDecl(AccessSpecDecl *D);
void VisitFriendDecl(FriendDecl *D);
void VisitFriendTemplateDecl(FriendTemplateDecl *D);
void VisitStaticAssertDecl(StaticAssertDecl *D);
void VisitBlockDecl(BlockDecl *D);
void VisitCapturedDecl(CapturedDecl *D);
void VisitEmptyDecl(EmptyDecl *D);
void VisitLifetimeExtendedTemporaryDecl(LifetimeExtendedTemporaryDecl *D);
void VisitDeclContext(DeclContext *DC);
template <typename T> void VisitRedeclarable(Redeclarable<T> *D);
void VisitHLSLBufferDecl(HLSLBufferDecl *D);
// FIXME: Put in the same order is DeclNodes.td?
void VisitObjCMethodDecl(ObjCMethodDecl *D);
void VisitObjCTypeParamDecl(ObjCTypeParamDecl *D);
void VisitObjCContainerDecl(ObjCContainerDecl *D);
void VisitObjCInterfaceDecl(ObjCInterfaceDecl *D);
void VisitObjCIvarDecl(ObjCIvarDecl *D);
void VisitObjCProtocolDecl(ObjCProtocolDecl *D);
void VisitObjCAtDefsFieldDecl(ObjCAtDefsFieldDecl *D);
void VisitObjCCategoryDecl(ObjCCategoryDecl *D);
void VisitObjCImplDecl(ObjCImplDecl *D);
void VisitObjCCategoryImplDecl(ObjCCategoryImplDecl *D);
void VisitObjCImplementationDecl(ObjCImplementationDecl *D);
void VisitObjCCompatibleAliasDecl(ObjCCompatibleAliasDecl *D);
void VisitObjCPropertyDecl(ObjCPropertyDecl *D);
void VisitObjCPropertyImplDecl(ObjCPropertyImplDecl *D);
void VisitOMPThreadPrivateDecl(OMPThreadPrivateDecl *D);
void VisitOMPAllocateDecl(OMPAllocateDecl *D);
void VisitOMPRequiresDecl(OMPRequiresDecl *D);
void VisitOMPDeclareReductionDecl(OMPDeclareReductionDecl *D);
void VisitOMPDeclareMapperDecl(OMPDeclareMapperDecl *D);
void VisitOMPCapturedExprDecl(OMPCapturedExprDecl *D);
/// Add an Objective-C type parameter list to the given record.
void AddObjCTypeParamList(ObjCTypeParamList *typeParams) {
// Empty type parameter list.
if (!typeParams) {
Record.push_back(0);
return;
}
Record.push_back(typeParams->size());
for (auto *typeParam : *typeParams) {
Record.AddDeclRef(typeParam);
}
Record.AddSourceLocation(typeParams->getLAngleLoc());
Record.AddSourceLocation(typeParams->getRAngleLoc());
}
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
/// Collect the first declaration from each module file that provides a
/// declaration of D.
void CollectFirstDeclFromEachModule(
const Decl *D, bool IncludeLocal,
llvm::MapVector<ModuleFile *, const Decl *> &Firsts) {
// FIXME: We can skip entries that we know are implied by others.
for (const Decl *R = D->getMostRecentDecl(); R; R = R->getPreviousDecl()) {
if (R->isFromASTFile())
Firsts[Writer.Chain->getOwningModuleFile(R)] = R;
else if (IncludeLocal)
Firsts[nullptr] = R;
}
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
}
/// Add to the record the first declaration from each module file that
/// provides a declaration of D. The intent is to provide a sufficient
/// set such that reloading this set will load all current redeclarations.
void AddFirstDeclFromEachModule(const Decl *D, bool IncludeLocal) {
llvm::MapVector<ModuleFile *, const Decl *> Firsts;
CollectFirstDeclFromEachModule(D, IncludeLocal, Firsts);
for (const auto &F : Firsts)
Record.AddDeclRef(F.second);
}
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
/// Add to the record the first template specialization from each module
/// file that provides a declaration of D. We store the DeclId and an
/// ODRHash of the template arguments of D which should provide enough
/// information to load D only if the template instantiator needs it.
void AddFirstSpecializationDeclFromEachModule(
const Decl *D, llvm::SmallVectorImpl<const Decl *> &SpecsInMap,
llvm::SmallVectorImpl<const Decl *> &PartialSpecsInMap) {
assert((isa<ClassTemplateSpecializationDecl>(D) ||
isa<VarTemplateSpecializationDecl>(D) || isa<FunctionDecl>(D)) &&
"Must not be called with other decls");
llvm::MapVector<ModuleFile *, const Decl *> Firsts;
CollectFirstDeclFromEachModule(D, /*IncludeLocal*/ true, Firsts);
for (const auto &F : Firsts) {
if (isa<ClassTemplatePartialSpecializationDecl,
VarTemplatePartialSpecializationDecl>(F.second))
PartialSpecsInMap.push_back(F.second);
else
SpecsInMap.push_back(F.second);
}
}
/// Get the specialization decl from an entry in the specialization list.
template <typename EntryType>
typename RedeclarableTemplateDecl::SpecEntryTraits<EntryType>::DeclType *
getSpecializationDecl(EntryType &T) {
return RedeclarableTemplateDecl::SpecEntryTraits<EntryType>::getDecl(&T);
}
/// Get the list of partial specializations from a template's common ptr.
template<typename T>
decltype(T::PartialSpecializations) &getPartialSpecializations(T *Common) {
return Common->PartialSpecializations;
}
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
MutableArrayRef<FunctionTemplateSpecializationInfo>
getPartialSpecializations(FunctionTemplateDecl::Common *) {
return std::nullopt;
}
template<typename DeclTy>
void AddTemplateSpecializations(DeclTy *D) {
auto *Common = D->getCommonPtr();
// If we have any lazy specializations, and the external AST source is
// our chained AST reader, we can just write out the DeclIDs. Otherwise,
// we need to resolve them to actual declarations.
if (Writer.Chain != Record.getASTContext().getExternalSource() &&
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
Writer.Chain && Writer.Chain->haveUnloadedSpecializations(D)) {
D->LoadLazySpecializations();
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
assert(!Writer.Chain->haveUnloadedSpecializations(D));
}
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
// AddFirstSpecializationDeclFromEachModule might trigger deserialization,
// invalidating *Specializations iterators.
llvm::SmallVector<const Decl *, 16> AllSpecs;
for (auto &Entry : Common->Specializations)
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
AllSpecs.push_back(getSpecializationDecl(Entry));
for (auto &Entry : getPartialSpecializations(Common))
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
AllSpecs.push_back(getSpecializationDecl(Entry));
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
llvm::SmallVector<const Decl *, 16> Specs;
llvm::SmallVector<const Decl *, 16> PartialSpecs;
for (auto *D : AllSpecs) {
assert(D->isCanonicalDecl() && "non-canonical decl in set");
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
AddFirstSpecializationDeclFromEachModule(D, Specs, PartialSpecs);
}
Record.AddOffset(Writer.WriteSpecializationInfoLookupTable(
D, Specs, /*IsPartial=*/false));
// Function Template Decl doesn't have partial decls.
if (isa<FunctionTemplateDecl>(D)) {
assert(PartialSpecs.empty());
return;
}
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
Record.AddOffset(Writer.WriteSpecializationInfoLookupTable(
D, PartialSpecs, /*IsPartial=*/true));
}
/// Ensure that this template specialization is associated with the specified
/// template on reload.
void RegisterTemplateSpecialization(const Decl *Template,
const Decl *Specialization) {
Template = Template->getCanonicalDecl();
// If the canonical template is local, we'll write out this specialization
// when we emit it.
// FIXME: We can do the same thing if there is any local declaration of
// the template, to avoid emitting an update record.
if (!Template->isFromASTFile())
return;
// We only need to associate the first local declaration of the
// specialization. The other declarations will get pulled in by it.
if (Writer.getFirstLocalDecl(Specialization) != Specialization)
return;
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
if (isa<ClassTemplatePartialSpecializationDecl,
VarTemplatePartialSpecializationDecl>(Specialization))
Writer.PartialSpecializationsUpdates[cast<NamedDecl>(Template)]
.push_back(cast<NamedDecl>(Specialization));
else
Writer.SpecializationsUpdates[cast<NamedDecl>(Template)].push_back(
cast<NamedDecl>(Specialization));
}
};
}
bool clang::CanElideDeclDef(const Decl *D) {
if (auto *FD = dyn_cast<FunctionDecl>(D)) {
if (FD->isInlined() || FD->isConstexpr())
return false;
if (FD->isDependentContext())
return false;
if (FD->getTemplateSpecializationKind() == TSK_ImplicitInstantiation)
return false;
}
if (auto *VD = dyn_cast<VarDecl>(D)) {
if (!VD->getDeclContext()->getRedeclContext()->isFileContext() ||
VD->isInline() || VD->isConstexpr() || isa<ParmVarDecl>(VD) ||
// Constant initialized variable may not affect the ABI, but they
// may be used in constant evaluation in the frontend, so we have
// to remain them.
VD->hasConstantInitialization())
return false;
if (VD->getTemplateSpecializationKind() == TSK_ImplicitInstantiation)
return false;
}
return true;
}
void ASTDeclWriter::Visit(Decl *D) {
DeclVisitor<ASTDeclWriter>::Visit(D);
// Source locations require array (variable-length) abbreviations. The
// abbreviation infrastructure requires that arrays are encoded last, so
// we handle it here in the case of those classes derived from DeclaratorDecl
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
if (auto *DD = dyn_cast<DeclaratorDecl>(D)) {
if (auto *TInfo = DD->getTypeSourceInfo())
Record.AddTypeLoc(TInfo->getTypeLoc());
}
// Handle FunctionDecl's body here and write it after all other Stmts/Exprs
// have been written. We want it last because we will not read it back when
// retrieving it from the AST, we'll just lazily set the offset.
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
if (auto *FD = dyn_cast<FunctionDecl>(D)) {
if (!GeneratingReducedBMI || !CanElideDeclDef(FD)) {
Record.push_back(FD->doesThisDeclarationHaveABody());
if (FD->doesThisDeclarationHaveABody())
Record.AddFunctionDefinition(FD);
} else
Record.push_back(0);
}
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
// Similar to FunctionDecls, handle VarDecl's initializer here and write it
// after all other Stmts/Exprs. We will not read the initializer until after
// we have finished recursive deserialization, because it can recursively
// refer back to the variable.
if (auto *VD = dyn_cast<VarDecl>(D)) {
if (!GeneratingReducedBMI || !CanElideDeclDef(VD))
Record.AddVarDeclInit(VD);
else
Record.push_back(0);
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
}
// And similarly for FieldDecls. We already serialized whether there is a
// default member initializer.
if (auto *FD = dyn_cast<FieldDecl>(D)) {
if (FD->hasInClassInitializer()) {
if (Expr *Init = FD->getInClassInitializer()) {
Record.push_back(1);
Record.AddStmt(Init);
} else {
Record.push_back(0);
// Initializer has not been instantiated yet.
}
}
}
// If this declaration is also a DeclContext, write blocks for the
// declarations that lexically stored inside its context and those
// declarations that are visible from its context.
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
if (auto *DC = dyn_cast<DeclContext>(D))
VisitDeclContext(DC);
}
void ASTDeclWriter::VisitDecl(Decl *D) {
BitsPacker DeclBits;
// The order matters here. It will be better to put the bit with higher
// probability to be 0 in the end of the bits.
//
// Since we're using VBR6 format to store it.
// It will be pretty effient if all the higher bits are 0.
// For example, if we need to pack 8 bits into a value and the stored value
// is 0xf0, the actual stored value will be 0b000111'110000, which takes 12
// bits actually. However, if we changed the order to be 0x0f, then we can
// store it as 0b001111, which takes 6 bits only now.
DeclBits.addBits((uint64_t)D->getModuleOwnershipKind(), /*BitWidth=*/3);
DeclBits.addBit(D->isReferenced());
DeclBits.addBit(D->isUsed(false));
DeclBits.addBits(D->getAccess(), /*BitWidth=*/2);
DeclBits.addBit(D->isImplicit());
DeclBits.addBit(D->getDeclContext() != D->getLexicalDeclContext());
DeclBits.addBit(D->hasAttrs());
DeclBits.addBit(D->isTopLevelDeclInObjCContainer());
DeclBits.addBit(D->isInvalidDecl());
Record.push_back(DeclBits);
Record.AddDeclRef(cast_or_null<Decl>(D->getDeclContext()));
if (D->getDeclContext() != D->getLexicalDeclContext())
Record.AddDeclRef(cast_or_null<Decl>(D->getLexicalDeclContext()));
if (D->hasAttrs())
Record.AddAttributes(D->getAttrs());
Record.push_back(Writer.getSubmoduleID(D->getOwningModule()));
// If this declaration injected a name into a context different from its
// lexical context, and that context is an imported namespace, we need to
// update its visible declarations to include this name.
//
// This happens when we instantiate a class with a friend declaration or a
// function with a local extern declaration, for instance.
//
// FIXME: Can we handle this in AddedVisibleDecl instead?
if (D->isOutOfLine()) {
auto *DC = D->getDeclContext();
while (auto *NS = dyn_cast<NamespaceDecl>(DC->getRedeclContext())) {
if (!NS->isFromASTFile())
break;
Writer.UpdatedDeclContexts.insert(NS->getPrimaryContext());
if (!NS->isInlineNamespace())
break;
DC = NS->getParent();
}
}
}
void ASTDeclWriter::VisitPragmaCommentDecl(PragmaCommentDecl *D) {
StringRef Arg = D->getArg();
Record.push_back(Arg.size());
VisitDecl(D);
Record.AddSourceLocation(D->getBeginLoc());
Record.push_back(D->getCommentKind());
Record.AddString(Arg);
Code = serialization::DECL_PRAGMA_COMMENT;
}
void ASTDeclWriter::VisitPragmaDetectMismatchDecl(
PragmaDetectMismatchDecl *D) {
StringRef Name = D->getName();
StringRef Value = D->getValue();
Record.push_back(Name.size() + 1 + Value.size());
VisitDecl(D);
Record.AddSourceLocation(D->getBeginLoc());
Record.AddString(Name);
Record.AddString(Value);
Code = serialization::DECL_PRAGMA_DETECT_MISMATCH;
}
void ASTDeclWriter::VisitTranslationUnitDecl(TranslationUnitDecl *D) {
llvm_unreachable("Translation units aren't directly serialized");
}
void ASTDeclWriter::VisitNamedDecl(NamedDecl *D) {
VisitDecl(D);
Record.AddDeclarationName(D->getDeclName());
Record.push_back(needsAnonymousDeclarationNumber(D)
? Writer.getAnonymousDeclarationNumber(D)
: 0);
}
void ASTDeclWriter::VisitTypeDecl(TypeDecl *D) {
VisitNamedDecl(D);
Record.AddSourceLocation(D->getBeginLoc());
Record.AddTypeRef(QualType(D->getTypeForDecl(), 0));
}
void ASTDeclWriter::VisitTypedefNameDecl(TypedefNameDecl *D) {
VisitRedeclarable(D);
VisitTypeDecl(D);
Record.AddTypeSourceInfo(D->getTypeSourceInfo());
Record.push_back(D->isModed());
if (D->isModed())
Record.AddTypeRef(D->getUnderlyingType());
Record.AddDeclRef(D->getAnonDeclWithTypedefName(false));
}
void ASTDeclWriter::VisitTypedefDecl(TypedefDecl *D) {
VisitTypedefNameDecl(D);
if (D->getDeclContext() == D->getLexicalDeclContext() &&
!D->hasAttrs() &&
!D->isImplicit() &&
D->getFirstDecl() == D->getMostRecentDecl() &&
!D->isInvalidDecl() &&
!D->isTopLevelDeclInObjCContainer() &&
!D->isModulePrivate() &&
!needsAnonymousDeclarationNumber(D) &&
D->getDeclName().getNameKind() == DeclarationName::Identifier)
AbbrevToUse = Writer.getDeclTypedefAbbrev();
Code = serialization::DECL_TYPEDEF;
}
void ASTDeclWriter::VisitTypeAliasDecl(TypeAliasDecl *D) {
VisitTypedefNameDecl(D);
Record.AddDeclRef(D->getDescribedAliasTemplate());
Code = serialization::DECL_TYPEALIAS;
}
void ASTDeclWriter::VisitTagDecl(TagDecl *D) {
static_assert(DeclContext::NumTagDeclBits == 23,
"You need to update the serializer after you change the "
"TagDeclBits");
VisitRedeclarable(D);
VisitTypeDecl(D);
Record.push_back(D->getIdentifierNamespace());
BitsPacker TagDeclBits;
TagDeclBits.addBits(llvm::to_underlying(D->getTagKind()), /*BitWidth=*/3);
TagDeclBits.addBit(!isa<CXXRecordDecl>(D) ? D->isCompleteDefinition() : 0);
TagDeclBits.addBit(D->isEmbeddedInDeclarator());
TagDeclBits.addBit(D->isFreeStanding());
TagDeclBits.addBit(D->isCompleteDefinitionRequired());
TagDeclBits.addBits(
D->hasExtInfo() ? 1 : (D->getTypedefNameForAnonDecl() ? 2 : 0),
/*BitWidth=*/2);
Record.push_back(TagDeclBits);
Record.AddSourceRange(D->getBraceRange());
if (D->hasExtInfo()) {
Record.AddQualifierInfo(*D->getExtInfo());
} else if (auto *TD = D->getTypedefNameForAnonDecl()) {
Record.AddDeclRef(TD);
Record.AddIdentifierRef(TD->getDeclName().getAsIdentifierInfo());
}
}
void ASTDeclWriter::VisitEnumDecl(EnumDecl *D) {
static_assert(DeclContext::NumEnumDeclBits == 43,
"You need to update the serializer after you change the "
"EnumDeclBits");
VisitTagDecl(D);
Record.AddTypeSourceInfo(D->getIntegerTypeSourceInfo());
if (!D->getIntegerTypeSourceInfo())
Record.AddTypeRef(D->getIntegerType());
Record.AddTypeRef(D->getPromotionType());
BitsPacker EnumDeclBits;
EnumDeclBits.addBits(D->getNumPositiveBits(), /*BitWidth=*/8);
EnumDeclBits.addBits(D->getNumNegativeBits(), /*BitWidth=*/8);
EnumDeclBits.addBit(D->isScoped());
EnumDeclBits.addBit(D->isScopedUsingClassTag());
EnumDeclBits.addBit(D->isFixed());
Record.push_back(EnumDeclBits);
Record.push_back(D->getODRHash());
if (MemberSpecializationInfo *MemberInfo = D->getMemberSpecializationInfo()) {
Record.AddDeclRef(MemberInfo->getInstantiatedFrom());
Record.push_back(MemberInfo->getTemplateSpecializationKind());
Record.AddSourceLocation(MemberInfo->getPointOfInstantiation());
} else {
Record.AddDeclRef(nullptr);
}
if (D->getDeclContext() == D->getLexicalDeclContext() && !D->hasAttrs() &&
!D->isInvalidDecl() && !D->isImplicit() && !D->hasExtInfo() &&
!D->getTypedefNameForAnonDecl() &&
D->getFirstDecl() == D->getMostRecentDecl() &&
!D->isTopLevelDeclInObjCContainer() &&
!CXXRecordDecl::classofKind(D->getKind()) &&
!D->getIntegerTypeSourceInfo() && !D->getMemberSpecializationInfo() &&
!needsAnonymousDeclarationNumber(D) &&
D->getDeclName().getNameKind() == DeclarationName::Identifier)
AbbrevToUse = Writer.getDeclEnumAbbrev();
Code = serialization::DECL_ENUM;
}
void ASTDeclWriter::VisitRecordDecl(RecordDecl *D) {
static_assert(DeclContext::NumRecordDeclBits == 64,
"You need to update the serializer after you change the "
"RecordDeclBits");
VisitTagDecl(D);
BitsPacker RecordDeclBits;
RecordDeclBits.addBit(D->hasFlexibleArrayMember());
RecordDeclBits.addBit(D->isAnonymousStructOrUnion());
RecordDeclBits.addBit(D->hasObjectMember());
RecordDeclBits.addBit(D->hasVolatileMember());
RecordDeclBits.addBit(D->isNonTrivialToPrimitiveDefaultInitialize());
RecordDeclBits.addBit(D->isNonTrivialToPrimitiveCopy());
RecordDeclBits.addBit(D->isNonTrivialToPrimitiveDestroy());
RecordDeclBits.addBit(D->hasNonTrivialToPrimitiveDefaultInitializeCUnion());
RecordDeclBits.addBit(D->hasNonTrivialToPrimitiveDestructCUnion());
RecordDeclBits.addBit(D->hasNonTrivialToPrimitiveCopyCUnion());
RecordDeclBits.addBit(D->hasUninitializedExplicitInitFields());
RecordDeclBits.addBit(D->isParamDestroyedInCallee());
RecordDeclBits.addBits(llvm::to_underlying(D->getArgPassingRestrictions()), 2);
Record.push_back(RecordDeclBits);
// Only compute this for C/Objective-C, in C++ this is computed as part
// of CXXRecordDecl.
if (!isa<CXXRecordDecl>(D))
Record.push_back(D->getODRHash());
if (D->getDeclContext() == D->getLexicalDeclContext() && !D->hasAttrs() &&
!D->isImplicit() && !D->isInvalidDecl() && !D->hasExtInfo() &&
!D->getTypedefNameForAnonDecl() &&
D->getFirstDecl() == D->getMostRecentDecl() &&
!D->isTopLevelDeclInObjCContainer() &&
!CXXRecordDecl::classofKind(D->getKind()) &&
!needsAnonymousDeclarationNumber(D) &&
D->getDeclName().getNameKind() == DeclarationName::Identifier)
AbbrevToUse = Writer.getDeclRecordAbbrev();
Code = serialization::DECL_RECORD;
}
void ASTDeclWriter::VisitValueDecl(ValueDecl *D) {
VisitNamedDecl(D);
Record.AddTypeRef(D->getType());
}
void ASTDeclWriter::VisitEnumConstantDecl(EnumConstantDecl *D) {
VisitValueDecl(D);
Record.push_back(D->getInitExpr()? 1 : 0);
if (D->getInitExpr())
Record.AddStmt(D->getInitExpr());
Record.AddAPSInt(D->getInitVal());
Code = serialization::DECL_ENUM_CONSTANT;
}
void ASTDeclWriter::VisitDeclaratorDecl(DeclaratorDecl *D) {
VisitValueDecl(D);
Record.AddSourceLocation(D->getInnerLocStart());
Record.push_back(D->hasExtInfo());
if (D->hasExtInfo()) {
DeclaratorDecl::ExtInfo *Info = D->getExtInfo();
Record.AddQualifierInfo(*Info);
Record.AddStmt(Info->TrailingRequiresClause);
}
// The location information is deferred until the end of the record.
Record.AddTypeRef(D->getTypeSourceInfo() ? D->getTypeSourceInfo()->getType()
: QualType());
}
void ASTDeclWriter::VisitFunctionDecl(FunctionDecl *D) {
static_assert(DeclContext::NumFunctionDeclBits == 44,
"You need to update the serializer after you change the "
"FunctionDeclBits");
VisitRedeclarable(D);
Record.push_back(D->getTemplatedKind());
switch (D->getTemplatedKind()) {
case FunctionDecl::TK_NonTemplate:
break;
case FunctionDecl::TK_DependentNonTemplate:
Record.AddDeclRef(D->getInstantiatedFromDecl());
break;
case FunctionDecl::TK_FunctionTemplate:
Record.AddDeclRef(D->getDescribedFunctionTemplate());
break;
case FunctionDecl::TK_MemberSpecialization: {
MemberSpecializationInfo *MemberInfo = D->getMemberSpecializationInfo();
Record.AddDeclRef(MemberInfo->getInstantiatedFrom());
Record.push_back(MemberInfo->getTemplateSpecializationKind());
Record.AddSourceLocation(MemberInfo->getPointOfInstantiation());
break;
}
case FunctionDecl::TK_FunctionTemplateSpecialization: {
FunctionTemplateSpecializationInfo *
FTSInfo = D->getTemplateSpecializationInfo();
RegisterTemplateSpecialization(FTSInfo->getTemplate(), D);
Record.AddDeclRef(FTSInfo->getTemplate());
Record.push_back(FTSInfo->getTemplateSpecializationKind());
// Template arguments.
Record.AddTemplateArgumentList(FTSInfo->TemplateArguments);
// Template args as written.
Record.push_back(FTSInfo->TemplateArgumentsAsWritten != nullptr);
if (FTSInfo->TemplateArgumentsAsWritten)
Record.AddASTTemplateArgumentListInfo(
FTSInfo->TemplateArgumentsAsWritten);
Record.AddSourceLocation(FTSInfo->getPointOfInstantiation());
if (MemberSpecializationInfo *MemberInfo =
FTSInfo->getMemberSpecializationInfo()) {
Record.push_back(1);
Record.AddDeclRef(MemberInfo->getInstantiatedFrom());
Record.push_back(MemberInfo->getTemplateSpecializationKind());
Record.AddSourceLocation(MemberInfo->getPointOfInstantiation());
} else {
Record.push_back(0);
}
if (D->isCanonicalDecl()) {
// Write the template that contains the specializations set. We will
// add a FunctionTemplateSpecializationInfo to it when reading.
Record.AddDeclRef(FTSInfo->getTemplate()->getCanonicalDecl());
}
break;
}
case FunctionDecl::TK_DependentFunctionTemplateSpecialization: {
DependentFunctionTemplateSpecializationInfo *
DFTSInfo = D->getDependentSpecializationInfo();
// Candidates.
Record.push_back(DFTSInfo->getCandidates().size());
for (FunctionTemplateDecl *FTD : DFTSInfo->getCandidates())
Record.AddDeclRef(FTD);
// Templates args.
Record.push_back(DFTSInfo->TemplateArgumentsAsWritten != nullptr);
if (DFTSInfo->TemplateArgumentsAsWritten)
Record.AddASTTemplateArgumentListInfo(
DFTSInfo->TemplateArgumentsAsWritten);
break;
}
}
VisitDeclaratorDecl(D);
Record.AddDeclarationNameLoc(D->DNLoc, D->getDeclName());
Record.push_back(D->getIdentifierNamespace());
// The order matters here. It will be better to put the bit with higher
// probability to be 0 in the end of the bits. See the comments in VisitDecl
// for details.
BitsPacker FunctionDeclBits;
// FIXME: stable encoding
FunctionDeclBits.addBits(llvm::to_underlying(D->getLinkageInternal()), 3);
FunctionDeclBits.addBits((uint32_t)D->getStorageClass(), /*BitWidth=*/3);
FunctionDeclBits.addBit(D->isInlineSpecified());
FunctionDeclBits.addBit(D->isInlined());
FunctionDeclBits.addBit(D->hasSkippedBody());
FunctionDeclBits.addBit(D->isVirtualAsWritten());
FunctionDeclBits.addBit(D->isPureVirtual());
FunctionDeclBits.addBit(D->hasInheritedPrototype());
FunctionDeclBits.addBit(D->hasWrittenPrototype());
FunctionDeclBits.addBit(D->isDeletedBit());
FunctionDeclBits.addBit(D->isTrivial());
FunctionDeclBits.addBit(D->isTrivialForCall());
FunctionDeclBits.addBit(D->isDefaulted());
FunctionDeclBits.addBit(D->isExplicitlyDefaulted());
FunctionDeclBits.addBit(D->isIneligibleOrNotSelected());
FunctionDeclBits.addBits((uint64_t)(D->getConstexprKind()), /*BitWidth=*/2);
FunctionDeclBits.addBit(D->hasImplicitReturnZero());
FunctionDeclBits.addBit(D->isMultiVersion());
FunctionDeclBits.addBit(D->isLateTemplateParsed());
FunctionDeclBits.addBit(D->FriendConstraintRefersToEnclosingTemplate());
FunctionDeclBits.addBit(D->usesSEHTry());
Record.push_back(FunctionDeclBits);
Record.AddSourceLocation(D->getEndLoc());
if (D->isExplicitlyDefaulted())
Record.AddSourceLocation(D->getDefaultLoc());
Record.push_back(D->getODRHash());
if (D->isDefaulted() || D->isDeletedAsWritten()) {
if (auto *FDI = D->getDefalutedOrDeletedInfo()) {
// Store both that there is an DefaultedOrDeletedInfo and whether it
// contains a DeletedMessage.
StringLiteral *DeletedMessage = FDI->getDeletedMessage();
Record.push_back(1 | (DeletedMessage ? 2 : 0));
if (DeletedMessage)
Record.AddStmt(DeletedMessage);
Record.push_back(FDI->getUnqualifiedLookups().size());
for (DeclAccessPair P : FDI->getUnqualifiedLookups()) {
Record.AddDeclRef(P.getDecl());
Record.push_back(P.getAccess());
}
} else {
Record.push_back(0);
}
}
if (D->getFriendObjectKind()) {
// For a function defined inline within a class template, we have to force
// the canonical definition to be the one inside the canonical definition of
// the template. Remember this relation to deserialize them together.
if (auto *RD = dyn_cast<CXXRecordDecl>(D->getLexicalParent()))
if (RD->isDependentContext() && RD->isThisDeclarationADefinition()) {
Writer.RelatedDeclsMap[Writer.GetDeclRef(RD)].push_back(
Writer.GetDeclRef(D));
}
}
Record.push_back(D->param_size());
for (auto *P : D->parameters())
Record.AddDeclRef(P);
Code = serialization::DECL_FUNCTION;
}
static void addExplicitSpecifier(ExplicitSpecifier ES,
ASTRecordWriter &Record) {
uint64_t Kind = static_cast<uint64_t>(ES.getKind());
Kind = Kind << 1 | static_cast<bool>(ES.getExpr());
Record.push_back(Kind);
if (ES.getExpr()) {
Record.AddStmt(ES.getExpr());
}
}
void ASTDeclWriter::VisitCXXDeductionGuideDecl(CXXDeductionGuideDecl *D) {
addExplicitSpecifier(D->getExplicitSpecifier(), Record);
Record.AddDeclRef(D->Ctor);
VisitFunctionDecl(D);
Record.push_back(static_cast<unsigned char>(D->getDeductionCandidateKind()));
Code = serialization::DECL_CXX_DEDUCTION_GUIDE;
}
void ASTDeclWriter::VisitObjCMethodDecl(ObjCMethodDecl *D) {
static_assert(DeclContext::NumObjCMethodDeclBits == 37,
"You need to update the serializer after you change the "
"ObjCMethodDeclBits");
VisitNamedDecl(D);
// FIXME: convert to LazyStmtPtr?
// Unlike C/C++, method bodies will never be in header files.
bool HasBodyStuff = D->getBody() != nullptr;
Record.push_back(HasBodyStuff);
if (HasBodyStuff) {
Record.AddStmt(D->getBody());
}
Record.AddDeclRef(D->getSelfDecl());
Record.AddDeclRef(D->getCmdDecl());
Record.push_back(D->isInstanceMethod());
Record.push_back(D->isVariadic());
Record.push_back(D->isPropertyAccessor());
Record.push_back(D->isSynthesizedAccessorStub());
Record.push_back(D->isDefined());
Record.push_back(D->isOverriding());
Record.push_back(D->hasSkippedBody());
Record.push_back(D->isRedeclaration());
Record.push_back(D->hasRedeclaration());
if (D->hasRedeclaration()) {
assert(Record.getASTContext().getObjCMethodRedeclaration(D));
Record.AddDeclRef(Record.getASTContext().getObjCMethodRedeclaration(D));
}
// FIXME: stable encoding for @required/@optional
Record.push_back(llvm::to_underlying(D->getImplementationControl()));
// FIXME: stable encoding for in/out/inout/bycopy/byref/oneway/nullability
Record.push_back(D->getObjCDeclQualifier());
Record.push_back(D->hasRelatedResultType());
Record.AddTypeRef(D->getReturnType());
Record.AddTypeSourceInfo(D->getReturnTypeSourceInfo());
Record.AddSourceLocation(D->getEndLoc());
Record.push_back(D->param_size());
for (const auto *P : D->parameters())
Record.AddDeclRef(P);
Record.push_back(D->getSelLocsKind());
unsigned NumStoredSelLocs = D->getNumStoredSelLocs();
SourceLocation *SelLocs = D->getStoredSelLocs();
Record.push_back(NumStoredSelLocs);
for (unsigned i = 0; i != NumStoredSelLocs; ++i)
Record.AddSourceLocation(SelLocs[i]);
Code = serialization::DECL_OBJC_METHOD;
}
void ASTDeclWriter::VisitObjCTypeParamDecl(ObjCTypeParamDecl *D) {
VisitTypedefNameDecl(D);
Record.push_back(D->Variance);
Substitute type arguments into uses of Objective-C interface members. When messaging a method that was defined in an Objective-C class (or category or extension thereof) that has type parameters, substitute the type arguments for those type parameters. Similarly, substitute into property accesses, instance variables, and other references. This includes general infrastructure for substituting the type arguments associated with an ObjCObject(Pointer)Type into a type referenced within a particular context, handling all of the substitutions required to deal with (e.g.) inheritance involving parameterized classes. In cases where no type arguments are available (e.g., because we're messaging via some unspecialized type, id, etc.), we substitute in the type bounds for the type parameters instead. Example: @interface NSSet<T : id<NSCopying>> : NSObject <NSCopying> - (T)firstObject; @end void f(NSSet<NSString *> *stringSet, NSSet *anySet) { [stringSet firstObject]; // produces NSString* [anySet firstObject]; // produces id<NSCopying> (the bound) } When substituting for the type parameters given an unspecialized context (i.e., no specific type arguments were given), substituting the type bounds unconditionally produces type signatures that are too strong compared to the pre-generics signatures. Instead, use the following rule: - In covariant positions, such as method return types, replace type parameters with “id” or “Class” (the latter only when the type parameter bound is “Class” or qualified class, e.g, “Class<NSCopying>”) - In other positions (e.g., parameter types), replace type parameters with their type bounds. - When a specialized Objective-C object or object pointer type contains a type parameter in its type arguments (e.g., NSArray<T>*, but not NSArray<NSString *> *), replace the entire object/object pointer type with its unspecialized version (e.g., NSArray *). llvm-svn: 241543
2015-07-07 03:57:53 +00:00
Record.push_back(D->Index);
Record.AddSourceLocation(D->VarianceLoc);
Record.AddSourceLocation(D->ColonLoc);
Code = serialization::DECL_OBJC_TYPE_PARAM;
}
void ASTDeclWriter::VisitObjCContainerDecl(ObjCContainerDecl *D) {
static_assert(DeclContext::NumObjCContainerDeclBits == 64,
"You need to update the serializer after you change the "
"ObjCContainerDeclBits");
VisitNamedDecl(D);
Record.AddSourceLocation(D->getAtStartLoc());
Record.AddSourceRange(D->getAtEndRange());
// Abstract class (no need to define a stable serialization::DECL code).
}
void ASTDeclWriter::VisitObjCInterfaceDecl(ObjCInterfaceDecl *D) {
VisitRedeclarable(D);
VisitObjCContainerDecl(D);
Record.AddTypeRef(QualType(D->getTypeForDecl(), 0));
AddObjCTypeParamList(D->TypeParamList);
Record.push_back(D->isThisDeclarationADefinition());
if (D->isThisDeclarationADefinition()) {
// Write the DefinitionData
ObjCInterfaceDecl::DefinitionData &Data = D->data();
Record.AddTypeSourceInfo(D->getSuperClassTInfo());
Record.AddSourceLocation(D->getEndOfDefinitionLoc());
Record.push_back(Data.HasDesignatedInitializers);
Record.push_back(D->getODRHash());
// Write out the protocols that are directly referenced by the @interface.
Record.push_back(Data.ReferencedProtocols.size());
for (const auto *P : D->protocols())
Record.AddDeclRef(P);
for (const auto &PL : D->protocol_locs())
Record.AddSourceLocation(PL);
// Write out the protocols that are transitively referenced.
Record.push_back(Data.AllReferencedProtocols.size());
for (ObjCList<ObjCProtocolDecl>::iterator
P = Data.AllReferencedProtocols.begin(),
PEnd = Data.AllReferencedProtocols.end();
P != PEnd; ++P)
Record.AddDeclRef(*P);
if (ObjCCategoryDecl *Cat = D->getCategoryListRaw()) {
// Ensure that we write out the set of categories for this class.
Writer.ObjCClassesWithCategories.insert(D);
// Make sure that the categories get serialized.
for (; Cat; Cat = Cat->getNextClassCategoryRaw())
(void)Writer.GetDeclRef(Cat);
}
}
Code = serialization::DECL_OBJC_INTERFACE;
}
void ASTDeclWriter::VisitObjCIvarDecl(ObjCIvarDecl *D) {
VisitFieldDecl(D);
// FIXME: stable encoding for @public/@private/@protected/@package
Record.push_back(D->getAccessControl());
Record.push_back(D->getSynthesize());
if (D->getDeclContext() == D->getLexicalDeclContext() &&
!D->hasAttrs() &&
!D->isImplicit() &&
!D->isUsed(false) &&
!D->isInvalidDecl() &&
!D->isReferenced() &&
!D->isModulePrivate() &&
!D->getBitWidth() &&
!D->hasExtInfo() &&
D->getDeclName())
AbbrevToUse = Writer.getDeclObjCIvarAbbrev();
Code = serialization::DECL_OBJC_IVAR;
}
void ASTDeclWriter::VisitObjCProtocolDecl(ObjCProtocolDecl *D) {
VisitRedeclarable(D);
VisitObjCContainerDecl(D);
Record.push_back(D->isThisDeclarationADefinition());
if (D->isThisDeclarationADefinition()) {
Record.push_back(D->protocol_size());
for (const auto *I : D->protocols())
Record.AddDeclRef(I);
for (const auto &PL : D->protocol_locs())
Record.AddSourceLocation(PL);
Record.push_back(D->getODRHash());
}
Code = serialization::DECL_OBJC_PROTOCOL;
}
void ASTDeclWriter::VisitObjCAtDefsFieldDecl(ObjCAtDefsFieldDecl *D) {
VisitFieldDecl(D);
Code = serialization::DECL_OBJC_AT_DEFS_FIELD;
}
void ASTDeclWriter::VisitObjCCategoryDecl(ObjCCategoryDecl *D) {
VisitObjCContainerDecl(D);
Record.AddSourceLocation(D->getCategoryNameLoc());
Record.AddSourceLocation(D->getIvarLBraceLoc());
Record.AddSourceLocation(D->getIvarRBraceLoc());
Record.AddDeclRef(D->getClassInterface());
AddObjCTypeParamList(D->TypeParamList);
Record.push_back(D->protocol_size());
for (const auto *I : D->protocols())
Record.AddDeclRef(I);
for (const auto &PL : D->protocol_locs())
Record.AddSourceLocation(PL);
Code = serialization::DECL_OBJC_CATEGORY;
}
void ASTDeclWriter::VisitObjCCompatibleAliasDecl(ObjCCompatibleAliasDecl *D) {
VisitNamedDecl(D);
Record.AddDeclRef(D->getClassInterface());
Code = serialization::DECL_OBJC_COMPATIBLE_ALIAS;
}
void ASTDeclWriter::VisitObjCPropertyDecl(ObjCPropertyDecl *D) {
VisitNamedDecl(D);
Record.AddSourceLocation(D->getAtLoc());
Record.AddSourceLocation(D->getLParenLoc());
Record.AddTypeRef(D->getType());
Record.AddTypeSourceInfo(D->getTypeSourceInfo());
// FIXME: stable encoding
Record.push_back((unsigned)D->getPropertyAttributes());
Record.push_back((unsigned)D->getPropertyAttributesAsWritten());
// FIXME: stable encoding
Record.push_back((unsigned)D->getPropertyImplementation());
Record.AddDeclarationName(D->getGetterName());
Record.AddSourceLocation(D->getGetterNameLoc());
Record.AddDeclarationName(D->getSetterName());
Record.AddSourceLocation(D->getSetterNameLoc());
Record.AddDeclRef(D->getGetterMethodDecl());
Record.AddDeclRef(D->getSetterMethodDecl());
Record.AddDeclRef(D->getPropertyIvarDecl());
Code = serialization::DECL_OBJC_PROPERTY;
}
void ASTDeclWriter::VisitObjCImplDecl(ObjCImplDecl *D) {
VisitObjCContainerDecl(D);
Record.AddDeclRef(D->getClassInterface());
// Abstract class (no need to define a stable serialization::DECL code).
}
void ASTDeclWriter::VisitObjCCategoryImplDecl(ObjCCategoryImplDecl *D) {
VisitObjCImplDecl(D);
Record.AddSourceLocation(D->getCategoryNameLoc());
Code = serialization::DECL_OBJC_CATEGORY_IMPL;
}
void ASTDeclWriter::VisitObjCImplementationDecl(ObjCImplementationDecl *D) {
VisitObjCImplDecl(D);
Record.AddDeclRef(D->getSuperClass());
Record.AddSourceLocation(D->getSuperClassLoc());
Record.AddSourceLocation(D->getIvarLBraceLoc());
Record.AddSourceLocation(D->getIvarRBraceLoc());
Record.push_back(D->hasNonZeroConstructors());
Record.push_back(D->hasDestructors());
Record.push_back(D->NumIvarInitializers);
if (D->NumIvarInitializers)
Record.AddCXXCtorInitializers(
llvm::ArrayRef(D->init_begin(), D->init_end()));
Code = serialization::DECL_OBJC_IMPLEMENTATION;
}
void ASTDeclWriter::VisitObjCPropertyImplDecl(ObjCPropertyImplDecl *D) {
VisitDecl(D);
Record.AddSourceLocation(D->getBeginLoc());
Record.AddDeclRef(D->getPropertyDecl());
Record.AddDeclRef(D->getPropertyIvarDecl());
Record.AddSourceLocation(D->getPropertyIvarDeclLoc());
Record.AddDeclRef(D->getGetterMethodDecl());
Record.AddDeclRef(D->getSetterMethodDecl());
Record.AddStmt(D->getGetterCXXConstructor());
Record.AddStmt(D->getSetterCXXAssignment());
Code = serialization::DECL_OBJC_PROPERTY_IMPL;
}
void ASTDeclWriter::VisitFieldDecl(FieldDecl *D) {
VisitDeclaratorDecl(D);
Record.push_back(D->isMutable());
Record.push_back((D->StorageKind << 1) | D->BitField);
if (D->StorageKind == FieldDecl::ISK_CapturedVLAType)
Record.AddTypeRef(QualType(D->getCapturedVLAType(), 0));
else if (D->BitField)
Record.AddStmt(D->getBitWidth());
if (!D->getDeclName() || D->isPlaceholderVar(Writer.getLangOpts()))
Record.AddDeclRef(
Record.getASTContext().getInstantiatedFromUnnamedFieldDecl(D));
if (D->getDeclContext() == D->getLexicalDeclContext() &&
!D->hasAttrs() &&
!D->isImplicit() &&
!D->isUsed(false) &&
!D->isInvalidDecl() &&
!D->isReferenced() &&
!D->isTopLevelDeclInObjCContainer() &&
!D->isModulePrivate() &&
!D->getBitWidth() &&
!D->hasInClassInitializer() &&
!D->hasCapturedVLAType() &&
!D->hasExtInfo() &&
!ObjCIvarDecl::classofKind(D->getKind()) &&
!ObjCAtDefsFieldDecl::classofKind(D->getKind()) &&
D->getDeclName())
AbbrevToUse = Writer.getDeclFieldAbbrev();
Code = serialization::DECL_FIELD;
}
void ASTDeclWriter::VisitMSPropertyDecl(MSPropertyDecl *D) {
VisitDeclaratorDecl(D);
Record.AddIdentifierRef(D->getGetterId());
Record.AddIdentifierRef(D->getSetterId());
Code = serialization::DECL_MS_PROPERTY;
}
Rework how UuidAttr, CXXUuidofExpr, and GUID template arguments and constants are represented. Summary: Previously, we treated CXXUuidofExpr as quite a special case: it was the only kind of expression that could be a canonical template argument, it could be a constant lvalue base object, and so on. In addition, we represented the UUID value as a string, whose source form we did not preserve faithfully, and that we partially parsed in multiple different places. With this patch, we create an MSGuidDecl object to represent the implicit object of type 'struct _GUID' created by a UuidAttr. Each UuidAttr holds a pointer to its 'struct _GUID' and its original (as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue representation of the GUID on the MSGuidDecl and use it from constant evaluation where needed. This allows removing a lot of the special-case logic to handle these expressions. Unfortunately, many parts of Clang assume there are only a couple of interesting kinds of ValueDecl, so the total amount of special-case logic is not really reduced very much. This fixes a few bugs and issues: * PR38490: we now support reading from GUID objects returned from __uuidof during constant evaluation. * Our Itanium mangling for a non-instantiation-dependent template argument involving __uuidof no longer depends on which CXXUuidofExpr template argument we happened to see first. * We now predeclare ::_GUID, and permit use of __uuidof without any header inclusion, better matching MSVC's behavior. We do not predefine ::__s_GUID, though; that seems like a step too far. * Our IR representation for GUID constants now uses the correct IR type wherever possible. We will still fall back to using the {i32, i16, i16, [8 x i8]} layout if a definition of struct _GUID is not available. This is not ideal: in principle the two layouts could have different padding. Reviewers: rnk, jdoerfert Subscribers: arphaman, cfe-commits, aeubanks Tags: #clang Differential Revision: https://reviews.llvm.org/D78171
2020-04-11 22:15:29 -07:00
void ASTDeclWriter::VisitMSGuidDecl(MSGuidDecl *D) {
VisitValueDecl(D);
MSGuidDecl::Parts Parts = D->getParts();
Record.push_back(Parts.Part1);
Record.push_back(Parts.Part2);
Record.push_back(Parts.Part3);
Record.append(std::begin(Parts.Part4And5), std::end(Parts.Part4And5));
Rework how UuidAttr, CXXUuidofExpr, and GUID template arguments and constants are represented. Summary: Previously, we treated CXXUuidofExpr as quite a special case: it was the only kind of expression that could be a canonical template argument, it could be a constant lvalue base object, and so on. In addition, we represented the UUID value as a string, whose source form we did not preserve faithfully, and that we partially parsed in multiple different places. With this patch, we create an MSGuidDecl object to represent the implicit object of type 'struct _GUID' created by a UuidAttr. Each UuidAttr holds a pointer to its 'struct _GUID' and its original (as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue representation of the GUID on the MSGuidDecl and use it from constant evaluation where needed. This allows removing a lot of the special-case logic to handle these expressions. Unfortunately, many parts of Clang assume there are only a couple of interesting kinds of ValueDecl, so the total amount of special-case logic is not really reduced very much. This fixes a few bugs and issues: * PR38490: we now support reading from GUID objects returned from __uuidof during constant evaluation. * Our Itanium mangling for a non-instantiation-dependent template argument involving __uuidof no longer depends on which CXXUuidofExpr template argument we happened to see first. * We now predeclare ::_GUID, and permit use of __uuidof without any header inclusion, better matching MSVC's behavior. We do not predefine ::__s_GUID, though; that seems like a step too far. * Our IR representation for GUID constants now uses the correct IR type wherever possible. We will still fall back to using the {i32, i16, i16, [8 x i8]} layout if a definition of struct _GUID is not available. This is not ideal: in principle the two layouts could have different padding. Reviewers: rnk, jdoerfert Subscribers: arphaman, cfe-commits, aeubanks Tags: #clang Differential Revision: https://reviews.llvm.org/D78171
2020-04-11 22:15:29 -07:00
Code = serialization::DECL_MS_GUID;
}
void ASTDeclWriter::VisitUnnamedGlobalConstantDecl(
UnnamedGlobalConstantDecl *D) {
VisitValueDecl(D);
Record.AddAPValue(D->getValue());
Code = serialization::DECL_UNNAMED_GLOBAL_CONSTANT;
}
void ASTDeclWriter::VisitTemplateParamObjectDecl(TemplateParamObjectDecl *D) {
VisitValueDecl(D);
Record.AddAPValue(D->getValue());
Code = serialization::DECL_TEMPLATE_PARAM_OBJECT;
}
void ASTDeclWriter::VisitIndirectFieldDecl(IndirectFieldDecl *D) {
VisitValueDecl(D);
Record.push_back(D->getChainingSize());
for (const auto *P : D->chain())
Record.AddDeclRef(P);
Code = serialization::DECL_INDIRECTFIELD;
}
void ASTDeclWriter::VisitVarDecl(VarDecl *D) {
VisitRedeclarable(D);
VisitDeclaratorDecl(D);
// The order matters here. It will be better to put the bit with higher
// probability to be 0 in the end of the bits. See the comments in VisitDecl
// for details.
BitsPacker VarDeclBits;
VarDeclBits.addBits(llvm::to_underlying(D->getLinkageInternal()),
/*BitWidth=*/3);
bool ModulesCodegen = false;
if (Writer.WritingModule && D->getStorageDuration() == SD_Static &&
!D->getDescribedVarTemplate()) {
// When building a C++20 module interface unit or a partition unit, a
// strong definition in the module interface is provided by the
// compilation of that unit, not by its users. (Inline variables are still
// emitted in module users.)
ModulesCodegen = (Writer.WritingModule->isInterfaceOrPartition() ||
(D->hasAttr<DLLExportAttr>() &&
Writer.getLangOpts().BuildingPCHWithObjectFile)) &&
Record.getASTContext().GetGVALinkageForVariable(D) >=
GVA_StrongExternal;
}
VarDeclBits.addBit(ModulesCodegen);
VarDeclBits.addBits(D->getStorageClass(), /*BitWidth=*/3);
VarDeclBits.addBits(D->getTSCSpec(), /*BitWidth=*/2);
VarDeclBits.addBits(D->getInitStyle(), /*BitWidth=*/2);
VarDeclBits.addBit(D->isARCPseudoStrong());
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
bool HasDeducedType = false;
if (!isa<ParmVarDecl>(D)) {
VarDeclBits.addBit(D->isThisDeclarationADemotedDefinition());
VarDeclBits.addBit(D->isExceptionVariable());
VarDeclBits.addBit(D->isNRVOVariable());
VarDeclBits.addBit(D->isCXXForRangeDecl());
VarDeclBits.addBit(D->isInline());
VarDeclBits.addBit(D->isInlineSpecified());
VarDeclBits.addBit(D->isConstexpr());
VarDeclBits.addBit(D->isInitCapture());
VarDeclBits.addBit(D->isPreviousDeclInSameBlockScope());
VarDeclBits.addBit(D->isEscapingByref());
HasDeducedType = D->getType()->getContainedDeducedType();
VarDeclBits.addBit(HasDeducedType);
if (const auto *IPD = dyn_cast<ImplicitParamDecl>(D))
VarDeclBits.addBits(llvm::to_underlying(IPD->getParameterKind()),
/*Width=*/3);
else
VarDeclBits.addBits(0, /*Width=*/3);
VarDeclBits.addBit(D->isObjCForDecl());
}
Record.push_back(VarDeclBits);
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
if (ModulesCodegen)
Writer.AddDeclRef(D, Writer.ModularCodegenDecls);
if (D->hasAttr<BlocksAttr>()) {
BlockVarCopyInit Init = Record.getASTContext().getBlockVarCopyInit(D);
Record.AddStmt(Init.getCopyExpr());
if (Init.getCopyExpr())
Record.push_back(Init.canThrow());
}
enum {
VarNotTemplate = 0, VarTemplate, StaticDataMemberSpecialization
};
if (VarTemplateDecl *TemplD = D->getDescribedVarTemplate()) {
Record.push_back(VarTemplate);
Record.AddDeclRef(TemplD);
} else if (MemberSpecializationInfo *SpecInfo
= D->getMemberSpecializationInfo()) {
Record.push_back(StaticDataMemberSpecialization);
Record.AddDeclRef(SpecInfo->getInstantiatedFrom());
Record.push_back(SpecInfo->getTemplateSpecializationKind());
Record.AddSourceLocation(SpecInfo->getPointOfInstantiation());
} else {
Record.push_back(VarNotTemplate);
}
if (D->getDeclContext() == D->getLexicalDeclContext() && !D->hasAttrs() &&
!D->isTopLevelDeclInObjCContainer() &&
!needsAnonymousDeclarationNumber(D) &&
D->getDeclName().getNameKind() == DeclarationName::Identifier &&
!D->hasExtInfo() && D->getFirstDecl() == D->getMostRecentDecl() &&
D->getKind() == Decl::Var && !D->isInline() && !D->isConstexpr() &&
!D->isInitCapture() && !D->isPreviousDeclInSameBlockScope() &&
!D->isEscapingByref() && !HasDeducedType &&
D->getStorageDuration() != SD_Static && !D->getDescribedVarTemplate() &&
!D->getMemberSpecializationInfo() && !D->isObjCForDecl() &&
!isa<ImplicitParamDecl>(D) && !D->isEscapingByref())
AbbrevToUse = Writer.getDeclVarAbbrev();
Code = serialization::DECL_VAR;
}
void ASTDeclWriter::VisitImplicitParamDecl(ImplicitParamDecl *D) {
VisitVarDecl(D);
Code = serialization::DECL_IMPLICIT_PARAM;
}
void ASTDeclWriter::VisitParmVarDecl(ParmVarDecl *D) {
VisitVarDecl(D);
// See the implementation of `ParmVarDecl::getParameterIndex()`, which may
// exceed the size of the normal bitfield. So it may be better to not pack
// these bits.
Record.push_back(D->getFunctionScopeIndex());
BitsPacker ParmVarDeclBits;
ParmVarDeclBits.addBit(D->isObjCMethodParameter());
ParmVarDeclBits.addBits(D->getFunctionScopeDepth(), /*BitsWidth=*/7);
// FIXME: stable encoding
ParmVarDeclBits.addBits(D->getObjCDeclQualifier(), /*BitsWidth=*/7);
ParmVarDeclBits.addBit(D->isKNRPromoted());
ParmVarDeclBits.addBit(D->hasInheritedDefaultArg());
ParmVarDeclBits.addBit(D->hasUninstantiatedDefaultArg());
ParmVarDeclBits.addBit(D->getExplicitObjectParamThisLoc().isValid());
Record.push_back(ParmVarDeclBits);
if (D->hasUninstantiatedDefaultArg())
Record.AddStmt(D->getUninstantiatedDefaultArg());
if (D->getExplicitObjectParamThisLoc().isValid())
Record.AddSourceLocation(D->getExplicitObjectParamThisLoc());
Code = serialization::DECL_PARM_VAR;
// If the assumptions about the DECL_PARM_VAR abbrev are true, use it. Here
// we dynamically check for the properties that we optimize for, but don't
// know are true of all PARM_VAR_DECLs.
if (D->getDeclContext() == D->getLexicalDeclContext() && !D->hasAttrs() &&
!D->hasExtInfo() && D->getStorageClass() == 0 && !D->isInvalidDecl() &&
!D->isTopLevelDeclInObjCContainer() &&
Represent C++ direct initializers as ParenListExprs before semantic analysis instead of having a special-purpose function. - ActOnCXXDirectInitializer, which was mostly duplication of AddInitializerToDecl (leading e.g. to PR10620, which Eli fixed a few days ago), is dropped completely. - MultiInitializer, which was an ugly hack I added, is dropped again. - We now have the infrastructure in place to distinguish between int x = {1}; int x({1}); int x{1}; -- VarDecl now has getInitStyle(), which indicates which of the above was used. -- CXXConstructExpr now has a flag to indicate that it represents list- initialization, although this is not yet used. - InstantiateInitializer was renamed to SubstInitializer and simplified. - ActOnParenOrParenListExpr has been replaced by ActOnParenListExpr, which always produces a ParenListExpr. Placed that so far failed to convert that back to a ParenExpr containing comma operators have been fixed. I'm pretty sure I could have made a crashing test case before this. The end result is a (I hope) considerably cleaner design of initializers. More importantly, the fact that I can now distinguish between the various initialization kinds means that I can get the tricky generalized initializer test cases Johannes Schaub supplied to work. (This is not yet done.) This commit passed self-host, with the resulting compiler passing the tests. I hope it doesn't break more complicated code. It's a pretty big change, but one that I feel is necessary. llvm-svn: 150318
2012-02-11 23:51:47 +00:00
D->getInitStyle() == VarDecl::CInit && // Can params have anything else?
D->getInit() == nullptr) // No default expr.
AbbrevToUse = Writer.getDeclParmVarAbbrev();
// Check things we know are true of *every* PARM_VAR_DECL, which is more than
// just us assuming it.
assert(!D->getTSCSpec() && "PARM_VAR_DECL can't use TLS");
assert(!D->isThisDeclarationADemotedDefinition()
&& "PARM_VAR_DECL can't be demoted definition.");
assert(D->getAccess() == AS_none && "PARM_VAR_DECL can't be public/private");
assert(!D->isExceptionVariable() && "PARM_VAR_DECL can't be exception var");
assert(D->getPreviousDecl() == nullptr && "PARM_VAR_DECL can't be redecl");
assert(!D->isStaticDataMember() &&
"PARM_VAR_DECL can't be static data member");
}
void ASTDeclWriter::VisitDecompositionDecl(DecompositionDecl *D) {
// Record the number of bindings first to simplify deserialization.
Record.push_back(D->bindings().size());
VisitVarDecl(D);
for (auto *B : D->bindings())
Record.AddDeclRef(B);
Code = serialization::DECL_DECOMPOSITION;
}
void ASTDeclWriter::VisitBindingDecl(BindingDecl *D) {
VisitValueDecl(D);
Record.AddStmt(D->getBinding());
Code = serialization::DECL_BINDING;
}
void ASTDeclWriter::VisitFileScopeAsmDecl(FileScopeAsmDecl *D) {
VisitDecl(D);
Record.AddStmt(D->getAsmString());
Record.AddSourceLocation(D->getRParenLoc());
Code = serialization::DECL_FILE_SCOPE_ASM;
}
[clang-repl] Support statements on global scope in incremental mode. This patch teaches clang to parse statements on the global scope to allow: ``` ./bin/clang-repl clang-repl> int i = 12; clang-repl> ++i; clang-repl> extern "C" int printf(const char*,...); clang-repl> printf("%d\n", i); 13 clang-repl> %quit ``` Generally, disambiguating between statements and declarations is a non-trivial task for a C++ parser. The challenge is to allow both standard C++ to be translated as if this patch does not exist and in the cases where the user typed a statement to be executed as if it were in a function body. Clang's Parser does pretty well in disambiguating between declarations and expressions. We have added DisambiguatingWithExpression flag which allows us to preserve the existing and optimized behavior where needed and implement the extra rules for disambiguating. Only few cases require additional attention: * Constructors/destructors -- Parser::isConstructorDeclarator was used in to disambiguate between ctor-looking declarations and statements on the global scope(eg. `Ns::f()`). * The template keyword -- the template keyword can appear in both declarations and statements. This patch considers the template keyword to be a declaration starter which breaks a few cases in incremental mode which will be tackled later. * The inline (and similar) keyword -- looking at the first token in many cases allows us to classify what is a declaration. * Other language keywords and specifiers -- ObjC/ObjC++/OpenCL/OpenMP rely on pragmas or special tokens which will be handled in subsequent patches. The patch conceptually models a "top-level" statement into a TopLevelStmtDecl. The TopLevelStmtDecl is lowered into a void function with no arguments. We attach this function to the global initializer list to execute the statement blocks in the correct order. Differential revision: https://reviews.llvm.org/D127284
2022-06-08 09:59:40 +00:00
void ASTDeclWriter::VisitTopLevelStmtDecl(TopLevelStmtDecl *D) {
VisitDecl(D);
Record.AddStmt(D->getStmt());
Code = serialization::DECL_TOP_LEVEL_STMT_DECL;
}
void ASTDeclWriter::VisitEmptyDecl(EmptyDecl *D) {
VisitDecl(D);
Code = serialization::DECL_EMPTY;
}
void ASTDeclWriter::VisitLifetimeExtendedTemporaryDecl(
LifetimeExtendedTemporaryDecl *D) {
VisitDecl(D);
Record.AddDeclRef(D->getExtendingDecl());
Record.AddStmt(D->getTemporaryExpr());
Record.push_back(static_cast<bool>(D->getValue()));
if (D->getValue())
Record.AddAPValue(*D->getValue());
Record.push_back(D->getManglingNumber());
Code = serialization::DECL_LIFETIME_EXTENDED_TEMPORARY;
}
void ASTDeclWriter::VisitBlockDecl(BlockDecl *D) {
VisitDecl(D);
Record.AddStmt(D->getBody());
Record.AddTypeSourceInfo(D->getSignatureAsWritten());
Record.push_back(D->param_size());
for (ParmVarDecl *P : D->parameters())
Record.AddDeclRef(P);
Record.push_back(D->isVariadic());
Record.push_back(D->blockMissingReturnType());
Record.push_back(D->isConversionFromLambda());
Record.push_back(D->doesNotEscape());
Record.push_back(D->canAvoidCopyToHeap());
Record.push_back(D->capturesCXXThis());
Record.push_back(D->getNumCaptures());
for (const auto &capture : D->captures()) {
Record.AddDeclRef(capture.getVariable());
unsigned flags = 0;
if (capture.isByRef()) flags |= 1;
if (capture.isNested()) flags |= 2;
if (capture.hasCopyExpr()) flags |= 4;
Record.push_back(flags);
if (capture.hasCopyExpr()) Record.AddStmt(capture.getCopyExpr());
}
Code = serialization::DECL_BLOCK;
}
void ASTDeclWriter::VisitCapturedDecl(CapturedDecl *CD) {
Record.push_back(CD->getNumParams());
VisitDecl(CD);
Record.push_back(CD->getContextParamPosition());
Record.push_back(CD->isNothrow() ? 1 : 0);
// Body is stored by VisitCapturedStmt.
for (unsigned I = 0; I < CD->getNumParams(); ++I)
Record.AddDeclRef(CD->getParam(I));
Code = serialization::DECL_CAPTURED;
}
void ASTDeclWriter::VisitLinkageSpecDecl(LinkageSpecDecl *D) {
static_assert(DeclContext::NumLinkageSpecDeclBits == 17,
"You need to update the serializer after you change the"
"LinkageSpecDeclBits");
VisitDecl(D);
Record.push_back(llvm::to_underlying(D->getLanguage()));
Record.AddSourceLocation(D->getExternLoc());
Record.AddSourceLocation(D->getRBraceLoc());
Code = serialization::DECL_LINKAGE_SPEC;
}
void ASTDeclWriter::VisitExportDecl(ExportDecl *D) {
VisitDecl(D);
Record.AddSourceLocation(D->getRBraceLoc());
Code = serialization::DECL_EXPORT;
}
void ASTDeclWriter::VisitLabelDecl(LabelDecl *D) {
VisitNamedDecl(D);
Record.AddSourceLocation(D->getBeginLoc());
Code = serialization::DECL_LABEL;
}
void ASTDeclWriter::VisitNamespaceDecl(NamespaceDecl *D) {
VisitRedeclarable(D);
VisitNamedDecl(D);
BitsPacker NamespaceDeclBits;
NamespaceDeclBits.addBit(D->isInline());
NamespaceDeclBits.addBit(D->isNested());
Record.push_back(NamespaceDeclBits);
Record.AddSourceLocation(D->getBeginLoc());
Record.AddSourceLocation(D->getRBraceLoc());
if (D->isFirstDecl())
Record.AddDeclRef(D->getAnonymousNamespace());
Code = serialization::DECL_NAMESPACE;
if (Writer.hasChain() && D->isAnonymousNamespace() &&
D == D->getMostRecentDecl()) {
// This is a most recent reopening of the anonymous namespace. If its parent
// is in a previous PCH (or is the TU), mark that parent for update, because
// the original namespace always points to the latest re-opening of its
// anonymous namespace.
Decl *Parent = cast<Decl>(
D->getParent()->getRedeclContext()->getPrimaryContext());
if (Parent->isFromASTFile() || isa<TranslationUnitDecl>(Parent)) {
Writer.DeclUpdates[Parent].push_back(
ASTWriter::DeclUpdate(UPD_CXX_ADDED_ANONYMOUS_NAMESPACE, D));
}
}
}
void ASTDeclWriter::VisitNamespaceAliasDecl(NamespaceAliasDecl *D) {
VisitRedeclarable(D);
VisitNamedDecl(D);
Record.AddSourceLocation(D->getNamespaceLoc());
Record.AddSourceLocation(D->getTargetNameLoc());
Record.AddNestedNameSpecifierLoc(D->getQualifierLoc());
Record.AddDeclRef(D->getNamespace());
Code = serialization::DECL_NAMESPACE_ALIAS;
}
void ASTDeclWriter::VisitUsingDecl(UsingDecl *D) {
VisitNamedDecl(D);
Record.AddSourceLocation(D->getUsingLoc());
Record.AddNestedNameSpecifierLoc(D->getQualifierLoc());
Record.AddDeclarationNameLoc(D->DNLoc, D->getDeclName());
Record.AddDeclRef(D->FirstUsingShadow.getPointer());
Record.push_back(D->hasTypename());
Record.AddDeclRef(Record.getASTContext().getInstantiatedFromUsingDecl(D));
Code = serialization::DECL_USING;
}
void ASTDeclWriter::VisitUsingEnumDecl(UsingEnumDecl *D) {
VisitNamedDecl(D);
Record.AddSourceLocation(D->getUsingLoc());
Record.AddSourceLocation(D->getEnumLoc());
Record.AddTypeSourceInfo(D->getEnumType());
Record.AddDeclRef(D->FirstUsingShadow.getPointer());
Record.AddDeclRef(Record.getASTContext().getInstantiatedFromUsingEnumDecl(D));
Code = serialization::DECL_USING_ENUM;
}
void ASTDeclWriter::VisitUsingPackDecl(UsingPackDecl *D) {
Record.push_back(D->NumExpansions);
VisitNamedDecl(D);
Record.AddDeclRef(D->getInstantiatedFromUsingDecl());
for (auto *E : D->expansions())
Record.AddDeclRef(E);
Code = serialization::DECL_USING_PACK;
}
void ASTDeclWriter::VisitUsingShadowDecl(UsingShadowDecl *D) {
VisitRedeclarable(D);
VisitNamedDecl(D);
Record.AddDeclRef(D->getTargetDecl());
Record.push_back(D->getIdentifierNamespace());
Record.AddDeclRef(D->UsingOrNextShadow);
Record.AddDeclRef(
Record.getASTContext().getInstantiatedFromUsingShadowDecl(D));
if (D->getDeclContext() == D->getLexicalDeclContext() &&
D->getFirstDecl() == D->getMostRecentDecl() && !D->hasAttrs() &&
!needsAnonymousDeclarationNumber(D) &&
D->getDeclName().getNameKind() == DeclarationName::Identifier)
AbbrevToUse = Writer.getDeclUsingShadowAbbrev();
Code = serialization::DECL_USING_SHADOW;
}
P0136R1, DR1573, DR1645, DR1715, DR1736, DR1903, DR1941, DR1959, DR1991: Replace inheriting constructors implementation with new approach, voted into C++ last year as a DR against C++11. Instead of synthesizing a set of derived class constructors for each inherited base class constructor, we make the constructors of the base class visible to constructor lookup in the derived class, using the normal rules for using-declarations. For constructors, UsingShadowDecl now has a ConstructorUsingShadowDecl derived class that tracks the requisite additional information. We create shadow constructors (not found by name lookup) in the derived class to model the actual initialization, and have a new expression node, CXXInheritedCtorInitExpr, to model the initialization of a base class from such a constructor. (This initialization is special because it performs real perfect forwarding of arguments.) In cases where argument forwarding is not possible (for inalloca calls, variadic calls, and calls with callee parameter cleanup), the shadow inheriting constructor is not emitted and instead we directly emit the initialization code into the caller of the inherited constructor. Note that this new model is not perfectly compatible with the old model in some corner cases. In particular: * if B inherits a private constructor from A, and C uses that constructor to construct a B, then we previously required that A befriends B and B befriends C, but the new rules require A to befriend C directly, and * if a derived class has its own constructors (and so its implicit default constructor is suppressed), it may still inherit a default constructor from a base class llvm-svn: 274049
2016-06-28 19:03:57 +00:00
void ASTDeclWriter::VisitConstructorUsingShadowDecl(
ConstructorUsingShadowDecl *D) {
VisitUsingShadowDecl(D);
Record.AddDeclRef(D->NominatedBaseClassShadowDecl);
Record.AddDeclRef(D->ConstructedBaseClassShadowDecl);
Record.push_back(D->IsVirtual);
Code = serialization::DECL_CONSTRUCTOR_USING_SHADOW;
}
void ASTDeclWriter::VisitUsingDirectiveDecl(UsingDirectiveDecl *D) {
VisitNamedDecl(D);
Record.AddSourceLocation(D->getUsingLoc());
Record.AddSourceLocation(D->getNamespaceKeyLocation());
Record.AddNestedNameSpecifierLoc(D->getQualifierLoc());
Record.AddDeclRef(D->getNominatedNamespace());
Record.AddDeclRef(dyn_cast<Decl>(D->getCommonAncestor()));
Code = serialization::DECL_USING_DIRECTIVE;
}
void ASTDeclWriter::VisitUnresolvedUsingValueDecl(UnresolvedUsingValueDecl *D) {
VisitValueDecl(D);
Record.AddSourceLocation(D->getUsingLoc());
Record.AddNestedNameSpecifierLoc(D->getQualifierLoc());
Record.AddDeclarationNameLoc(D->DNLoc, D->getDeclName());
Record.AddSourceLocation(D->getEllipsisLoc());
Code = serialization::DECL_UNRESOLVED_USING_VALUE;
}
void ASTDeclWriter::VisitUnresolvedUsingTypenameDecl(
UnresolvedUsingTypenameDecl *D) {
VisitTypeDecl(D);
Record.AddSourceLocation(D->getTypenameLoc());
Record.AddNestedNameSpecifierLoc(D->getQualifierLoc());
Record.AddSourceLocation(D->getEllipsisLoc());
Code = serialization::DECL_UNRESOLVED_USING_TYPENAME;
}
void ASTDeclWriter::VisitUnresolvedUsingIfExistsDecl(
UnresolvedUsingIfExistsDecl *D) {
VisitNamedDecl(D);
Code = serialization::DECL_UNRESOLVED_USING_IF_EXISTS;
}
void ASTDeclWriter::VisitCXXRecordDecl(CXXRecordDecl *D) {
VisitRecordDecl(D);
enum {
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
CXXRecNotTemplate = 0,
CXXRecTemplate,
CXXRecMemberSpecialization,
CXXLambda
};
if (ClassTemplateDecl *TemplD = D->getDescribedClassTemplate()) {
Record.push_back(CXXRecTemplate);
Record.AddDeclRef(TemplD);
} else if (MemberSpecializationInfo *MSInfo
= D->getMemberSpecializationInfo()) {
Record.push_back(CXXRecMemberSpecialization);
Record.AddDeclRef(MSInfo->getInstantiatedFrom());
Record.push_back(MSInfo->getTemplateSpecializationKind());
Record.AddSourceLocation(MSInfo->getPointOfInstantiation());
PR60985: Fix merging of lambda closure types across modules. Previously, distinct lambdas would get merged, and multiple definitions of the same lambda would not get merged, because we attempted to identify lambdas by their ordinal position within their lexical DeclContext. This failed for lambdas within namespace-scope variables and within variable templates, where the lexical position in the context containing the variable didn't uniquely identify the lambda. In this patch, we instead identify lambda closure types by index within their context declaration, which does uniquely identify them in a way that's consistent across modules. This change causes a deserialization cycle between the type of a variable with deduced type and a lambda appearing as the initializer of the variable -- reading the variable's type requires reading and merging the lambda, and reading the lambda requires reading and merging the variable. This is addressed by deferring loading the deduced type of a variable until after we finish recursive deserialization. This also exposes a pre-existing subtle issue where loading a variable declaration would trigger immediate loading of its initializer, which could recursively refer back to properties of the variable. This particularly causes problems if the initializer contains a lambda-expression, but can be problematic in general. That is addressed by switching to lazily loading the initializers of variables rather than always loading them with the variable declaration. As well as fixing a deserialization cycle, that should improve laziness of deserialization in general. LambdaDefinitionData had 63 spare bits in it, presumably caused by an off-by-one-error in some previous change. This change claims 32 of those bits as a counter for the lambda within its context. We could probably move the numbering to separate storage, like we do for the device-side mangling number, to optimize the likely-common case where all three numbers (host-side mangling number, device-side mangling number, and index within the context declaration) are zero, but that's not done in this change. Fixes #60985. Reviewed By: #clang-language-wg, aaron.ballman Differential Revision: https://reviews.llvm.org/D145737
2023-03-30 14:21:31 -07:00
} else if (D->isLambda()) {
// For a lambda, we need some information early for merging.
Record.push_back(CXXLambda);
if (auto *Context = D->getLambdaContextDecl()) {
Record.AddDeclRef(Context);
Record.push_back(D->getLambdaIndexInContext());
} else {
Record.push_back(0);
}
[C++20][Modules] Fix crash when function and lambda inside loaded from different modules (#109167) Summary: Because AST loading code is lazy and happens in unpredictable order, it is possible that a function and lambda inside the function can be loaded from different modules. As a result, the captured DeclRefExpr won’t match the corresponding VarDecl inside the function. This situation is reflected in the AST as follows: ``` FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline |-also in ./folly-conv.h `-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1> |-DeclStmt 0x555564f7ced8 <line:34:3, col:17> | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit | `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0 |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>' | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0 | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)' | |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition | | |-also in ./thrift_cpp2_base.h | | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init | | |-DefaultConstructor defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment | | `-Destructor simple irrelevant trivial constexpr needs_implicit | `-CompoundStmt 0x555564f7d1a8 <col:58, col:75> | `-ReturnStmt 0x555564f7d198 <col:60, col:67> | `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11> `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void' ``` This diff modifies the AST deserialization process to load lambdas within the canonical function declaration sooner, immediately following the function, ensuring that they are loaded from the same module. Re-land https://github.com/llvm/llvm-project/pull/104512 Added test case that caused crash due to multiple enclosed lambdas deserialization. Test Plan: check-clang
2024-09-25 08:31:49 +01:00
// For lambdas inside canonical FunctionDecl remember the mapping.
if (auto FD = llvm::dyn_cast_or_null<FunctionDecl>(D->getDeclContext());
FD && FD->isCanonicalDecl()) {
Writer.RelatedDeclsMap[Writer.GetDeclRef(FD)].push_back(
Writer.GetDeclRef(D));
[C++20][Modules] Fix crash when function and lambda inside loaded from different modules (#109167) Summary: Because AST loading code is lazy and happens in unpredictable order, it is possible that a function and lambda inside the function can be loaded from different modules. As a result, the captured DeclRefExpr won’t match the corresponding VarDecl inside the function. This situation is reflected in the AST as follows: ``` FunctionDecl 0x555564f4aff0 <Conv.h:33:1, line:41:1> line:33:35 imported in ./thrift_cpp2_base.h hidden tryTo 'Expected<Tgt, const char *> ()' inline |-also in ./folly-conv.h `-CompoundStmt 0x555564f7cfc8 <col:43, line:41:1> |-DeclStmt 0x555564f7ced8 <line:34:3, col:17> | `-VarDecl 0x555564f7cef8 <col:3, col:16> col:7 imported in ./thrift_cpp2_base.h hidden referenced result 'Tgt' cinit | `-IntegerLiteral 0x555564f7d080 <col:16> 'int' 0 |-CallExpr 0x555564f7cea8 <line:39:3, col:76> '<dependent type>' | |-UnresolvedLookupExpr 0x555564f7bea0 <col:3, col:19> '<overloaded function type>' lvalue (no ADL) = 'then_' 0x555564f7bef0 | |-CXXTemporaryObjectExpr 0x555564f7bcb0 <col:25, col:45> 'Expected<bool, int>':'folly::Expected<bool, int>' 'void () noexcept' zeroing | `-LambdaExpr 0x555564f7bc88 <col:48, col:75> '(lambda at Conv.h:39:48)' | |-CXXRecordDecl 0x555564f76b88 <col:48> col:48 imported in ./folly-conv.h hidden implicit <undeserialized declarations> class definition | | |-also in ./thrift_cpp2_base.h | | `-DefinitionData lambda empty standard_layout trivially_copyable literal can_const_default_init | | |-DefaultConstructor defaulted_is_constexpr | | |-CopyConstructor simple trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveConstructor exists simple trivial needs_implicit | | |-CopyAssignment trivial has_const_param needs_implicit implicit_has_const_param | | |-MoveAssignment | | `-Destructor simple irrelevant trivial constexpr needs_implicit | `-CompoundStmt 0x555564f7d1a8 <col:58, col:75> | `-ReturnStmt 0x555564f7d198 <col:60, col:67> | `-DeclRefExpr 0x555564f7d0a0 <col:67> 'Tgt' lvalue Var 0x555564f7d0c8 'result' 'Tgt' refers_to_enclosing_variable_or_capture `-ReturnStmt 0x555564f7bc78 <line:40:3, col:11> `-InitListExpr 0x555564f7bc38 <col:10, col:11> 'void' ``` This diff modifies the AST deserialization process to load lambdas within the canonical function declaration sooner, immediately following the function, ensuring that they are loaded from the same module. Re-land https://github.com/llvm/llvm-project/pull/104512 Added test case that caused crash due to multiple enclosed lambdas deserialization. Test Plan: check-clang
2024-09-25 08:31:49 +01:00
}
} else {
Record.push_back(CXXRecNotTemplate);
}
Record.push_back(D->isThisDeclarationADefinition());
if (D->isThisDeclarationADefinition())
Record.AddCXXDefinitionData(D);
if (D->isCompleteDefinition() && D->isInNamedModule())
Writer.AddDeclRef(D, Writer.ModularCodegenDecls);
// Store (what we currently believe to be) the key function to avoid
// deserializing every method so we can compute it.
//
// FIXME: Avoid adding the key function if the class is defined in
// module purview since in that case the key function is meaningless.
[AST][1/4] Move the bit-fields from TagDecl, EnumDecl and RecordDecl into DeclContext DeclContext has a little less than 8 bytes free due to the alignment requirements on 64 bits archs. This set of patches moves the bit-fields from classes deriving from DeclContext into DeclContext. On 32 bits archs this increases the size of DeclContext by 4 bytes but this is balanced by an equal or larger reduction in the size of the classes deriving from it. On 64 bits archs the size of DeclContext stays the same but most of the classes deriving from it shrink by 8/16 bytes. (-print-stats diff here https://reviews.llvm.org/D49728) When doing an -fsyntax-only on all of Boost this result in a 3.6% reduction in the size of all Decls and a 1% reduction in the run time due to the lower cache miss rate. For now CXXRecordDecl is not touched but there is an easy 6 (if I count correctly) bytes gain available there by moving some bits from DefinitionData into the free space of DeclContext. This will be the subject of another patch. This patch sequence also enable the possibility of refactoring FunctionDecl: To save space some bits from classes deriving from FunctionDecl were moved to FunctionDecl. This resulted in a lot of stuff in FunctionDecl which do not belong logically to it. After this set of patches however it is just a simple matter of adding a SomethingDeclBitfields in DeclContext and moving the bits to it from FunctionDecl. This first patch introduces the anonymous union in DeclContext and all the *DeclBitfields classes holding the bit-fields, and moves the bits from TagDecl, EnumDecl and RecordDecl into DeclContext. This patch is followed by https://reviews.llvm.org/D49732, https://reviews.llvm.org/D49733 and https://reviews.llvm.org/D49734. Differential Revision: https://reviews.llvm.org/D49729 Patch By: bricci llvm-svn: 338630
2018-08-01 20:48:16 +00:00
if (D->isCompleteDefinition())
Record.AddDeclRef(Record.getASTContext().getCurrentKeyFunction(D));
Code = serialization::DECL_CXX_RECORD;
}
void ASTDeclWriter::VisitCXXMethodDecl(CXXMethodDecl *D) {
VisitFunctionDecl(D);
if (D->isCanonicalDecl()) {
Record.push_back(D->size_overridden_methods());
for (const CXXMethodDecl *MD : D->overridden_methods())
Record.AddDeclRef(MD);
} else {
// We only need to record overridden methods once for the canonical decl.
Record.push_back(0);
}
if (D->getDeclContext() == D->getLexicalDeclContext() &&
D->getFirstDecl() == D->getMostRecentDecl() && !D->isInvalidDecl() &&
!D->hasAttrs() && !D->isTopLevelDeclInObjCContainer() &&
D->getDeclName().getNameKind() == DeclarationName::Identifier &&
!D->hasExtInfo() && !D->isExplicitlyDefaulted()) {
if (D->getTemplatedKind() == FunctionDecl::TK_NonTemplate ||
D->getTemplatedKind() == FunctionDecl::TK_FunctionTemplate ||
D->getTemplatedKind() == FunctionDecl::TK_MemberSpecialization ||
D->getTemplatedKind() == FunctionDecl::TK_DependentNonTemplate)
AbbrevToUse = Writer.getDeclCXXMethodAbbrev(D->getTemplatedKind());
else if (D->getTemplatedKind() ==
FunctionDecl::TK_FunctionTemplateSpecialization) {
FunctionTemplateSpecializationInfo *FTSInfo =
D->getTemplateSpecializationInfo();
if (FTSInfo->TemplateArguments->size() == 1) {
const TemplateArgument &TA = FTSInfo->TemplateArguments->get(0);
if (TA.getKind() == TemplateArgument::Type &&
!FTSInfo->TemplateArgumentsAsWritten &&
!FTSInfo->getMemberSpecializationInfo())
AbbrevToUse = Writer.getDeclCXXMethodAbbrev(D->getTemplatedKind());
}
} else if (D->getTemplatedKind() ==
FunctionDecl::TK_DependentFunctionTemplateSpecialization) {
DependentFunctionTemplateSpecializationInfo *DFTSInfo =
D->getDependentSpecializationInfo();
if (!DFTSInfo->TemplateArgumentsAsWritten)
AbbrevToUse = Writer.getDeclCXXMethodAbbrev(D->getTemplatedKind());
}
}
Code = serialization::DECL_CXX_METHOD;
}
void ASTDeclWriter::VisitCXXConstructorDecl(CXXConstructorDecl *D) {
static_assert(DeclContext::NumCXXConstructorDeclBits == 64,
"You need to update the serializer after you change the "
"CXXConstructorDeclBits");
Record.push_back(D->getTrailingAllocKind());
addExplicitSpecifier(D->getExplicitSpecifier(), Record);
P0136R1, DR1573, DR1645, DR1715, DR1736, DR1903, DR1941, DR1959, DR1991: Replace inheriting constructors implementation with new approach, voted into C++ last year as a DR against C++11. Instead of synthesizing a set of derived class constructors for each inherited base class constructor, we make the constructors of the base class visible to constructor lookup in the derived class, using the normal rules for using-declarations. For constructors, UsingShadowDecl now has a ConstructorUsingShadowDecl derived class that tracks the requisite additional information. We create shadow constructors (not found by name lookup) in the derived class to model the actual initialization, and have a new expression node, CXXInheritedCtorInitExpr, to model the initialization of a base class from such a constructor. (This initialization is special because it performs real perfect forwarding of arguments.) In cases where argument forwarding is not possible (for inalloca calls, variadic calls, and calls with callee parameter cleanup), the shadow inheriting constructor is not emitted and instead we directly emit the initialization code into the caller of the inherited constructor. Note that this new model is not perfectly compatible with the old model in some corner cases. In particular: * if B inherits a private constructor from A, and C uses that constructor to construct a B, then we previously required that A befriends B and B befriends C, but the new rules require A to befriend C directly, and * if a derived class has its own constructors (and so its implicit default constructor is suppressed), it may still inherit a default constructor from a base class llvm-svn: 274049
2016-06-28 19:03:57 +00:00
if (auto Inherited = D->getInheritedConstructor()) {
Record.AddDeclRef(Inherited.getShadowDecl());
Record.AddDeclRef(Inherited.getConstructor());
}
VisitCXXMethodDecl(D);
Code = serialization::DECL_CXX_CONSTRUCTOR;
}
void ASTDeclWriter::VisitCXXDestructorDecl(CXXDestructorDecl *D) {
VisitCXXMethodDecl(D);
Record.AddDeclRef(D->getOperatorDelete());
if (D->getOperatorDelete())
Record.AddStmt(D->getOperatorDeleteThisArg());
Code = serialization::DECL_CXX_DESTRUCTOR;
}
void ASTDeclWriter::VisitCXXConversionDecl(CXXConversionDecl *D) {
addExplicitSpecifier(D->getExplicitSpecifier(), Record);
VisitCXXMethodDecl(D);
Code = serialization::DECL_CXX_CONVERSION;
}
void ASTDeclWriter::VisitImportDecl(ImportDecl *D) {
VisitDecl(D);
Record.push_back(Writer.getSubmoduleID(D->getImportedModule()));
ArrayRef<SourceLocation> IdentifierLocs = D->getIdentifierLocs();
Record.push_back(!IdentifierLocs.empty());
if (IdentifierLocs.empty()) {
Record.AddSourceLocation(D->getEndLoc());
Record.push_back(1);
} else {
for (unsigned I = 0, N = IdentifierLocs.size(); I != N; ++I)
Record.AddSourceLocation(IdentifierLocs[I]);
Record.push_back(IdentifierLocs.size());
}
// Note: the number of source locations must always be the last element in
// the record.
Code = serialization::DECL_IMPORT;
}
void ASTDeclWriter::VisitAccessSpecDecl(AccessSpecDecl *D) {
VisitDecl(D);
Record.AddSourceLocation(D->getColonLoc());
Code = serialization::DECL_ACCESS_SPEC;
}
void ASTDeclWriter::VisitFriendDecl(FriendDecl *D) {
// Record the number of friend type template parameter lists here
// so as to simplify memory allocation during deserialization.
Record.push_back(D->NumTPLists);
VisitDecl(D);
bool hasFriendDecl = isa<NamedDecl *>(D->Friend);
Record.push_back(hasFriendDecl);
if (hasFriendDecl)
Record.AddDeclRef(D->getFriendDecl());
else
Record.AddTypeSourceInfo(D->getFriendType());
for (unsigned i = 0; i < D->NumTPLists; ++i)
Record.AddTemplateParameterList(D->getFriendTypeTemplateParameterList(i));
Record.AddDeclRef(D->getNextFriend());
Record.push_back(D->UnsupportedFriend);
Record.AddSourceLocation(D->FriendLoc);
Record.AddSourceLocation(D->EllipsisLoc);
Code = serialization::DECL_FRIEND;
}
void ASTDeclWriter::VisitFriendTemplateDecl(FriendTemplateDecl *D) {
VisitDecl(D);
Record.push_back(D->getNumTemplateParameters());
for (unsigned i = 0, e = D->getNumTemplateParameters(); i != e; ++i)
Record.AddTemplateParameterList(D->getTemplateParameterList(i));
Record.push_back(D->getFriendDecl() != nullptr);
if (D->getFriendDecl())
Record.AddDeclRef(D->getFriendDecl());
else
Record.AddTypeSourceInfo(D->getFriendType());
Record.AddSourceLocation(D->getFriendLoc());
Code = serialization::DECL_FRIEND_TEMPLATE;
}
void ASTDeclWriter::VisitTemplateDecl(TemplateDecl *D) {
VisitNamedDecl(D);
Record.AddTemplateParameterList(D->getTemplateParameters());
Record.AddDeclRef(D->getTemplatedDecl());
}
void ASTDeclWriter::VisitConceptDecl(ConceptDecl *D) {
VisitTemplateDecl(D);
Record.AddStmt(D->getConstraintExpr());
Code = serialization::DECL_CONCEPT;
}
void ASTDeclWriter::VisitImplicitConceptSpecializationDecl(
ImplicitConceptSpecializationDecl *D) {
Record.push_back(D->getTemplateArguments().size());
VisitDecl(D);
for (const TemplateArgument &Arg : D->getTemplateArguments())
Record.AddTemplateArgument(Arg);
Code = serialization::DECL_IMPLICIT_CONCEPT_SPECIALIZATION;
}
void ASTDeclWriter::VisitRequiresExprBodyDecl(RequiresExprBodyDecl *D) {
Code = serialization::DECL_REQUIRES_EXPR_BODY;
}
void ASTDeclWriter::VisitRedeclarableTemplateDecl(RedeclarableTemplateDecl *D) {
VisitRedeclarable(D);
// Emit data to initialize CommonOrPrev before VisitTemplateDecl so that
// getCommonPtr() can be used while this is still initializing.
if (D->isFirstDecl()) {
// This declaration owns the 'common' pointer, so serialize that data now.
Record.AddDeclRef(D->getInstantiatedFromMemberTemplate());
if (D->getInstantiatedFromMemberTemplate())
Record.push_back(D->isMemberSpecialization());
}
VisitTemplateDecl(D);
Record.push_back(D->getIdentifierNamespace());
}
void ASTDeclWriter::VisitClassTemplateDecl(ClassTemplateDecl *D) {
VisitRedeclarableTemplateDecl(D);
if (D->isFirstDecl())
AddTemplateSpecializations(D);
// Force emitting the corresponding deduction guide in reduced BMI mode.
// Otherwise, the deduction guide may be optimized out incorrectly.
if (Writer.isGeneratingReducedBMI()) {
auto Name =
Record.getASTContext().DeclarationNames.getCXXDeductionGuideName(D);
for (auto *DG : D->getDeclContext()->noload_lookup(Name))
Writer.GetDeclRef(DG->getCanonicalDecl());
}
Code = serialization::DECL_CLASS_TEMPLATE;
}
void ASTDeclWriter::VisitClassTemplateSpecializationDecl(
ClassTemplateSpecializationDecl *D) {
RegisterTemplateSpecialization(D->getSpecializedTemplate(), D);
VisitCXXRecordDecl(D);
llvm::PointerUnion<ClassTemplateDecl *,
ClassTemplatePartialSpecializationDecl *> InstFrom
= D->getSpecializedTemplateOrPartial();
if (Decl *InstFromD = InstFrom.dyn_cast<ClassTemplateDecl *>()) {
Record.AddDeclRef(InstFromD);
} else {
Record.AddDeclRef(cast<ClassTemplatePartialSpecializationDecl *>(InstFrom));
Record.AddTemplateArgumentList(&D->getTemplateInstantiationArgs());
}
Record.AddTemplateArgumentList(&D->getTemplateArgs());
Record.AddSourceLocation(D->getPointOfInstantiation());
Record.push_back(D->getSpecializationKind());
Record.push_back(D->isCanonicalDecl());
if (D->isCanonicalDecl()) {
// When reading, we'll add it to the folding set of the following template.
Record.AddDeclRef(D->getSpecializedTemplate()->getCanonicalDecl());
}
bool ExplicitInstantiation =
D->getTemplateSpecializationKind() ==
TSK_ExplicitInstantiationDeclaration ||
D->getTemplateSpecializationKind() == TSK_ExplicitInstantiationDefinition;
Record.push_back(ExplicitInstantiation);
if (ExplicitInstantiation) {
Record.AddSourceLocation(D->getExternKeywordLoc());
Record.AddSourceLocation(D->getTemplateKeywordLoc());
}
const ASTTemplateArgumentListInfo *ArgsWritten =
D->getTemplateArgsAsWritten();
Record.push_back(!!ArgsWritten);
if (ArgsWritten)
Record.AddASTTemplateArgumentListInfo(ArgsWritten);
// Mention the implicitly generated C++ deduction guide to make sure the
// deduction guide will be rewritten as expected.
//
// FIXME: Would it be more efficient to add a callback register function
// in sema to register the deduction guide?
if (Writer.isWritingStdCXXNamedModules()) {
auto Name =
Record.getASTContext().DeclarationNames.getCXXDeductionGuideName(
D->getSpecializedTemplate());
for (auto *DG : D->getDeclContext()->noload_lookup(Name))
Writer.GetDeclRef(DG->getCanonicalDecl());
}
Code = serialization::DECL_CLASS_TEMPLATE_SPECIALIZATION;
}
void ASTDeclWriter::VisitClassTemplatePartialSpecializationDecl(
ClassTemplatePartialSpecializationDecl *D) {
Record.AddTemplateParameterList(D->getTemplateParameters());
VisitClassTemplateSpecializationDecl(D);
// These are read/set from/to the first declaration.
if (D->getPreviousDecl() == nullptr) {
Record.AddDeclRef(D->getInstantiatedFromMember());
Record.push_back(D->isMemberSpecialization());
}
Code = serialization::DECL_CLASS_TEMPLATE_PARTIAL_SPECIALIZATION;
}
void ASTDeclWriter::VisitVarTemplateDecl(VarTemplateDecl *D) {
VisitRedeclarableTemplateDecl(D);
if (D->isFirstDecl())
AddTemplateSpecializations(D);
Code = serialization::DECL_VAR_TEMPLATE;
}
void ASTDeclWriter::VisitVarTemplateSpecializationDecl(
VarTemplateSpecializationDecl *D) {
RegisterTemplateSpecialization(D->getSpecializedTemplate(), D);
llvm::PointerUnion<VarTemplateDecl *, VarTemplatePartialSpecializationDecl *>
InstFrom = D->getSpecializedTemplateOrPartial();
if (Decl *InstFromD = InstFrom.dyn_cast<VarTemplateDecl *>()) {
Record.AddDeclRef(InstFromD);
} else {
Record.AddDeclRef(cast<VarTemplatePartialSpecializationDecl *>(InstFrom));
Record.AddTemplateArgumentList(&D->getTemplateInstantiationArgs());
}
bool ExplicitInstantiation =
D->getTemplateSpecializationKind() ==
TSK_ExplicitInstantiationDeclaration ||
D->getTemplateSpecializationKind() == TSK_ExplicitInstantiationDefinition;
Record.push_back(ExplicitInstantiation);
if (ExplicitInstantiation) {
Record.AddSourceLocation(D->getExternKeywordLoc());
Record.AddSourceLocation(D->getTemplateKeywordLoc());
}
const ASTTemplateArgumentListInfo *ArgsWritten =
D->getTemplateArgsAsWritten();
Record.push_back(!!ArgsWritten);
if (ArgsWritten)
Record.AddASTTemplateArgumentListInfo(ArgsWritten);
Record.AddTemplateArgumentList(&D->getTemplateArgs());
Record.AddSourceLocation(D->getPointOfInstantiation());
Record.push_back(D->getSpecializationKind());
Record.push_back(D->IsCompleteDefinition);
VisitVarDecl(D);
Record.push_back(D->isCanonicalDecl());
if (D->isCanonicalDecl()) {
// When reading, we'll add it to the folding set of the following template.
Record.AddDeclRef(D->getSpecializedTemplate()->getCanonicalDecl());
}
Code = serialization::DECL_VAR_TEMPLATE_SPECIALIZATION;
}
void ASTDeclWriter::VisitVarTemplatePartialSpecializationDecl(
VarTemplatePartialSpecializationDecl *D) {
Record.AddTemplateParameterList(D->getTemplateParameters());
VisitVarTemplateSpecializationDecl(D);
// These are read/set from/to the first declaration.
if (D->getPreviousDecl() == nullptr) {
Record.AddDeclRef(D->getInstantiatedFromMember());
Record.push_back(D->isMemberSpecialization());
}
Code = serialization::DECL_VAR_TEMPLATE_PARTIAL_SPECIALIZATION;
}
void ASTDeclWriter::VisitFunctionTemplateDecl(FunctionTemplateDecl *D) {
VisitRedeclarableTemplateDecl(D);
if (D->isFirstDecl())
AddTemplateSpecializations(D);
Code = serialization::DECL_FUNCTION_TEMPLATE;
}
void ASTDeclWriter::VisitTemplateTypeParmDecl(TemplateTypeParmDecl *D) {
Record.push_back(D->hasTypeConstraint());
VisitTypeDecl(D);
Record.push_back(D->wasDeclaredWithTypename());
const TypeConstraint *TC = D->getTypeConstraint();
if (D->hasTypeConstraint())
Record.push_back(/*TypeConstraintInitialized=*/TC != nullptr);
if (TC) {
Add a concept AST node. This patch adds a concept AST node (`ConceptLoc`) and uses it at the corresponding places. There are three objects that might have constraints via concepts: `TypeConstraint`, `ConceptSpecializationExpr` and `AutoTypeLoc`. The first two inherit from `ConceptReference` while the latter has the information about a possible constraint directly stored in `AutoTypeLocInfo`. It would be nice if the concept information would be stored the same way in all three cases. Moreover the current structure makes it difficult to deal with these concepts. For example in Clangd accessing the locations of constraints of a `AutoTypeLoc` can only be done with quite ugly hacks. So we think that it makes sense to create a new AST node for such concepts. In details we propose the following: - Rename `ConceptReference` to `ConceptLoc` (or something else what is approriate) and make it the new AST node. - `TypeConstraint` and `ConceptSpecializationExpr` do not longer inherit from `ConceptReference` but store a pointer to a `ConceptLoc`. - `AutoTypeLoc` stores a pointer to `ConceptLoc` instead of storing the concept info in `AutoTypeLocInfo`. This patch implements a first version of this idea which compiles and where the existing tests pass. To make this patch as small as possible we keep the existing member functions to access concept data. Later these can be replaced by directly calling the corresponding functions of the `ConceptLoc`s. Differential Revision: https://reviews.llvm.org/D155858
2023-08-02 14:00:16 +02:00
auto *CR = TC->getConceptReference();
Record.push_back(CR != nullptr);
if (CR)
Record.AddConceptReference(CR);
Record.AddStmt(TC->getImmediatelyDeclaredConstraint());
Record.push_back(D->isExpandedParameterPack());
if (D->isExpandedParameterPack())
Record.push_back(D->getNumExpansionParameters());
}
bool OwnsDefaultArg = D->hasDefaultArgument() &&
!D->defaultArgumentWasInherited();
Record.push_back(OwnsDefaultArg);
if (OwnsDefaultArg)
Record.AddTemplateArgumentLoc(D->getDefaultArgument());
if (!D->hasTypeConstraint() && !OwnsDefaultArg &&
D->getDeclContext() == D->getLexicalDeclContext() &&
!D->isInvalidDecl() && !D->hasAttrs() &&
!D->isTopLevelDeclInObjCContainer() && !D->isImplicit() &&
D->getDeclName().getNameKind() == DeclarationName::Identifier)
AbbrevToUse = Writer.getDeclTemplateTypeParmAbbrev();
Code = serialization::DECL_TEMPLATE_TYPE_PARM;
}
void ASTDeclWriter::VisitNonTypeTemplateParmDecl(NonTypeTemplateParmDecl *D) {
// For an expanded parameter pack, record the number of expansion types here
// so that it's easier for deserialization to allocate the right amount of
// memory.
Expr *TypeConstraint = D->getPlaceholderTypeConstraint();
Record.push_back(!!TypeConstraint);
if (D->isExpandedParameterPack())
Record.push_back(D->getNumExpansionTypes());
VisitDeclaratorDecl(D);
// TemplateParmPosition.
Record.push_back(D->getDepth());
Record.push_back(D->getPosition());
if (TypeConstraint)
Record.AddStmt(TypeConstraint);
if (D->isExpandedParameterPack()) {
for (unsigned I = 0, N = D->getNumExpansionTypes(); I != N; ++I) {
Record.AddTypeRef(D->getExpansionType(I));
Record.AddTypeSourceInfo(D->getExpansionTypeSourceInfo(I));
}
Code = serialization::DECL_EXPANDED_NON_TYPE_TEMPLATE_PARM_PACK;
} else {
// Rest of NonTypeTemplateParmDecl.
Record.push_back(D->isParameterPack());
bool OwnsDefaultArg = D->hasDefaultArgument() &&
!D->defaultArgumentWasInherited();
Record.push_back(OwnsDefaultArg);
if (OwnsDefaultArg)
Record.AddTemplateArgumentLoc(D->getDefaultArgument());
Code = serialization::DECL_NON_TYPE_TEMPLATE_PARM;
}
}
void ASTDeclWriter::VisitTemplateTemplateParmDecl(TemplateTemplateParmDecl *D) {
// For an expanded parameter pack, record the number of expansion types here
// so that it's easier for deserialization to allocate the right amount of
// memory.
if (D->isExpandedParameterPack())
Record.push_back(D->getNumExpansionTemplateParameters());
VisitTemplateDecl(D);
Record.push_back(D->wasDeclaredWithTypename());
// TemplateParmPosition.
Record.push_back(D->getDepth());
Record.push_back(D->getPosition());
if (D->isExpandedParameterPack()) {
for (unsigned I = 0, N = D->getNumExpansionTemplateParameters();
I != N; ++I)
Record.AddTemplateParameterList(D->getExpansionTemplateParameters(I));
Code = serialization::DECL_EXPANDED_TEMPLATE_TEMPLATE_PARM_PACK;
} else {
// Rest of TemplateTemplateParmDecl.
Record.push_back(D->isParameterPack());
bool OwnsDefaultArg = D->hasDefaultArgument() &&
!D->defaultArgumentWasInherited();
Record.push_back(OwnsDefaultArg);
if (OwnsDefaultArg)
Record.AddTemplateArgumentLoc(D->getDefaultArgument());
Code = serialization::DECL_TEMPLATE_TEMPLATE_PARM;
}
}
void ASTDeclWriter::VisitTypeAliasTemplateDecl(TypeAliasTemplateDecl *D) {
VisitRedeclarableTemplateDecl(D);
Code = serialization::DECL_TYPE_ALIAS_TEMPLATE;
}
void ASTDeclWriter::VisitStaticAssertDecl(StaticAssertDecl *D) {
VisitDecl(D);
Record.AddStmt(D->getAssertExpr());
Record.push_back(D->isFailed());
Record.AddStmt(D->getMessage());
Record.AddSourceLocation(D->getRParenLoc());
Code = serialization::DECL_STATIC_ASSERT;
}
/// Emit the DeclContext part of a declaration context decl.
void ASTDeclWriter::VisitDeclContext(DeclContext *DC) {
static_assert(DeclContext::NumDeclContextBits == 13,
"You need to update the serializer after you change the "
"DeclContextBits");
uint64_t LexicalOffset = 0;
uint64_t VisibleOffset = 0;
if (Writer.isGeneratingReducedBMI() && isa<NamespaceDecl>(DC) &&
cast<NamespaceDecl>(DC)->isFromExplicitGlobalModule()) {
// In reduced BMI, delay writing lexical and visible block for namespace
// in the global module fragment. See the comments of DelayedNamespace for
// details.
Writer.DelayedNamespace.push_back(cast<NamespaceDecl>(DC));
} else {
LexicalOffset =
Writer.WriteDeclContextLexicalBlock(Record.getASTContext(), DC);
VisibleOffset =
Writer.WriteDeclContextVisibleBlock(Record.getASTContext(), DC);
}
Record.AddOffset(LexicalOffset);
Record.AddOffset(VisibleOffset);
}
const Decl *ASTWriter::getFirstLocalDecl(const Decl *D) {
assert(IsLocalDecl(D) && "expected a local declaration");
const Decl *Canon = D->getCanonicalDecl();
if (IsLocalDecl(Canon))
return Canon;
const Decl *&CacheEntry = FirstLocalDeclCache[Canon];
if (CacheEntry)
return CacheEntry;
for (const Decl *Redecl = D; Redecl; Redecl = Redecl->getPreviousDecl())
if (IsLocalDecl(Redecl))
D = Redecl;
return CacheEntry = D;
}
template <typename T>
void ASTDeclWriter::VisitRedeclarable(Redeclarable<T> *D) {
T *First = D->getFirstDecl();
T *MostRecent = First->getMostRecentDecl();
T *DAsT = static_cast<T *>(D);
if (MostRecent != First) {
assert(isRedeclarableDeclKind(DAsT->getKind()) &&
"Not considered redeclarable?");
Record.AddDeclRef(First);
// Write out a list of local redeclarations of this declaration if it's the
// first local declaration in the chain.
const Decl *FirstLocal = Writer.getFirstLocalDecl(DAsT);
if (DAsT == FirstLocal) {
// Emit a list of all imported first declarations so that we can be sure
// that all redeclarations visible to this module are before D in the
// redecl chain.
unsigned I = Record.size();
Record.push_back(0);
if (Writer.Chain)
AddFirstDeclFromEachModule(DAsT, /*IncludeLocal*/false);
// This is the number of imported first declarations + 1.
Record[I] = Record.size() - I;
// Collect the set of local redeclarations of this declaration, from
// newest to oldest.
ASTWriter::RecordData LocalRedecls;
ASTRecordWriter LocalRedeclWriter(Record, LocalRedecls);
for (const Decl *Prev = FirstLocal->getMostRecentDecl();
Prev != FirstLocal; Prev = Prev->getPreviousDecl())
if (!Prev->isFromASTFile())
LocalRedeclWriter.AddDeclRef(Prev);
// If we have any redecls, write them now as a separate record preceding
// the declaration itself.
if (LocalRedecls.empty())
Record.push_back(0);
else
Record.AddOffset(LocalRedeclWriter.Emit(LOCAL_REDECLARATIONS));
} else {
Record.push_back(0);
Record.AddDeclRef(FirstLocal);
}
// Make sure that we serialize both the previous and the most-recent
// declarations, which (transitively) ensures that all declarations in the
// chain get serialized.
//
// FIXME: This is not correct; when we reach an imported declaration we
// won't emit its previous declaration.
(void)Writer.GetDeclRef(D->getPreviousDecl());
(void)Writer.GetDeclRef(MostRecent);
} else {
// We use the sentinel value 0 to indicate an only declaration.
Record.push_back(0);
}
}
void ASTDeclWriter::VisitHLSLBufferDecl(HLSLBufferDecl *D) {
VisitNamedDecl(D);
VisitDeclContext(D);
Record.push_back(D->isCBuffer());
Record.AddSourceLocation(D->getLocStart());
Record.AddSourceLocation(D->getLBraceLoc());
Record.AddSourceLocation(D->getRBraceLoc());
Code = serialization::DECL_HLSL_BUFFER;
}
void ASTDeclWriter::VisitOMPThreadPrivateDecl(OMPThreadPrivateDecl *D) {
Record.writeOMPChildren(D->Data);
VisitDecl(D);
Code = serialization::DECL_OMP_THREADPRIVATE;
}
void ASTDeclWriter::VisitOMPAllocateDecl(OMPAllocateDecl *D) {
Record.writeOMPChildren(D->Data);
VisitDecl(D);
Code = serialization::DECL_OMP_ALLOCATE;
}
void ASTDeclWriter::VisitOMPRequiresDecl(OMPRequiresDecl *D) {
Record.writeOMPChildren(D->Data);
VisitDecl(D);
Code = serialization::DECL_OMP_REQUIRES;
}
void ASTDeclWriter::VisitOMPDeclareReductionDecl(OMPDeclareReductionDecl *D) {
static_assert(DeclContext::NumOMPDeclareReductionDeclBits == 15,
"You need to update the serializer after you change the "
"NumOMPDeclareReductionDeclBits");
VisitValueDecl(D);
Record.AddSourceLocation(D->getBeginLoc());
Record.AddStmt(D->getCombinerIn());
Record.AddStmt(D->getCombinerOut());
Record.AddStmt(D->getCombiner());
Record.AddStmt(D->getInitOrig());
Record.AddStmt(D->getInitPriv());
Record.AddStmt(D->getInitializer());
Record.push_back(llvm::to_underlying(D->getInitializerKind()));
Record.AddDeclRef(D->getPrevDeclInScope());
Code = serialization::DECL_OMP_DECLARE_REDUCTION;
}
void ASTDeclWriter::VisitOMPDeclareMapperDecl(OMPDeclareMapperDecl *D) {
Record.writeOMPChildren(D->Data);
VisitValueDecl(D);
Record.AddDeclarationName(D->getVarName());
Record.AddDeclRef(D->getPrevDeclInScope());
Code = serialization::DECL_OMP_DECLARE_MAPPER;
}
void ASTDeclWriter::VisitOMPCapturedExprDecl(OMPCapturedExprDecl *D) {
VisitVarDecl(D);
Code = serialization::DECL_OMP_CAPTUREDEXPR;
}
//===----------------------------------------------------------------------===//
// ASTWriter Implementation
//===----------------------------------------------------------------------===//
namespace {
template <FunctionDecl::TemplatedKind Kind>
std::shared_ptr<llvm::BitCodeAbbrev>
getFunctionDeclAbbrev(serialization::DeclCode Code) {
using namespace llvm;
auto Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(Code));
// RedeclarableDecl
Abv->Add(BitCodeAbbrevOp(0)); // CanonicalDecl
Abv->Add(BitCodeAbbrevOp(Kind));
if constexpr (Kind == FunctionDecl::TK_NonTemplate) {
} else if constexpr (Kind == FunctionDecl::TK_FunctionTemplate) {
// DescribedFunctionTemplate
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));
} else if constexpr (Kind == FunctionDecl::TK_DependentNonTemplate) {
// Instantiated From Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));
} else if constexpr (Kind == FunctionDecl::TK_MemberSpecialization) {
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // InstantiatedFrom
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
3)); // TemplateSpecializationKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Specialized Location
} else if constexpr (Kind ==
FunctionDecl::TK_FunctionTemplateSpecialization) {
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Template
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
3)); // TemplateSpecializationKind
Abv->Add(BitCodeAbbrevOp(1)); // Template Argument Size
Abv->Add(BitCodeAbbrevOp(TemplateArgument::Type)); // Template Argument Kind
Abv->Add(
BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Template Argument Type
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // Is Defaulted
Abv->Add(BitCodeAbbrevOp(0)); // TemplateArgumentsAsWritten
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SourceLocation
Abv->Add(BitCodeAbbrevOp(0));
Abv->Add(
BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Canonical Decl of template
} else if constexpr (Kind == FunctionDecl::
TK_DependentFunctionTemplateSpecialization) {
// Candidates of specialization
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abv->Add(BitCodeAbbrevOp(0)); // TemplateArgumentsAsWritten
} else {
llvm_unreachable("Unknown templated kind?");
}
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
8)); // Packed DeclBits: ModuleOwnershipKind,
// isUsed, isReferenced, AccessSpecifier,
// isImplicit
//
// The following bits should be 0:
// HasStandaloneLexicalDC, HasAttrs,
// TopLevelDeclInObjCContainer,
// isInvalidDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(DeclarationName::Identifier)); // NameKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Identifier
Abv->Add(BitCodeAbbrevOp(0)); // AnonDeclNumber
// ValueDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// DeclaratorDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // InnerLocStart
Abv->Add(BitCodeAbbrevOp(0)); // HasExtInfo
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TSIType
// FunctionDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 11)); // IDNS
Abv->Add(BitCodeAbbrevOp(
BitCodeAbbrevOp::Fixed,
[Serialization] Record whether the ODR is skipped (#82302) Close https://github.com/llvm/llvm-project/issues/80570. In https://github.com/llvm/llvm-project/commit/a0b6747804e46665ecfd00295b60432bfe1775b6, we skipped ODR checks for decls in GMF. Then it should be natural to skip storing the ODR values in BMI. Generally it should be fine as long as the writer and the reader keep consistent. However, the use of preamble in clangd shows the tricky part. For, ``` // test.cpp module; // any one off these is enough to crash clangd // #include <iostream> // #include <string_view> // #include <cmath> // #include <system_error> // #include <new> // #include <bit> // probably many more // only ok with libc++, not the system provided libstdc++ 13.2.1 // these are ok export module test; ``` clangd will store the headers as preamble to speedup the parsing and the preamble reuses the serialization techniques. (Generally we'd call the preamble as PCH. However it is not true strictly. I've tested the PCH wouldn't be problematic.) However, the tricky part is that the preamble is not modules. It literally serialiaze and deserialize things. So before clangd parsing the above test module, clangd will serialize the headers into the preamble. Note that there is no concept like GMF now. So the ODR bits are stored. However, when clangd parse the file actually, the decls from preamble are thought as in GMF literally, then hte ODR bits are skipped. Then mismatch happens. To solve the problem, this patch adds another bit for decls to record whether or not the ODR bits are skipped.
2024-02-20 13:31:28 +08:00
28)); // Packed Function Bits: StorageClass, Inline, InlineSpecified,
// VirtualAsWritten, Pure, HasInheritedProto, HasWrittenProto,
// Deleted, Trivial, TrivialForCall, Defaulted, ExplicitlyDefaulted,
// IsIneligibleOrNotSelected, ImplicitReturnZero, Constexpr,
// UsesSEHTry, SkippedBody, MultiVersion, LateParsed,
[Serialization] Record whether the ODR is skipped (#82302) Close https://github.com/llvm/llvm-project/issues/80570. In https://github.com/llvm/llvm-project/commit/a0b6747804e46665ecfd00295b60432bfe1775b6, we skipped ODR checks for decls in GMF. Then it should be natural to skip storing the ODR values in BMI. Generally it should be fine as long as the writer and the reader keep consistent. However, the use of preamble in clangd shows the tricky part. For, ``` // test.cpp module; // any one off these is enough to crash clangd // #include <iostream> // #include <string_view> // #include <cmath> // #include <system_error> // #include <new> // #include <bit> // probably many more // only ok with libc++, not the system provided libstdc++ 13.2.1 // these are ok export module test; ``` clangd will store the headers as preamble to speedup the parsing and the preamble reuses the serialization techniques. (Generally we'd call the preamble as PCH. However it is not true strictly. I've tested the PCH wouldn't be problematic.) However, the tricky part is that the preamble is not modules. It literally serialiaze and deserialize things. So before clangd parsing the above test module, clangd will serialize the headers into the preamble. Note that there is no concept like GMF now. So the ODR bits are stored. However, when clangd parse the file actually, the decls from preamble are thought as in GMF literally, then hte ODR bits are skipped. Then mismatch happens. To solve the problem, this patch adds another bit for decls to record whether or not the ODR bits are skipped.
2024-02-20 13:31:28 +08:00
// FriendConstraintRefersToEnclosingTemplate, Linkage,
// ShouldSkipCheckingODR
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // LocEnd
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 32)); // ODRHash
// This Array slurps the rest of the record. Fortunately we want to encode
// (nearly) all the remaining (variable number of) fields in the same way.
//
// This is:
// NumParams and Params[] from FunctionDecl, and
// NumOverriddenMethods, OverriddenMethods[] from CXXMethodDecl.
//
// Add an AbbrevOp for 'size then elements' and use it here.
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6));
return Abv;
}
template <FunctionDecl::TemplatedKind Kind>
std::shared_ptr<llvm::BitCodeAbbrev> getCXXMethodAbbrev() {
return getFunctionDeclAbbrev<Kind>(serialization::DECL_CXX_METHOD);
}
} // namespace
void ASTWriter::WriteDeclAbbrevs() {
using namespace llvm;
std::shared_ptr<BitCodeAbbrev> Abv;
// Abbreviation for DECL_FIELD
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_FIELD));
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
7)); // Packed DeclBits: ModuleOwnershipKind,
// isUsed, isReferenced, AccessSpecifier,
//
// The following bits should be 0:
// isImplicit, HasStandaloneLexicalDC, HasAttrs,
// TopLevelDeclInObjCContainer,
// isInvalidDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0)); // AnonDeclNumber
// ValueDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// DeclaratorDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // InnerStartLoc
Abv->Add(BitCodeAbbrevOp(0)); // hasExtInfo
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TSIType
// FieldDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // isMutable
Abv->Add(BitCodeAbbrevOp(0)); // StorageKind
// Type Source Info
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TypeLoc
DeclFieldAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for DECL_OBJC_IVAR
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_OBJC_IVAR));
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
12)); // Packed DeclBits: HasStandaloneLexicalDC,
// isInvalidDecl, HasAttrs, isImplicit, isUsed,
// isReferenced, TopLevelDeclInObjCContainer,
// AccessSpecifier, ModuleOwnershipKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0)); // AnonDeclNumber
// ValueDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// DeclaratorDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // InnerStartLoc
Abv->Add(BitCodeAbbrevOp(0)); // hasExtInfo
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TSIType
// FieldDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // isMutable
Abv->Add(BitCodeAbbrevOp(0)); // InitStyle
// ObjC Ivar
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // getAccessControl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // getSynthesize
// Type Source Info
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TypeLoc
DeclObjCIvarAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for DECL_ENUM
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_ENUM));
// Redeclarable
Abv->Add(BitCodeAbbrevOp(0)); // No redeclaration
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
7)); // Packed DeclBits: ModuleOwnershipKind,
// isUsed, isReferenced, AccessSpecifier,
//
// The following bits should be 0:
// isImplicit, HasStandaloneLexicalDC, HasAttrs,
// TopLevelDeclInObjCContainer,
// isInvalidDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0)); // AnonDeclNumber
// TypeDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type Ref
// TagDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // IdentifierNamespace
Abv->Add(BitCodeAbbrevOp(
BitCodeAbbrevOp::Fixed,
9)); // Packed Tag Decl Bits: getTagKind, isCompleteDefinition,
// EmbeddedInDeclarator, IsFreeStanding,
// isCompleteDefinitionRequired, ExtInfoKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SourceLocation
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SourceLocation
// EnumDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // AddTypeRef
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // IntegerType
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // getPromotionType
[Serialization] Record whether the ODR is skipped (#82302) Close https://github.com/llvm/llvm-project/issues/80570. In https://github.com/llvm/llvm-project/commit/a0b6747804e46665ecfd00295b60432bfe1775b6, we skipped ODR checks for decls in GMF. Then it should be natural to skip storing the ODR values in BMI. Generally it should be fine as long as the writer and the reader keep consistent. However, the use of preamble in clangd shows the tricky part. For, ``` // test.cpp module; // any one off these is enough to crash clangd // #include <iostream> // #include <string_view> // #include <cmath> // #include <system_error> // #include <new> // #include <bit> // probably many more // only ok with libc++, not the system provided libstdc++ 13.2.1 // these are ok export module test; ``` clangd will store the headers as preamble to speedup the parsing and the preamble reuses the serialization techniques. (Generally we'd call the preamble as PCH. However it is not true strictly. I've tested the PCH wouldn't be problematic.) However, the tricky part is that the preamble is not modules. It literally serialiaze and deserialize things. So before clangd parsing the above test module, clangd will serialize the headers into the preamble. Note that there is no concept like GMF now. So the ODR bits are stored. However, when clangd parse the file actually, the decls from preamble are thought as in GMF literally, then hte ODR bits are skipped. Then mismatch happens. To solve the problem, this patch adds another bit for decls to record whether or not the ODR bits are skipped.
2024-02-20 13:31:28 +08:00
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 20)); // Enum Decl Bits
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 32));// ODRHash
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // InstantiatedMembEnum
// DC
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // LexicalOffset
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // VisibleOffset
DeclEnumAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for DECL_RECORD
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_RECORD));
// Redeclarable
Abv->Add(BitCodeAbbrevOp(0)); // No redeclaration
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
7)); // Packed DeclBits: ModuleOwnershipKind,
// isUsed, isReferenced, AccessSpecifier,
//
// The following bits should be 0:
// isImplicit, HasStandaloneLexicalDC, HasAttrs,
// TopLevelDeclInObjCContainer,
// isInvalidDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0)); // AnonDeclNumber
// TypeDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type Ref
// TagDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // IdentifierNamespace
Abv->Add(BitCodeAbbrevOp(
BitCodeAbbrevOp::Fixed,
9)); // Packed Tag Decl Bits: getTagKind, isCompleteDefinition,
// EmbeddedInDeclarator, IsFreeStanding,
// isCompleteDefinitionRequired, ExtInfoKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SourceLocation
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SourceLocation
// RecordDecl
Abv->Add(BitCodeAbbrevOp(
BitCodeAbbrevOp::Fixed,
13)); // Packed Record Decl Bits: FlexibleArrayMember,
// AnonymousStructUnion, hasObjectMember, hasVolatileMember,
// isNonTrivialToPrimitiveDefaultInitialize,
// isNonTrivialToPrimitiveCopy, isNonTrivialToPrimitiveDestroy,
// hasNonTrivialToPrimitiveDefaultInitializeCUnion,
// hasNonTrivialToPrimitiveDestructCUnion,
// hasNonTrivialToPrimitiveCopyCUnion,
// hasUninitializedExplicitInitFields, isParamDestroyedInCallee,
// getArgPassingRestrictions
// ODRHash
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 26));
// DC
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // LexicalOffset
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // VisibleOffset
DeclRecordAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for DECL_PARM_VAR
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_PARM_VAR));
// Redeclarable
Abv->Add(BitCodeAbbrevOp(0)); // No redeclaration
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
8)); // Packed DeclBits: ModuleOwnershipKind, isUsed,
// isReferenced, AccessSpecifier,
// HasStandaloneLexicalDC, HasAttrs, isImplicit,
// TopLevelDeclInObjCContainer,
// isInvalidDecl,
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0)); // AnonDeclNumber
// ValueDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// DeclaratorDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // InnerStartLoc
Abv->Add(BitCodeAbbrevOp(0)); // hasExtInfo
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TSIType
// VarDecl
Abv->Add(
BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
12)); // Packed Var Decl bits: SClass, TSCSpec, InitStyle,
// isARCPseudoStrong, Linkage, ModulesCodegen
Abv->Add(BitCodeAbbrevOp(0)); // VarKind (local enum)
// ParmVarDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // ScopeIndex
Abv->Add(BitCodeAbbrevOp(
BitCodeAbbrevOp::Fixed,
19)); // Packed Parm Var Decl bits: IsObjCMethodParameter, ScopeDepth,
// ObjCDeclQualifier, KNRPromoted,
// HasInheritedDefaultArg, HasUninstantiatedDefaultArg
// Type Source Info
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TypeLoc
DeclParmVarAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for DECL_TYPEDEF
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_TYPEDEF));
// Redeclarable
Abv->Add(BitCodeAbbrevOp(0)); // No redeclaration
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
7)); // Packed DeclBits: ModuleOwnershipKind,
// isReferenced, isUsed, AccessSpecifier. Other
// higher bits should be 0: isImplicit,
// HasStandaloneLexicalDC, HasAttrs,
// TopLevelDeclInObjCContainer, isInvalidDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0)); // AnonDeclNumber
// TypeDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type Ref
// TypedefDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TypeLoc
DeclTypedefAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for DECL_VAR
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_VAR));
// Redeclarable
Abv->Add(BitCodeAbbrevOp(0)); // No redeclaration
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
12)); // Packed DeclBits: HasStandaloneLexicalDC,
// isInvalidDecl, HasAttrs, isImplicit, isUsed,
// isReferenced, TopLevelDeclInObjCContainer,
// AccessSpecifier, ModuleOwnershipKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0)); // AnonDeclNumber
// ValueDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// DeclaratorDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // InnerStartLoc
Abv->Add(BitCodeAbbrevOp(0)); // hasExtInfo
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TSIType
// VarDecl
Abv->Add(BitCodeAbbrevOp(
BitCodeAbbrevOp::Fixed,
21)); // Packed Var Decl bits: Linkage, ModulesCodegen,
// SClass, TSCSpec, InitStyle,
// isARCPseudoStrong, IsThisDeclarationADemotedDefinition,
// isExceptionVariable, isNRVOVariable, isCXXForRangeDecl,
// isInline, isInlineSpecified, isConstexpr,
// isInitCapture, isPrevDeclInSameScope,
// EscapingByref, HasDeducedType, ImplicitParamKind, isObjCForDecl
Abv->Add(BitCodeAbbrevOp(0)); // VarKind (local enum)
// Type Source Info
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Array));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TypeLoc
DeclVarAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for DECL_CXX_METHOD
DeclCXXMethodAbbrev =
Stream.EmitAbbrev(getCXXMethodAbbrev<FunctionDecl::TK_NonTemplate>());
DeclTemplateCXXMethodAbbrev = Stream.EmitAbbrev(
getCXXMethodAbbrev<FunctionDecl::TK_FunctionTemplate>());
DeclDependentNonTemplateCXXMethodAbbrev = Stream.EmitAbbrev(
getCXXMethodAbbrev<FunctionDecl::TK_DependentNonTemplate>());
DeclMemberSpecializedCXXMethodAbbrev = Stream.EmitAbbrev(
getCXXMethodAbbrev<FunctionDecl::TK_MemberSpecialization>());
DeclTemplateSpecializedCXXMethodAbbrev = Stream.EmitAbbrev(
getCXXMethodAbbrev<FunctionDecl::TK_FunctionTemplateSpecialization>());
DeclDependentSpecializationCXXMethodAbbrev = Stream.EmitAbbrev(
getCXXMethodAbbrev<
FunctionDecl::TK_DependentFunctionTemplateSpecialization>());
// Abbreviation for DECL_TEMPLATE_TYPE_PARM
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_TEMPLATE_TYPE_PARM));
Abv->Add(BitCodeAbbrevOp(0)); // hasTypeConstraint
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
7)); // Packed DeclBits: ModuleOwnershipKind,
// isReferenced, isUsed, AccessSpecifier. Other
// higher bits should be 0: isImplicit,
// HasStandaloneLexicalDC, HasAttrs,
// TopLevelDeclInObjCContainer, isInvalidDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0));
// TypeDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type Ref
// TemplateTypeParmDecl
Abv->Add(
BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 1)); // wasDeclaredWithTypename
Abv->Add(BitCodeAbbrevOp(0)); // OwnsDefaultArg
DeclTemplateTypeParmAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for DECL_USING_SHADOW
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_USING_SHADOW));
// Redeclarable
Abv->Add(BitCodeAbbrevOp(0)); // No redeclaration
// Decl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed,
12)); // Packed DeclBits: HasStandaloneLexicalDC,
// isInvalidDecl, HasAttrs, isImplicit, isUsed,
// isReferenced, TopLevelDeclInObjCContainer,
// AccessSpecifier, ModuleOwnershipKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclContext
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // SubmoduleID
// NamedDecl
Abv->Add(BitCodeAbbrevOp(0)); // NameKind = Identifier
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Name
Abv->Add(BitCodeAbbrevOp(0));
// UsingShadowDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // TargetDecl
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 11)); // IDNS
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // UsingOrNextShadow
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR,
6)); // InstantiatedFromUsingShadowDecl
DeclUsingShadowAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_DECL_REF
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_DECL_REF));
// Stmt
// Expr
// PackingBits: DependenceKind, ValueKind. ObjectKind should be 0.
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 7));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// DeclRefExpr
// Packing Bits: , HadMultipleCandidates, RefersToEnclosingVariableOrCapture,
// IsImmediateEscalating, NonOdrUseReason.
// GetDeclFound, HasQualifier and ExplicitTemplateArgs should be 0.
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // DeclRef
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Location
DeclRefExprAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_INTEGER_LITERAL
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_INTEGER_LITERAL));
//Stmt
[clang][OpeMP] Model OpenMP structured-block in AST (PR40563) Summary: https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5.0.pdf, page 3: ``` structured block For C/C++, an executable statement, possibly compound, with a single entry at the top and a single exit at the bottom, or an OpenMP construct. COMMENT: See Section 2.1 on page 38 for restrictions on structured blocks. ``` ``` 2.1 Directive Format Some executable directives include a structured block. A structured block: • may contain infinite loops where the point of exit is never reached; • may halt due to an IEEE exception; • may contain calls to exit(), _Exit(), quick_exit(), abort() or functions with a _Noreturn specifier (in C) or a noreturn attribute (in C/C++); • may be an expression statement, iteration statement, selection statement, or try block, provided that the corresponding compound statement obtained by enclosing it in { and } would be a structured block; and Restrictions Restrictions to structured blocks are as follows: • Entry to a structured block must not be the result of a branch. • The point of exit cannot be a branch out of the structured block. C / C++ • The point of entry to a structured block must not be a call to setjmp(). • longjmp() and throw() must not violate the entry/exit criteria. ``` Of particular note here is the fact that OpenMP structured blocks are as-if `noexcept`, in the same sense as with the normal `noexcept` functions in C++. I.e. if throw happens, and it attempts to travel out of the `noexcept` function (here: out of the current structured-block), then the program terminates. Now, one of course can say that since it is explicitly prohibited by the Specification, then any and all programs that violate this Specification contain undefined behavior, and are unspecified, and thus no one should care about them. Just don't write broken code /s But i'm not sure this is a reasonable approach. I have personally had oss-fuzz issues of this origin - exception thrown inside of an OpenMP structured-block that is not caught, thus causing program termination. This issue isn't all that hard to catch, it's not any particularly different from diagnosing the same situation with the normal `noexcept` function. Now, clang static analyzer does not presently model exceptions. But clang-tidy has a simplisic [[ https://clang.llvm.org/extra/clang-tidy/checks/bugprone-exception-escape.html | bugprone-exception-escape ]] check, and it is even refactored as a `ExceptionAnalyzer` class for reuse. So it would be trivial to use that analyzer to check for exceptions escaping out of OpenMP structured blocks. (D59466) All that sounds too great to be true. Indeed, there is a caveat. Presently, it's practically impossible to do. To check a OpenMP structured block you need to somehow 'get' the OpenMP structured block, and you can't because it's simply not modelled in AST. `CapturedStmt`/`CapturedDecl` is not it's representation. Now, it is of course possible to write e.g. some AST matcher that would e.g. match every OpenMP executable directive, and then return the whatever `Stmt` is the structured block of said executable directive, if any. But i said //practically//. This isn't practical for the following reasons: 1. This **will** bitrot. That matcher will need to be kept up-to-date, and refreshed with every new OpenMP spec version. 2. Every single piece of code that would want that knowledge would need to have such matcher. Well, okay, if it is an AST matcher, it could be shared. But then you still have `RecursiveASTVisitor` and friends. `2 > 1`, so now you have code duplication. So it would be reasonable (and is fully within clang AST spirit) to not force every single consumer to do that work, but instead store that knowledge in the correct, and appropriate place - AST, class structure. Now, there is another hoop we need to get through. It isn't fully obvious //how// to model this. The best solution would of course be to simply add a `OMPStructuredBlock` transparent node. It would be optimal, it would give us two properties: * Given this `OMPExecutableDirective`, what's it OpenMP structured block? * It is trivial to check whether the `Stmt*` is a OpenMP structured block (`isa<OMPStructuredBlock>(ptr)`) But OpenMP structured block isn't **necessarily** the first, direct child of `OMP*Directive`. (even ignoring the clang's `CapturedStmt`/`CapturedDecl` that were inserted inbetween). So i'm not sure whether or not we could re-create AST statements after they were already created? There would be other costs to a new AST node: https://bugs.llvm.org/show_bug.cgi?id=40563#c12 ``` 1. You will need to break the representation of loops. The body should be replaced by the "structured block" entity. 2. You will need to support serialization/deserialization. 3. You will need to support template instantiation. 4. You will need to support codegen and take this new construct to account in each OpenMP directive. ``` Instead, there **is** an functionally-equivalent, alternative solution, consisting of two parts. Part 1: * Add a member function `isStandaloneDirective()` to the `OMPExecutableDirective` class, that will tell whether this directive is stand-alone or not, as per the spec. We need it because we can't just check for the existance of associated statements, see code comment. * Add a member function `getStructuredBlock()` to the OMPExecutableDirective` class itself, that assert that this is not a stand-alone directive, and either return the correct loop body if this is a loop-like directive, or the captured statement. This way, given an `OMPExecutableDirective`, we can get it's structured block. Also, since the knowledge is ingrained into the clang OpenMP implementation, it will not cause any duplication, and //hopefully// won't bitrot. Great we achieved 1 of 2 properties of `OMPStructuredBlock` approach. Thus, there is a second part needed: * How can we check whether a given `Stmt*` is `OMPStructuredBlock`? Well, we can't really, in general. I can see this workaround: ``` class FunctionASTVisitor : public RecursiveASTVisitor<FunctionASTVisitor> { using Base = RecursiveASTVisitor<FunctionASTVisitor>; public: bool VisitOMPExecDir(OMPExecDir *D) { OmpStructuredStmts.emplace_back(D.getStructuredStmt()); } bool VisitSOMETHINGELSE(???) { if(InOmpStructuredStmt) HI! } bool TraverseStmt(Stmt *Node) { if (!Node) return Base::TraverseStmt(Node); if (OmpStructuredStmts.back() == Node) ++InOmpStructuredStmt; Base::TraverseStmt(Node); if (OmpStructuredStmts.back() == Node) { OmpStructuredStmts.pop_back(); --InOmpStructuredStmt; } return true; } std::vector<Stmt*> OmpStructuredStmts; int InOmpStructuredStmt = 0; }; ``` But i really don't see using it in practice. It's just too intrusive; and again, requires knowledge duplication. .. but no. The solution lies right on the ground. Why don't we simply store this `i'm a openmp structured block` in the bitfield of the `Stmt` itself? This does not appear to have any impact on the memory footprint of the clang AST, since it's just a single extra bit in the bitfield. At least the static assertions don't fail. Thus, indeed, we can achieve both of the properties without a new AST node. We can cheaply set that bit right in sema, at the end of `Sema::ActOnOpenMPExecutableDirective()`, by just calling the `getStructuredBlock()` that we just added. Test coverage that demonstrates all this has been added. This isn't as great with serialization though. Most of it does not use abbrevs, so we do end up paying the full price (4 bytes?) instead of a single bit. That price, of course, can be reclaimed by using abbrevs. In fact, i suspect that //might// not just reclaim these bytes, but pack these PCH significantly. I'm not seeing a third solution. If there is one, it would be interesting to hear about it. ("just don't write code that would require `isa<OMPStructuredBlock>(ptr)`" is not a solution.) Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40563 | PR40563 ]]. Reviewers: ABataev, rjmccall, hfinkel, rsmith, riccibruno, gribozavr Reviewed By: ABataev, gribozavr Subscribers: mgorny, aaron.ballman, steveire, guansong, jfb, jdoerfert, cfe-commits Tags: #clang, #openmp Differential Revision: https://reviews.llvm.org/D59214 llvm-svn: 356570
2019-03-20 16:32:36 +00:00
// Expr
// DependenceKind, ValueKind, ObjectKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 10));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// Integer Literal
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Location
Abv->Add(BitCodeAbbrevOp(32)); // Bit Width
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Value
IntegerLiteralAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_CHARACTER_LITERAL
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_CHARACTER_LITERAL));
//Stmt
[clang][OpeMP] Model OpenMP structured-block in AST (PR40563) Summary: https://www.openmp.org/wp-content/uploads/OpenMP-API-Specification-5.0.pdf, page 3: ``` structured block For C/C++, an executable statement, possibly compound, with a single entry at the top and a single exit at the bottom, or an OpenMP construct. COMMENT: See Section 2.1 on page 38 for restrictions on structured blocks. ``` ``` 2.1 Directive Format Some executable directives include a structured block. A structured block: • may contain infinite loops where the point of exit is never reached; • may halt due to an IEEE exception; • may contain calls to exit(), _Exit(), quick_exit(), abort() or functions with a _Noreturn specifier (in C) or a noreturn attribute (in C/C++); • may be an expression statement, iteration statement, selection statement, or try block, provided that the corresponding compound statement obtained by enclosing it in { and } would be a structured block; and Restrictions Restrictions to structured blocks are as follows: • Entry to a structured block must not be the result of a branch. • The point of exit cannot be a branch out of the structured block. C / C++ • The point of entry to a structured block must not be a call to setjmp(). • longjmp() and throw() must not violate the entry/exit criteria. ``` Of particular note here is the fact that OpenMP structured blocks are as-if `noexcept`, in the same sense as with the normal `noexcept` functions in C++. I.e. if throw happens, and it attempts to travel out of the `noexcept` function (here: out of the current structured-block), then the program terminates. Now, one of course can say that since it is explicitly prohibited by the Specification, then any and all programs that violate this Specification contain undefined behavior, and are unspecified, and thus no one should care about them. Just don't write broken code /s But i'm not sure this is a reasonable approach. I have personally had oss-fuzz issues of this origin - exception thrown inside of an OpenMP structured-block that is not caught, thus causing program termination. This issue isn't all that hard to catch, it's not any particularly different from diagnosing the same situation with the normal `noexcept` function. Now, clang static analyzer does not presently model exceptions. But clang-tidy has a simplisic [[ https://clang.llvm.org/extra/clang-tidy/checks/bugprone-exception-escape.html | bugprone-exception-escape ]] check, and it is even refactored as a `ExceptionAnalyzer` class for reuse. So it would be trivial to use that analyzer to check for exceptions escaping out of OpenMP structured blocks. (D59466) All that sounds too great to be true. Indeed, there is a caveat. Presently, it's practically impossible to do. To check a OpenMP structured block you need to somehow 'get' the OpenMP structured block, and you can't because it's simply not modelled in AST. `CapturedStmt`/`CapturedDecl` is not it's representation. Now, it is of course possible to write e.g. some AST matcher that would e.g. match every OpenMP executable directive, and then return the whatever `Stmt` is the structured block of said executable directive, if any. But i said //practically//. This isn't practical for the following reasons: 1. This **will** bitrot. That matcher will need to be kept up-to-date, and refreshed with every new OpenMP spec version. 2. Every single piece of code that would want that knowledge would need to have such matcher. Well, okay, if it is an AST matcher, it could be shared. But then you still have `RecursiveASTVisitor` and friends. `2 > 1`, so now you have code duplication. So it would be reasonable (and is fully within clang AST spirit) to not force every single consumer to do that work, but instead store that knowledge in the correct, and appropriate place - AST, class structure. Now, there is another hoop we need to get through. It isn't fully obvious //how// to model this. The best solution would of course be to simply add a `OMPStructuredBlock` transparent node. It would be optimal, it would give us two properties: * Given this `OMPExecutableDirective`, what's it OpenMP structured block? * It is trivial to check whether the `Stmt*` is a OpenMP structured block (`isa<OMPStructuredBlock>(ptr)`) But OpenMP structured block isn't **necessarily** the first, direct child of `OMP*Directive`. (even ignoring the clang's `CapturedStmt`/`CapturedDecl` that were inserted inbetween). So i'm not sure whether or not we could re-create AST statements after they were already created? There would be other costs to a new AST node: https://bugs.llvm.org/show_bug.cgi?id=40563#c12 ``` 1. You will need to break the representation of loops. The body should be replaced by the "structured block" entity. 2. You will need to support serialization/deserialization. 3. You will need to support template instantiation. 4. You will need to support codegen and take this new construct to account in each OpenMP directive. ``` Instead, there **is** an functionally-equivalent, alternative solution, consisting of two parts. Part 1: * Add a member function `isStandaloneDirective()` to the `OMPExecutableDirective` class, that will tell whether this directive is stand-alone or not, as per the spec. We need it because we can't just check for the existance of associated statements, see code comment. * Add a member function `getStructuredBlock()` to the OMPExecutableDirective` class itself, that assert that this is not a stand-alone directive, and either return the correct loop body if this is a loop-like directive, or the captured statement. This way, given an `OMPExecutableDirective`, we can get it's structured block. Also, since the knowledge is ingrained into the clang OpenMP implementation, it will not cause any duplication, and //hopefully// won't bitrot. Great we achieved 1 of 2 properties of `OMPStructuredBlock` approach. Thus, there is a second part needed: * How can we check whether a given `Stmt*` is `OMPStructuredBlock`? Well, we can't really, in general. I can see this workaround: ``` class FunctionASTVisitor : public RecursiveASTVisitor<FunctionASTVisitor> { using Base = RecursiveASTVisitor<FunctionASTVisitor>; public: bool VisitOMPExecDir(OMPExecDir *D) { OmpStructuredStmts.emplace_back(D.getStructuredStmt()); } bool VisitSOMETHINGELSE(???) { if(InOmpStructuredStmt) HI! } bool TraverseStmt(Stmt *Node) { if (!Node) return Base::TraverseStmt(Node); if (OmpStructuredStmts.back() == Node) ++InOmpStructuredStmt; Base::TraverseStmt(Node); if (OmpStructuredStmts.back() == Node) { OmpStructuredStmts.pop_back(); --InOmpStructuredStmt; } return true; } std::vector<Stmt*> OmpStructuredStmts; int InOmpStructuredStmt = 0; }; ``` But i really don't see using it in practice. It's just too intrusive; and again, requires knowledge duplication. .. but no. The solution lies right on the ground. Why don't we simply store this `i'm a openmp structured block` in the bitfield of the `Stmt` itself? This does not appear to have any impact on the memory footprint of the clang AST, since it's just a single extra bit in the bitfield. At least the static assertions don't fail. Thus, indeed, we can achieve both of the properties without a new AST node. We can cheaply set that bit right in sema, at the end of `Sema::ActOnOpenMPExecutableDirective()`, by just calling the `getStructuredBlock()` that we just added. Test coverage that demonstrates all this has been added. This isn't as great with serialization though. Most of it does not use abbrevs, so we do end up paying the full price (4 bytes?) instead of a single bit. That price, of course, can be reclaimed by using abbrevs. In fact, i suspect that //might// not just reclaim these bytes, but pack these PCH significantly. I'm not seeing a third solution. If there is one, it would be interesting to hear about it. ("just don't write code that would require `isa<OMPStructuredBlock>(ptr)`" is not a solution.) Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=40563 | PR40563 ]]. Reviewers: ABataev, rjmccall, hfinkel, rsmith, riccibruno, gribozavr Reviewed By: ABataev, gribozavr Subscribers: mgorny, aaron.ballman, steveire, guansong, jfb, jdoerfert, cfe-commits Tags: #clang, #openmp Differential Revision: https://reviews.llvm.org/D59214 llvm-svn: 356570
2019-03-20 16:32:36 +00:00
// Expr
// DependenceKind, ValueKind, ObjectKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 10));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// Character Literal
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // getValue
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Location
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 3)); // getKind
CharacterLiteralAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_IMPLICIT_CAST
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_IMPLICIT_CAST));
// Stmt
// Expr
// Packing Bits: DependenceKind, ValueKind, ObjectKind,
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 10));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// CastExpr
Abv->Add(BitCodeAbbrevOp(0)); // PathSize
// Packing Bits: CastKind, StoredFPFeatures, isPartOfExplicitCast
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 9));
// ImplicitCastExpr
ExprImplicitCastAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_BINARY_OPERATOR
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_BINARY_OPERATOR));
// Stmt
// Expr
// Packing Bits: DependenceKind. ValueKind and ObjectKind should
// be 0 in this case.
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// BinaryOperator
Abv->Add(
BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // OpCode and HasFPFeatures
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
BinaryOperatorAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_COMPOUND_ASSIGN_OPERATOR
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_COMPOUND_ASSIGN_OPERATOR));
// Stmt
// Expr
// Packing Bits: DependenceKind. ValueKind and ObjectKind should
// be 0 in this case.
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 5));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// BinaryOperator
// Packing Bits: OpCode. The HasFPFeatures bit should be 0
Abv->Add(
BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // OpCode and HasFPFeatures
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
// CompoundAssignOperator
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // LHSType
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Result Type
CompoundAssignOperatorAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_CALL
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_CALL));
// Stmt
// Expr
// Packing Bits: DependenceKind, ValueKind, ObjectKind,
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 10));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// CallExpr
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // NumArgs
Abv->Add(BitCodeAbbrevOp(0)); // ADLCallKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
CallExprAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_CXX_OPERATOR_CALL
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_CXX_OPERATOR_CALL));
// Stmt
// Expr
// Packing Bits: DependenceKind, ValueKind, ObjectKind,
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 10));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// CallExpr
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // NumArgs
Abv->Add(BitCodeAbbrevOp(0)); // ADLCallKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
// CXXOperatorCallExpr
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Operator Kind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
CXXOperatorCallExprAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for EXPR_CXX_MEMBER_CALL
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::EXPR_CXX_MEMBER_CALL));
// Stmt
// Expr
// Packing Bits: DependenceKind, ValueKind, ObjectKind,
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Fixed, 10));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Type
// CallExpr
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // NumArgs
Abv->Add(BitCodeAbbrevOp(0)); // ADLCallKind
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
// CXXMemberCallExpr
CXXMemberCallExprAbbrev = Stream.EmitAbbrev(std::move(Abv));
// Abbreviation for STMT_COMPOUND
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::STMT_COMPOUND));
// Stmt
// CompoundStmt
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Num Stmts
Abv->Add(BitCodeAbbrevOp(0)); // hasStoredFPFeatures
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::VBR, 6)); // Source Location
CompoundStmtAbbrev = Stream.EmitAbbrev(std::move(Abv));
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_CONTEXT_LEXICAL));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));
DeclContextLexicalAbbrev = Stream.EmitAbbrev(std::move(Abv));
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_CONTEXT_VISIBLE));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));
DeclContextVisibleLookupAbbrev = Stream.EmitAbbrev(std::move(Abv));
[Serialization] Support loading template specializations lazily (#119333) Reland https://github.com/llvm/llvm-project/pull/83237 --- (Original comments) Currently all the specializations of a template (including instantiation, specialization and partial specializations) will be loaded at once if we want to instantiate another instance for the template, or find instantiation for the template, or just want to complete the redecl chain. This means basically we need to load every specializations for the template once the template declaration got loaded. This is bad since when we load a specialization, we need to load all of its template arguments. Then we have to deserialize a lot of unnecessary declarations. For example, ``` // M.cppm export module M; export template <class T> class A {}; export class ShouldNotBeLoaded {}; export class Temp { A<ShouldNotBeLoaded> AS; }; // use.cpp import M; A<int> a; ``` We have a specialization ` A<ShouldNotBeLoaded>` in `M.cppm` and we instantiate the template `A` in `use.cpp`. Then we will deserialize `ShouldNotBeLoaded` surprisingly when compiling `use.cpp`. And this patch tries to avoid that. Given that the templates are heavily used in C++, this is a pain point for the performance. This patch adds MultiOnDiskHashTable for specializations in the ASTReader. Then we will only deserialize the specializations with the same template arguments. We made that by using ODRHash for the template arguments as the key of the hash table. To review this patch, I think `ASTReaderDecl::AddLazySpecializations` may be a good entry point.
2024-12-11 09:40:47 +08:00
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_SPECIALIZATIONS));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));
DeclSpecializationsAbbrev = Stream.EmitAbbrev(std::move(Abv));
Abv = std::make_shared<BitCodeAbbrev>();
Abv->Add(BitCodeAbbrevOp(serialization::DECL_PARTIAL_SPECIALIZATIONS));
Abv->Add(BitCodeAbbrevOp(BitCodeAbbrevOp::Blob));
DeclPartialSpecializationsAbbrev = Stream.EmitAbbrev(std::move(Abv));
}
/// isRequiredDecl - Check if this is a "required" Decl, which must be seen by
/// consumers of the AST.
///
/// Such decls will always be deserialized from the AST file, so we would like
/// this to be as restrictive as possible. Currently the predicate is driven by
/// code generation requirements, if other clients have a different notion of
/// what is "required" then we may have to consider an alternate scheme where
/// clients can iterate over the top-level decls and get information on them,
/// without necessary deserializing them. We could explicitly require such
/// clients to use a separate API call to "realize" the decl. This should be
/// relatively painless since they would presumably only do it for top-level
/// decls.
static bool isRequiredDecl(const Decl *D, ASTContext &Context,
Module *WritingModule) {
// Named modules have different semantics than header modules. Every named
// module units owns a translation unit. So the importer of named modules
// doesn't need to deserilize everything ahead of time.
if (WritingModule && WritingModule->isNamedModule()) {
// The PragmaCommentDecl and PragmaDetectMismatchDecl are MSVC's extension.
// And the behavior of MSVC for such cases will leak this to the module
// users. Given pragma is not a standard thing, the compiler has the space
// to do their own decision. Let's follow MSVC here.
if (isa<PragmaCommentDecl, PragmaDetectMismatchDecl>(D))
return true;
return false;
}
// An ObjCMethodDecl is never considered as "required" because its
// implementation container always is.
// File scoped assembly or obj-c or OMP declare target implementation must be
// seen.
[clang-repl] Support statements on global scope in incremental mode. This patch teaches clang to parse statements on the global scope to allow: ``` ./bin/clang-repl clang-repl> int i = 12; clang-repl> ++i; clang-repl> extern "C" int printf(const char*,...); clang-repl> printf("%d\n", i); 13 clang-repl> %quit ``` Generally, disambiguating between statements and declarations is a non-trivial task for a C++ parser. The challenge is to allow both standard C++ to be translated as if this patch does not exist and in the cases where the user typed a statement to be executed as if it were in a function body. Clang's Parser does pretty well in disambiguating between declarations and expressions. We have added DisambiguatingWithExpression flag which allows us to preserve the existing and optimized behavior where needed and implement the extra rules for disambiguating. Only few cases require additional attention: * Constructors/destructors -- Parser::isConstructorDeclarator was used in to disambiguate between ctor-looking declarations and statements on the global scope(eg. `Ns::f()`). * The template keyword -- the template keyword can appear in both declarations and statements. This patch considers the template keyword to be a declaration starter which breaks a few cases in incremental mode which will be tackled later. * The inline (and similar) keyword -- looking at the first token in many cases allows us to classify what is a declaration. * Other language keywords and specifiers -- ObjC/ObjC++/OpenCL/OpenMP rely on pragmas or special tokens which will be handled in subsequent patches. The patch conceptually models a "top-level" statement into a TopLevelStmtDecl. The TopLevelStmtDecl is lowered into a void function with no arguments. We attach this function to the global initializer list to execute the statement blocks in the correct order. Differential revision: https://reviews.llvm.org/D127284
2022-06-08 09:59:40 +00:00
if (isa<FileScopeAsmDecl, TopLevelStmtDecl, ObjCImplDecl>(D))
return true;
if (WritingModule && isPartOfPerModuleInitializer(D)) {
// These declarations are part of the module initializer, and are emitted
// if and when the module is imported, rather than being emitted eagerly.
return false;
}
return Context.DeclMustBeEmitted(D);
}
void ASTWriter::WriteDecl(ASTContext &Context, Decl *D) {
PrettyDeclStackTraceEntry CrashInfo(Context, D, SourceLocation(),
"serializing");
// Determine the ID for this declaration.
LocalDeclID ID;
assert(!D->isFromASTFile() && "should not be emitting imported decl");
LocalDeclID &IDR = DeclIDs[D];
if (IDR.isInvalid())
IDR = NextDeclID++;
ID = IDR;
assert(ID >= FirstDeclID && "invalid decl ID");
RecordData Record;
ASTDeclWriter W(*this, Context, Record, GeneratingReducedBMI);
// Build a record for this declaration
W.Visit(D);
// Emit this declaration to the bitstream.
uint64_t Offset = W.Emit(D);
// Record the offset for this declaration
SourceLocation Loc = D->getLocation();
Reland "[Modules] No transitive source location change (#86912)" This relands 6c31104. The patch was reverted due to incorrectly introduced alignment. And the patch was re-commited after fixing the alignment issue. Following off are the original message: This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import C; ``` Almost every time A.cppm changes, we need to recompile `B`. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompiling `C` if the change from `A` wouldn't change the BMI of B. This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. Then the test shows that two BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from `A` and `A.v1`, so it is expected that the BMI `C.pcm` and `C.v1.pcm` can and should differ. To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as: ``` | | | _____ base offset of an imported module | | | |_____ base offset of another imported module | | | | | ___ 0 ``` As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for, ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assuming the offset of a source location (let's name the location as `S`) in a.cppm is 45 and we will record the value `45` into the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space for `S`, we can get it simply by adding `45` to `90` to `135`. Finally we can get the source location for `S` in module B as `135`. And when we want to write module `b`, we would also write the source location of `S` as `135` directly in the BMI. And to clarify the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90 in the BMI of `b`. Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it, let's say the value becomes to `200` somehow. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple, we can get it by `135 + (200 - 90) = 245`. In another word, the offset of a source location in current module can be computed as `Recorded Offset + Base Offset of the its module file - Recorded Base Offset`. Then we're almost done about how we handle the offset of source locations in serializers. From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in `b.pcm`, the source location of `S` will be recorded as `135` directly. And in the new design, the source location of `S` will be recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`. And `45` means the offset of `S` to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair. I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use `unsigned` to store source locations but we use uint64_t in serialization. And generally `unsigned` is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be: ``` |-----------------------|-----------------------| | A | B | C | * A: 32 bit. The index of the module file in the module manager + 1. * The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file * containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save * some space in case the index of the module file is 0. ``` (The B and C is the existing raw encoding for source locations) Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct. I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. --- The release nots and documentation will be added seperately.
2024-05-06 10:41:42 +08:00
SourceLocationEncoding::RawLocEncoding RawLoc =
getRawSourceLocationEncoding(getAdjustedLocation(Loc));
unsigned Index = ID.getRawValue() - FirstDeclID.getRawValue();
if (DeclOffsets.size() == Index)
Reland "[Modules] No transitive source location change (#86912)" This relands 6c31104. The patch was reverted due to incorrectly introduced alignment. And the patch was re-commited after fixing the alignment issue. Following off are the original message: This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import C; ``` Almost every time A.cppm changes, we need to recompile `B`. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompiling `C` if the change from `A` wouldn't change the BMI of B. This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. Then the test shows that two BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from `A` and `A.v1`, so it is expected that the BMI `C.pcm` and `C.v1.pcm` can and should differ. To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as: ``` | | | _____ base offset of an imported module | | | |_____ base offset of another imported module | | | | | ___ 0 ``` As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for, ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assuming the offset of a source location (let's name the location as `S`) in a.cppm is 45 and we will record the value `45` into the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space for `S`, we can get it simply by adding `45` to `90` to `135`. Finally we can get the source location for `S` in module B as `135`. And when we want to write module `b`, we would also write the source location of `S` as `135` directly in the BMI. And to clarify the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90 in the BMI of `b`. Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it, let's say the value becomes to `200` somehow. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple, we can get it by `135 + (200 - 90) = 245`. In another word, the offset of a source location in current module can be computed as `Recorded Offset + Base Offset of the its module file - Recorded Base Offset`. Then we're almost done about how we handle the offset of source locations in serializers. From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in `b.pcm`, the source location of `S` will be recorded as `135` directly. And in the new design, the source location of `S` will be recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`. And `45` means the offset of `S` to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair. I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use `unsigned` to store source locations but we use uint64_t in serialization. And generally `unsigned` is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be: ``` |-----------------------|-----------------------| | A | B | C | * A: 32 bit. The index of the module file in the module manager + 1. * The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file * containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save * some space in case the index of the module file is 0. ``` (The B and C is the existing raw encoding for source locations) Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct. I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. --- The release nots and documentation will be added seperately.
2024-05-06 10:41:42 +08:00
DeclOffsets.emplace_back(RawLoc, Offset, DeclTypesBlockStartOffset);
else if (DeclOffsets.size() < Index) {
// FIXME: Can/should this happen?
DeclOffsets.resize(Index+1);
Reland "[Modules] No transitive source location change (#86912)" This relands 6c31104. The patch was reverted due to incorrectly introduced alignment. And the patch was re-commited after fixing the alignment issue. Following off are the original message: This is part of "no transitive change" patch series, "no transitive source location change". I talked this with @Bigcheese in the tokyo's WG21 meeting. The idea comes from @jyknight posted on LLVM discourse. That for: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import C; ``` Almost every time A.cppm changes, we need to recompile `B`. Due to we think the source location is significant to the semantics. But it may be good if we can avoid recompiling `C` if the change from `A` wouldn't change the BMI of B. This patch only cares source locations. So let's focus on source location's example. We can see the full example from the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. Then the test shows that two BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records the source information from module A, which is different from `A` and `A.v1`, so it is expected that the BMI `C.pcm` and `C.v1.pcm` can and should differ. To fully understand the patch, we need to understand how we encodes source locations and how we serialize and deserialize them. For source locations, we encoded them as: ``` | | | _____ base offset of an imported module | | | |_____ base offset of another imported module | | | | | ___ 0 ``` As the diagram shows, we encode the local (unloaded) source location from 0 to higher bits. And we allocate the space for source locations from the loaded modules from high bits to 0. Then the source locations from the loaded modules will be mapped to our source location space according to the allocated offset. For example, for, ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assuming the offset of a source location (let's name the location as `S`) in a.cppm is 45 and we will record the value `45` into the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager will allocate a space for module 'a' (according to the recorded number of source locations) as the base offset of module 'a' in the current source location spaces. Let's assume the allocated base offset as 90 in this example. Then when we want to get the location in the current source location space for `S`, we can get it simply by adding `45` to `90` to `135`. Finally we can get the source location for `S` in module B as `135`. And when we want to write module `b`, we would also write the source location of `S` as `135` directly in the BMI. And to clarify the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90 in the BMI of `b`. Then the problem comes. Since the base offset of module 'a' is computed by the number source locations in module 'a'. In module 'b', the recorded base offset of module 'a' will change every time the number of source locations in module 'a' increase or decrease. In other words, the contents of BMI of B will change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch want to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it, let's say the value becomes to `200` somehow. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple, we can get it by `135 + (200 - 90) = 245`. In another word, the offset of a source location in current module can be computed as `Recorded Offset + Base Offset of the its module file - Recorded Base Offset`. Then we're almost done about how we handle the offset of source locations in serializers. From the abstract level, what we want to do is to remove the hardcoded base offset of imported modules and remain the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we will store the local offset of the location and the module file index. For the above example, in `b.pcm`, the source location of `S` will be recorded as `135` directly. And in the new design, the source location of `S` will be recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`. And `45` means the offset of `S` to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. And I feel it is worthy. The recompilation problem of modules is really annoying and there are still people complaining this. But if we can make this (including stopping other changes transitively), I think this may be a killer feature for modules. And from @Bigcheese , this should be helpful for clang explicit modules too. And the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. No significant change on compilation time. The size of .pcm files becomes to 204M from 200M. I think the trade-off is pretty fair. I didn't use another slot to record the module file index. I tried to use the higher 32 bits of the existing source location encodings to store that information. This design may be safe. Since we use `unsigned` to store source locations but we use uint64_t in serialization. And generally `unsigned` is 32 bit width in most platforms. So it might not be a safe problem. Since all the bits we used to store the module file index is not used before. So the new encodings may be: ``` |-----------------------|-----------------------| | A | B | C | * A: 32 bit. The index of the module file in the module manager + 1. * The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file * containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save * some space in case the index of the module file is 0. ``` (The B and C is the existing raw encoding for source locations) Another reason to reuse the same slot of the source location is to reduce the impact of the patch. Since there are a lot of places assuming we can store and get a source location from a slot. And if I tried to add another slot, a lot of codes breaks. I don't feel it is worhty. Another impact of this decision is that, the existing small optimizations for encoding source location may be invalided. The key of the optimization is that we can turn large values into small values then we can use VBR6 format to reduce the size. But if we decided to put the module file index into the higher bits, then maybe it simply doesn't work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files. I don't expect this impact the speed and memory use of compilations. And seeing my small experiments above, I feel this trade off is worthy. The mental model for handling source location offsets is not so complex and I believe we can solve it by adding module file index to each stored source location. For the practical side, since the source location is pretty sensitive, and the patch can pass all the in-tree tests and a small scale projects, I feel it should be correct. I'll continue to work on no transitive decl change and no transitive identifier change (if matters) to achieve the goal to stop the propagation of unnecessary changes. But all of this depends on this patch. Since, clearly, the source locations are the most sensitive thing. --- The release nots and documentation will be added seperately.
2024-05-06 10:41:42 +08:00
DeclOffsets[Index].setRawLoc(RawLoc);
DeclOffsets[Index].setBitOffset(Offset, DeclTypesBlockStartOffset);
} else {
llvm_unreachable("declarations should be emitted in ID order");
}
SourceManager &SM = Context.getSourceManager();
if (Loc.isValid() && SM.isLocalSourceLocation(Loc))
associateDeclWithFile(D, ID);
// Note declarations that should be deserialized eagerly so that we can add
// them to a record in the AST file later.
if (isRequiredDecl(D, Context, WritingModule))
AddDeclRef(D, EagerlyDeserializedDecls);
}
void ASTRecordWriter::AddFunctionDefinition(const FunctionDecl *FD) {
// Switch case IDs are per function body.
Writer->ClearSwitchCaseIDs();
assert(FD->doesThisDeclarationHaveABody());
bool ModulesCodegen = false;
if (!FD->isDependentContext()) {
std::optional<GVALinkage> Linkage;
if (Writer->WritingModule &&
Writer->WritingModule->isInterfaceOrPartition()) {
// When building a C++20 module interface unit or a partition unit, a
// strong definition in the module interface is provided by the
// compilation of that unit, not by its users. (Inline functions are still
// emitted in module users.)
Linkage = getASTContext().GetGVALinkageForFunction(FD);
ModulesCodegen = *Linkage >= GVA_StrongExternal;
}
if (Writer->getLangOpts().ModulesCodegen ||
Merge some of the PCH object support with modular codegen I was trying to pick this up a bit when reviewing D48426 (& perhaps D69778) - in any case, looks like D48426 added a module level flag that might not be needed. The D48426 implementation worked by setting a module level flag, then code generating contents from the PCH a special case in ASTContext::DeclMustBeEmitted would be used to delay emitting the definition of these functions if they came from a Module with this flag. This strategy is similar to the one initially implemented for modular codegen that was removed in D29901 in favor of the modular decls list and a bit on each decl to specify whether it's homed to a module. One major difference between PCH object support and modular code generation, other than the specific list of decls that are homed, is the compilation model: MSVC PCH modules are built into the object file for some other source file (when compiling that source file /Yc is specified to say "this compilation is where the PCH is homed"), whereas modular code generation invokes a separate compilation for the PCH alone. So the current modular code generation test of to decide if a decl should be emitted "is the module where this decl is serialized the current main file" has to be extended (as Lubos did in D69778) to also test the command line flag -building-pch-with-obj. Otherwise the whole thing is basically streamlined down to the modular code generation path. This even offers one extra material improvement compared to the existing divergent implementation: Homed functions are not emitted into object files that use the pch. Instead at -O0 they are not emitted into the IR at all, and at -O1 they are emitted using available_externally (existing functionality implemented for modular code generation). The pch-codegen test has been updated to reflect this new behavior. [If possible: I'd love it if we could not have the extra MSVC-style way of accessing dllexport-pch-homing, and just do it the modular codegen way, but I understand that it might be a limitation of existing build systems. @hans / @thakis: Do either of you know if it'd be practical to move to something more similar to .pcm handling, where the pch itself is passed to the compilation, rather than homed as a side effect of compiling some other source file?] Reviewers: llunak, hans Differential Revision: https://reviews.llvm.org/D83652
2020-07-12 15:36:56 -07:00
(FD->hasAttr<DLLExportAttr>() &&
Writer->getLangOpts().BuildingPCHWithObjectFile)) {
Merge some of the PCH object support with modular codegen I was trying to pick this up a bit when reviewing D48426 (& perhaps D69778) - in any case, looks like D48426 added a module level flag that might not be needed. The D48426 implementation worked by setting a module level flag, then code generating contents from the PCH a special case in ASTContext::DeclMustBeEmitted would be used to delay emitting the definition of these functions if they came from a Module with this flag. This strategy is similar to the one initially implemented for modular codegen that was removed in D29901 in favor of the modular decls list and a bit on each decl to specify whether it's homed to a module. One major difference between PCH object support and modular code generation, other than the specific list of decls that are homed, is the compilation model: MSVC PCH modules are built into the object file for some other source file (when compiling that source file /Yc is specified to say "this compilation is where the PCH is homed"), whereas modular code generation invokes a separate compilation for the PCH alone. So the current modular code generation test of to decide if a decl should be emitted "is the module where this decl is serialized the current main file" has to be extended (as Lubos did in D69778) to also test the command line flag -building-pch-with-obj. Otherwise the whole thing is basically streamlined down to the modular code generation path. This even offers one extra material improvement compared to the existing divergent implementation: Homed functions are not emitted into object files that use the pch. Instead at -O0 they are not emitted into the IR at all, and at -O1 they are emitted using available_externally (existing functionality implemented for modular code generation). The pch-codegen test has been updated to reflect this new behavior. [If possible: I'd love it if we could not have the extra MSVC-style way of accessing dllexport-pch-homing, and just do it the modular codegen way, but I understand that it might be a limitation of existing build systems. @hans / @thakis: Do either of you know if it'd be practical to move to something more similar to .pcm handling, where the pch itself is passed to the compilation, rather than homed as a side effect of compiling some other source file?] Reviewers: llunak, hans Differential Revision: https://reviews.llvm.org/D83652
2020-07-12 15:36:56 -07:00
// Under -fmodules-codegen, codegen is performed for all non-internal,
// non-always_inline functions, unless they are available elsewhere.
if (!FD->hasAttr<AlwaysInlineAttr>()) {
if (!Linkage)
Linkage = getASTContext().GetGVALinkageForFunction(FD);
ModulesCodegen =
*Linkage != GVA_Internal && *Linkage != GVA_AvailableExternally;
}
}
}
Record->push_back(ModulesCodegen);
if (ModulesCodegen)
Writer->AddDeclRef(FD, Writer->ModularCodegenDecls);
if (auto *CD = dyn_cast<CXXConstructorDecl>(FD)) {
Record->push_back(CD->getNumCtorInitializers());
if (CD->getNumCtorInitializers())
AddCXXCtorInitializers(llvm::ArrayRef(CD->init_begin(), CD->init_end()));
}
AddStmt(FD->getBody());
}