This patch swaps out the `void *` behind `CXFile` from `FileEntry *` to `FileEntryRef::MapEntry *`. This allows us to remove some deprecated uses of `FileEntry::getName()`.
Depends on D151854.
Reviewed By: benlangmuir
Differential Revision: https://reviews.llvm.org/D151938
I assume it's optional and ReadAST does not have to set the
counter on success.
Without the patch msan complains that we pass
uninitialized value into noundef parameters of setCounterValue.
Reviewed By: bnbarham, barannikov88
Differential Revision: https://reviews.llvm.org/D150492
The original name "ASTContext::getNamedModuleForCodeGen" is not properly
reflecting the usage of the interface. This interface can be used to
judge the current module unit in both sema analysis and code generation.
So the original name was not so correct.
Decl::isInCurrentModuleUnit
Refactor `Sema::isModuleUnitOfCurrentTU` to `Decl::isInCurrentModuleUnit`
to make code simpler a little bit. Note that although this patch
introduces a FIXME, this is an existing issue and this patch just tries
to describe it explicitly.
We have no use for debug info for the scanner modules, and writing raw
ast files speeds up scanning ~15% in some cases. Note that the compile
commands produced by the scanner will still build the obj format (if
requested), and the scanner can *read* obj format pcms, e.g. from a PCH.
rdar://108807592
Differential Revision: https://reviews.llvm.org/D149693
This commit allows libclang API users to opt into storing PCH in memory
instead of temporary files. The option can be set only during CXIndex
construction to avoid multithreading issues and confusion or bugs if
some preambles are stored in temporary files and others - in memory.
The added API works as expected in KDevelop:
https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/283
Differential Revision: https://reviews.llvm.org/D145974
TempPCHFile::create() calls llvm::sys::fs::createTemporaryFile() to
create a file named preamble-*.pch in a system temporary directory. This
commit allows overriding the directory where these often many and large
preamble-*.pch files are stored.
The referenced bug report requests the ability to override the temporary
directory path used by libclang. However, overriding the return value of
llvm::sys::path::system_temp_directory() was rejected during code review
as improper and because it would negatively affect multithreading
performance. Finding all places where libclang uses the temporary
directory is very difficult. Therefore this commit is limited to
override libclang's single known use of the temporary directory.
This commit allows to override the preamble storage path only during
CXIndex construction to avoid multithreading issues and ensure that all
preambles are stored in the same directory. For the same multithreading
and consistency reasons, this commit deprecates
clang_CXIndex_setGlobalOptions() and
clang_CXIndex_setInvocationEmissionPathOption() in favor of specifying
these options during CXIndex construction.
Adding a new CXIndex constructor function each time a new initialization
argument is needed leads to either a large number of function parameters
unneeded by most libclang users or to an exponential number of overloads
that support different usage requirements. Therefore this commit
introduces a new extensible struct CXIndexOptions and a general function
clang_createIndexWithOptions().
A libclang user passes a desired preamble storage path to
clang_createIndexWithOptions(), which stores it in
CIndexer::PreambleStoragePath. Whenever
clang_parseTranslationUnit_Impl() is called, it passes
CIndexer::PreambleStoragePath to ASTUnit::LoadFromCommandLine(), which
stores this argument in ASTUnit::PreambleStoragePath. Whenever
ASTUnit::getMainBufferWithPrecompiledPreamble() is called, it passes
ASTUnit::PreambleStoragePath to PrecompiledPreamble::Build().
PrecompiledPreamble::Build() forwards the corresponding StoragePath
argument to TempPCHFile::create(). If StoragePath is not empty,
TempPCHFile::create() stores the preamble-*.pch file in the directory at
the specified path rather than in the system temporary directory.
The analysis below proves that this passing around of the
PreambleStoragePath string is sufficient to guarantee that the libclang
user override is used in TempPCHFile::create(). The analysis ignores API
uses in test code.
TempPCHFile::create() is called only in PrecompiledPreamble::Build().
PrecompiledPreamble::Build() is called only in two places: one in
clangd, which is not used by libclang, and one in
ASTUnit::getMainBufferWithPrecompiledPreamble().
ASTUnit::getMainBufferWithPrecompiledPreamble() is called in 3 places:
ASTUnit::LoadFromCompilerInvocation() [analyzed below].
ASTUnit::Reparse(), which in turn is called only from
clang_reparseTranslationUnit_Impl(), which in turn is called only from
clang_reparseTranslationUnit(). clang_reparseTranslationUnit() is never
called in LLVM code, but is part of public libclang API. This function's
documentation requires its translation unit argument to have been built
with clang_createTranslationUnitFromSourceFile().
clang_createTranslationUnitFromSourceFile() delegates its work to
clang_parseTranslationUnit(), which delegates to
clang_parseTranslationUnit2(), which delegates to
clang_parseTranslationUnit2FullArgv(), which delegates to
clang_parseTranslationUnit_Impl(), which passes
CIndexer::PreambleStoragePath to the ASTUnit it creates.
ASTUnit::CodeComplete() passes AllowRebuild = false to
ASTUnit::getMainBufferWithPrecompiledPreamble(), which makes it return
nullptr before calling PrecompiledPreamble::Build().
Both ASTUnit::LoadFromCompilerInvocation() overloads (one of which
delegates its work to another) call
ASTUnit::getMainBufferWithPrecompiledPreamble() only if their argument
PrecompilePreambleAfterNParses > 0. LoadFromCompilerInvocation() is
called in:
ASTBuilderAction::runInvocation() keeps the default parameter value
of PrecompilePreambleAfterNParses = 0, meaning that the preamble file is
never created from here.
ASTUnit::LoadFromCommandLine().
ASTUnit::LoadFromCommandLine() is called in two places:
CrossTranslationUnitContext::ASTLoader::loadFromSource() keeps the
default parameter value of PrecompilePreambleAfterNParses = 0, meaning
that the preamble file is never created from here.
clang_parseTranslationUnit_Impl(), which passes
CIndexer::PreambleStoragePath to the ASTUnit it creates.
Therefore, the overridden preamble storage path is always used in
TempPCHFile::create().
TempPCHFile::create() uses PreambleStoragePath in the same way as
LibclangInvocationReporter() uses InvocationEmissionPath. The existing
documentation for clang_CXIndex_setInvocationEmissionPathOption() does
not specify ownership, encoding, separator or relative vs absolute path
requirements. So the documentation for
CXIndexOptions::PreambleStoragePath doesn't either. The assumptions are:
no ownership transfer;
UTF-8 encoding;
native separators.
Both relative and absolute paths are supported.
The added API works as expected in KDevelop:
https://invent.kde.org/kdevelop/kdevelop/-/merge_requests/283
Fixes: https://github.com/llvm/llvm-project/issues/51847
Differential Revision: https://reviews.llvm.org/D143418
Previously we'll set the named modules for ASTContext in ParseAST. But
this is not intuitive and we need comments to tell the intuition. This
patch moves the code the right the place, where the corrresponding
module is first created/loaded. Now it is more intuitive and we can use
the value in the earlier places.
This reverts commit c5abe893120b115907376359a5809229a9f9608a.
This reverts commit a033dbbe5c43247b60869b008e67ed86ed230eaa.
This broke the build with -DLLVM_LINK_LLVM_DYLIB=ON. Reverting while I
investigate.
Every Clang instance uses an internal FileSystemStatCache to avoid
stating the same content multiple times. However, different instances
of Clang will contend for filesystem access for their initial stats
during HeaderSearch or module validation.
On some workloads, the time spent in the kernel in these concurrent
stat calls has been measured to be over 20% of the overall compilation
time. This is extremly wassteful when most of the stat calls target
mostly immutable content like a SDK.
This commit introduces a new tool `clang-stat-cache` able to generate
an OnDiskHashmap containing the stat data for a given filesystem
hierarchy.
The driver part of this has been modeled after -ivfsoverlay given
the similarities with what it influences. It introduces a new
-ivfsstatcache driver option to instruct Clang to use a stat cache
generated by `clang-stat-cache`. These stat caches are inserted at
the bottom of the VFS stack (right above the real filesystem).
Differential Revision: https://reviews.llvm.org/D136651
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated. The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.
This is part of an effort to migrate from llvm::Optional to
std::optional:
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
With implicitly built modules, the importing `CompilerInstance` assumes PCMs were built in a "compatible way" (i.e. with similarly set up instance). Either because their context hash matches, or because this instance has just built them.
There are some use-cases, however, where this assumption doesn't hold, libclang/c-index-test being one of them. There, the importing instance (or `ASTUnit`) is being set up while the PCM file is being deserialized. Until now, we've assumed the serialized paths to input files are the actual on-disk files, meaning the default physical VFS was always able to resolve them. This won't be the case after D135636. Therefore, this patch makes sure `ASTUnit` is initialized with the same VFS as the PCM it's deserializing - by storing paths to the VFS overlay files into the PCM itself.
For the VFS overlay files to be adopted at the very start of PCM deserialization, they are stored in a new section in the unhashed control block, together with header search paths and system header prefixes. The move to the unhashed control block should be safe: if two modules were built with different header search paths and they produced different results, the hashed part of the PCM file will reflect that.
Reviewed By: akyrtzi, benlangmuir
Differential Revision: https://reviews.llvm.org/D135634
This is generally a better default for tools other than the compiler, which
shouldn't assume a PCH file on disk is something they can consume.
Preserve the old behavior in places associated with libclang/c-index-test
(including ASTUnit) as there are tests relying on it and most important
consumers are out-of-tree. It's unclear whether the tests are specifically
trying to test this functionality, and what the downstream implications of
removing it are. Hopefully someone more familiar can clean this up in future.
Differential Revision: https://reviews.llvm.org/D125149
Implementation is based on the "expected type" as used for
designated-initializers in braced init lists. This means it can deduce the type
in some cases where it's not written:
void foo(Widget);
foo({ /*help here*/ });
Only basic constructor calls are in scope of this patch, excluded are:
- aggregate initialization (no help is offered for aggregates)
- initializer_list initialization (no help is offered for these constructors)
Fixes https://github.com/clangd/clangd/issues/306
Differential Revision: https://reviews.llvm.org/D116317
Only the bare name is completed, with no args.
For args to be useful we need arg names. These *are* in the tablegen but
not currently emitted in usable form, so left this as future work.
C++11, C2x, GNU, declspec, MS syntax is supported, with the appropriate
spellings of attributes suggested.
`#pragma clang attribute` is supported but not terribly useful as we
only reach completion if parens are balanced (i.e. the line is not truncated)
There's no filtering of which attributes might make sense in this
grammatical context (e.g. attached to a function). In code-completion context
this is hard to do, and will only work in few cases :-(
There's also no filtering by langopts: this is because currently the
only way of checking is to try to produce diagnostics, which requires a
valid ParsedAttr which is hard to get.
This should be fairly simple to fix but requires some tablegen changes
to expose the logic without the side-effect.
Differential Revision: https://reviews.llvm.org/D107696
Original commit message:
[clang-repl] Implement partial translation units and error recovery.
https://reviews.llvm.org/D96033 contained a discussion regarding efficient
modeling of error recovery. @rjmccall has outlined the key ideas:
Conceptually, we can split the translation unit into a sequence of partial
translation units (PTUs). Every declaration will be associated with a unique PTU
that owns it.
The first key insight here is that the owning PTU isn't always the "active"
(most recent) PTU, and it isn't always the PTU that the declaration
"comes from". A new declaration (that isn't a redeclaration or specialization of
anything) does belong to the active PTU. A template specialization, however,
belongs to the most recent PTU of all the declarations in its signature - mostly
that means that it can be pulled into a more recent PTU by its template
arguments.
The second key insight is that processing a PTU might extend an earlier PTU.
Rolling back the later PTU shouldn't throw that extension away. For example, if
the second PTU defines a template, and the third PTU requires that template to
be instantiated at float, that template specialization is still part of the
second PTU. Similarly, if the fifth PTU uses an inline function belonging to the
fourth, that definition still belongs to the fourth. When we go to emit code in
a new PTU, we map each declaration we have to emit back to its owning PTU and
emit it in a new module for just the extensions to that PTU. We keep track of
all the modules we've emitted for a PTU so that we can unload them all if we
decide to roll it back.
Most declarations/definitions will only refer to entities from the same or
earlier PTUs. However, it is possible (primarily by defining a
previously-declared entity, but also through templates or ADL) for an entity
that belongs to one PTU to refer to something from a later PTU. We will have to
keep track of this and prevent unwinding to later PTU when we recognize it.
Fortunately, this should be very rare; and crucially, we don't have to do the
bookkeeping for this if we've only got one PTU, e.g. in normal compilation.
Otherwise, PTUs after the first just need to record enough metadata to be able
to revert any changes they've made to declarations belonging to earlier PTUs,
e.g. to redeclaration chains or template specialization lists.
It should even eventually be possible for PTUs to provide their own slab
allocators which can be thrown away as part of rolling back the PTU. We can
maintain a notion of the active allocator and allocate things like Stmt/Expr
nodes in it, temporarily changing it to the appropriate PTU whenever we go to do
something like instantiate a function template. More care will be required when
allocating declarations and types, though.
We would want the PTU to be efficiently recoverable from a Decl; I'm not sure
how best to do that. An easy option that would cover most declarations would be
to make multiple TranslationUnitDecls and parent the declarations appropriately,
but I don't think that's good enough for things like member function templates,
since an instantiation of that would still be parented by its original class.
Maybe we can work this into the DC chain somehow, like how lexical DCs are.
We add a different kind of translation unit `TU_Incremental` which is a
complete translation unit that we might nonetheless incrementally extend later.
Because it is complete (and we might want to generate code for it), we do
perform template instantiation, but because it might be extended later, we don't
warn if it declares or uses undefined internal-linkage symbols.
This patch teaches clang-repl how to recover from errors by disconnecting the
most recent PTU and update the primary PTU lookup tables. For instance:
```./clang-repl
clang-repl> int i = 12; error;
In file included from <<< inputs >>>:1:
input_line_0:1:13: error: C++ requires a type specifier for all declarations
int i = 12; error;
^
error: Parsing failed.
clang-repl> int i = 13; extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=13
clang-repl> quit
```
Differential revision: https://reviews.llvm.org/D104918
This reverts commit 6775fc6ffa3ca1c36b20c25fa4e7f48f81213cf2.
It also reverts "[lldb] Fix compilation by adjusting to the new ASTContext signature."
This reverts commit 03a3f86071c10a1f6cbbf7375aa6fe9d94168972.
We see some failures on the lldb infrastructure, these changes might play a role
in it. Let's revert it now and see if the bots will become green.
Ref: https://reviews.llvm.org/D104918
https://reviews.llvm.org/D96033 contained a discussion regarding efficient
modeling of error recovery. @rjmccall has outlined the key ideas:
Conceptually, we can split the translation unit into a sequence of partial
translation units (PTUs). Every declaration will be associated with a unique PTU
that owns it.
The first key insight here is that the owning PTU isn't always the "active"
(most recent) PTU, and it isn't always the PTU that the declaration
"comes from". A new declaration (that isn't a redeclaration or specialization of
anything) does belong to the active PTU. A template specialization, however,
belongs to the most recent PTU of all the declarations in its signature - mostly
that means that it can be pulled into a more recent PTU by its template
arguments.
The second key insight is that processing a PTU might extend an earlier PTU.
Rolling back the later PTU shouldn't throw that extension away. For example, if
the second PTU defines a template, and the third PTU requires that template to
be instantiated at float, that template specialization is still part of the
second PTU. Similarly, if the fifth PTU uses an inline function belonging to the
fourth, that definition still belongs to the fourth. When we go to emit code in
a new PTU, we map each declaration we have to emit back to its owning PTU and
emit it in a new module for just the extensions to that PTU. We keep track of
all the modules we've emitted for a PTU so that we can unload them all if we
decide to roll it back.
Most declarations/definitions will only refer to entities from the same or
earlier PTUs. However, it is possible (primarily by defining a
previously-declared entity, but also through templates or ADL) for an entity
that belongs to one PTU to refer to something from a later PTU. We will have to
keep track of this and prevent unwinding to later PTU when we recognize it.
Fortunately, this should be very rare; and crucially, we don't have to do the
bookkeeping for this if we've only got one PTU, e.g. in normal compilation.
Otherwise, PTUs after the first just need to record enough metadata to be able
to revert any changes they've made to declarations belonging to earlier PTUs,
e.g. to redeclaration chains or template specialization lists.
It should even eventually be possible for PTUs to provide their own slab
allocators which can be thrown away as part of rolling back the PTU. We can
maintain a notion of the active allocator and allocate things like Stmt/Expr
nodes in it, temporarily changing it to the appropriate PTU whenever we go to do
something like instantiate a function template. More care will be required when
allocating declarations and types, though.
We would want the PTU to be efficiently recoverable from a Decl; I'm not sure
how best to do that. An easy option that would cover most declarations would be
to make multiple TranslationUnitDecls and parent the declarations appropriately,
but I don't think that's good enough for things like member function templates,
since an instantiation of that would still be parented by its original class.
Maybe we can work this into the DC chain somehow, like how lexical DCs are.
We add a different kind of translation unit `TU_Incremental` which is a
complete translation unit that we might nonetheless incrementally extend later.
Because it is complete (and we might want to generate code for it), we do
perform template instantiation, but because it might be extended later, we don't
warn if it declares or uses undefined internal-linkage symbols.
This patch teaches clang-repl how to recover from errors by disconnecting the
most recent PTU and update the primary PTU lookup tables. For instance:
```./clang-repl
clang-repl> int i = 12; error;
In file included from <<< inputs >>>:1:
input_line_0:1:13: error: C++ requires a type specifier for all declarations
int i = 12; error;
^
error: Parsing failed.
clang-repl> int i = 13; extern "C" int printf(const char*,...);
clang-repl> auto r1 = printf("i=%d\n", i);
i=13
clang-repl> quit
```
Differential revision: https://reviews.llvm.org/D104918
As proposed in D97109, I tried to make target creation consistent in `clang` and `clangd` by replacing the original procedure with a single function introduced in D97493.
This also helps `clangd` works with CUDA, OpenMP, etc.
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D98128
Clarify that `PrecompiledPreamble::CanReuse` requires non-null arguments
for `VFS` and `MainFileBuffer`, taking them by reference instead of by
pointer.
Differential Revision: https://reviews.llvm.org/D91297
This addresses an issue with how the PCH preable works, specifically:
1. When using a PCH/preamble the module hash changes and a different cache directory is used
2. When the preamble is used, PCH & PCM validation is disabled.
Due to combination of #1 and #2, reparsing with preamble enabled can end up loading a stale module file before a header change and using it without updating it because validation is disabled and it doesn’t check that the header has changed and the module file is out-of-date.
rdar://72611253
Differential Revision: https://reviews.llvm.org/D95159
Clarify the logic for using the preamble (and overriding the main file
buffer) in `ASTUnit::CodeComplete` by factoring out a couple of lambdas
(`getUniqueID` and `hasSameUniqueID`). While refactoring the logic,
hoist the check for `Line > 1` and locally check if the filenames are
equal (both to avoid unnecessary `stat` calls) and skip copying out the
filenames to `std::string`.
Besides fewer calls to `stat`, there's no functionality change here.
Differential Revision: https://reviews.llvm.org/D91296
`ASTUnit::Parse` sets up the `FileManager` earlier in the same function,
ensuring `ASTUnit::getFileManager()` matches `Clang->getFileManager()`.
Remove the later call to `setFileManager(getFileManager())` since it
does nothing.
Differential Revision: https://reviews.llvm.org/D90888
As with precompiled headers, it's useful for indexers to be able to
continue through compiler errors in dependent modules.
Resolves rdar://69816264
Reviewed By: akyrtzi
Differential Revision: https://reviews.llvm.org/D91580
Avoid requiring an actual MemoryBuffer in ComputePreambleBounds, when
a MemoryBufferRef will do just fine.
Differential Revision: https://reviews.llvm.org/D90890
Some early errors during the ASTUnit creation were not transferred to the `FailedParseDiagnostic` so when the code in `LoadFromCommandLine` swaps its content with the content of `StoredDiagnostics` they cannot be retrieved by the user in any way.
Reviewed By: andrewrk, dblaikie
Differential Revision: https://reviews.llvm.org/D78658
In order to drop the final callers to `SourceManager::getBuffer`, change
`FrontendInputFile` to use `Optional<MemoryBufferRef>`. Also updated
the "unowned" version of `SourceManager::createFileID` to take a
`MemoryBufferRef` (it now calls `MemoryBuffer::getMemBuffer`, which
creates a `MemoryBuffer` that does not own the buffer data).
Differential Revision: https://reviews.llvm.org/D89427