Alternatives to https://reviews.llvm.org/D153114.
Try to address https://github.com/clangd/clangd/issues/1293.
See the links for design ideas and the consensus so far. We want to have
some initial support in clang18.
This is the initial support for C++20 Modules in clangd.
As suggested by sammccall in https://reviews.llvm.org/D153114,
we should minimize the scope of the initial patch to make it easier
to review and understand so that every one are in the same page:
> Don't attempt any cross-file or cross-version coordination: i.e. don't
> try to reuse BMIs between different files, don't try to reuse BMIs
> between (preamble) reparses of the same file, don't try to persist the
> module graph. Instead, when building a preamble, synchronously scan
> for the module graph, build the required PCMs on the single preamble
> thread with filenames private to that preamble, and then proceed to
> build the preamble.
This patch reflects the above opinions.
# Testing in real-world project
I tested this with a modularized library:
https://github.com/alibaba/async_simple/tree/CXX20Modules. This library
has 3 modules (async_simple, std and asio) and 65 module units. (Note
that a module consists of multiple module units). Both `std` module and
`asio` module have 100k+ lines of code (maybe more, I didn't count). And
async_simple itself has 8k lines of code. This is the scale of the
project.
The result shows that it works pretty well, ..., well, except I need to
wait roughly 10s after opening/editing any file. And this falls in our
expectations. We know it is hard to make it perfect in the first move.
# What this patch does in detail
- Introduced an option `--experimental-modules-support` for the support
for C++20 Modules. So that no matter how bad this is, it wouldn't affect
current users. Following off the page, we'll assume the option is
enabled.
- Introduced two classes `ModuleFilesInfo` and
`ModuleDependencyScanner`. Now `ModuleDependencyScanner` is only used by
`ModuleFilesInfo`.
- The class `ModuleFilesInfo` records the built module files for
specific single source file. The module files can only be built by the
static member function `ModuleFilesInfo::buildModuleFilesInfoFor(PathRef
File, ...)`.
- The class `PreambleData` adds a new member variable with type
`ModuleFilesInfo`. This refers to the needed module files for the
current file. It means the module files info is part of the preamble,
which is suggested in the first patch too.
- In `isPreambleCompatible()`, we add a call to
`ModuleFilesInfo::CanReuse()` to check if the built module files are
still up to date.
- When we build the AST for a source file, we will load the built module
files from ModuleFilesInfo.
# What we need to do next
Let's split the TODOs into clang part and clangd part to make things
more clear.
The TODOs in the clangd part include:
1. Enable reusing module files across source files. The may require us
to bring a ModulesManager like thing which need to handle `scheduling`,
`the possibility of BMI version conflicts` and `various events that can
invalidate the module graph`.
2. Get a more efficient method to get the `<module-name> ->
<module-unit-source>` map. Currently we always scan the whole project
during `ModuleFilesInfo::buildModuleFilesInfoFor(PathRef File, ...)`.
This is clearly inefficient even if the scanning process is pretty fast.
I think the potential solutions include:
- Make a global scanner to monitor the state of every source file like I
did in the first patch. The pain point is that we need to take care of
the data races.
- Ask the build systems to provide the map just like we ask them to
provide the compilation database.
3. Persist the module files. So that we can reuse module files across
clangd invocations or even across clangd instances.
TODOs in the clang part include:
1. Clang should offer an option/mode to skip writing/reading the bodies
of the functions. Or even if we can requrie the parser to skip parsing
the function bodies.
And it looks like we can say the support for C++20 Modules is initially
workable after we made (1) and (2) (or even without (2)).
Building ASTs with compile flags that are incompatible to the ones used
for the Preamble are not really supported by clang and can trigger
crashes.
In an ideal world, we should be re-using not only TargetOpts, but the
full ParseInputs from the Preamble to prevent such failures.
Unfortunately current contracts of ThreadSafeFS makes this a non-safe
change for certain implementations. As there are no guarantees that the
same ThreadSafeFS is going to be valid in the Context::current() we're
building the AST in.
We've been running this internally for months now, without any
stability or correctness concerns. It has ~40% speed up on incremental
diagnostics latencies (as preamble can get invalidated through code completion
etc.).
Differential Revision: https://reviews.llvm.org/D153882
We would like to move the preamble index out of the critical path.
This patch is an RFC to get feedback on the correct implementation and potential pitfalls to keep into consideration.
I am not entirely sure if the lazy AST initialisation would create using Preamble AST in parallel. I tried with tsan enabled clangd but it seems to work OK (at least for the cases I tried)
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D148088
Since we redefine all macros in preamble-patch, and it's parsed after
consuming the preamble macros, we can get false missing-include diagnostics
while a fresh preamble is being rebuilt.
This patch makes sure preamble-patch is treated same as main file for
include-cleaner purposes.
Differential Revision: https://reviews.llvm.org/D148143
Translates diagnostics from baseline preamble to relevant modified
contents.
Translation is done by looking for a set of lines that have the same
contents in diagnostic/note/fix ranges inside baseline and modified
contents.
A diagnostic is preserved if its main range is outside of main file or
there's a translation from baseline to modified contents. Later on fixes
and notes attached to that diagnostic with relevant ranges are also
translated and preserved.
Depends on D143095
Differential Revision: https://reviews.llvm.org/D143096
Also wire it up for use with patched preambles and introduce test cases
for behaviour we'd like to improve.
Differential Revision: https://reviews.llvm.org/D142890
A prototype of using include-cleaner library in clangd:
- (re)implement clangd's "unused include" warnings with the library
- the new implementation is hidden under a flag `Config::UnusedIncludesPolicy::Experiment`
Differential Revision: https://reviews.llvm.org/D140875
In some deployments, for example when running on FUSE or using some
network-based VFS implementation, the filesystem operations might add up
to a significant fraction of preamble build time. This change allows us
to track time spent in FS operations to better understand the problem.
Differential Revision: https://reviews.llvm.org/D121712
Xcode uses `#pragma mark -` to draw a divider in the outline view
and `#pragma mark Note` to add `Note` in the outline view. For more
information, see https://nshipster.com/pragma/.
Since the LSP spec doesn't contain dividers for the symbol outline,
instead we treat `#pragma mark -` as a group with children - the
decls that come after it, implicitly terminating when the symbol's
parent ends.
The following code:
```
@implementation MyClass
- (id)init {}
- (int)foo;
@end
```
Would give an outline like
```
MyClass
> Overrides
> init
> Public Accessors
> foo
```
Differential Revision: https://reviews.llvm.org/D105904
Fixes https://github.com/clangd/clangd/issues/819
SourceLocation of macros change when a header file is included above it. This is not checked when creating a PreamblePatch, resulting in reusing previously built preamble with an incorrect source location for the macro in the example test case.
This patch stores the SourceLocation in the struct TextualPPDirective so that it gets checked when comparing old vs new preambles.
Also creates a preamble patch for code completion parsing so that clangd does not crash when following the example test case with a large file.
Reviewed By: kadircet
Differential Revision: https://reviews.llvm.org/D108045
Xcode uses `#pragma mark -` to draw a divider in the outline view
and `#pragma mark Note` to add `Note` in the outline view. For more
information, see https://nshipster.com/pragma/.
Since the LSP spec doesn't contain dividers for the symbol outline,
instead we treat `#pragma mark -` as a group with children - the
decls that come after it, implicitly terminating when the symbol's
parent ends.
The following code:
```
@implementation MyClass
- (id)init {}
- (int)foo;
@end
```
Would give an outline like
```
MyClass
> Overrides
> init
> Public Accessors
> foo
```
Differential Revision: https://reviews.llvm.org/D105904
Implement initial support for pull-based diagnostics in ClangdServer.
This is planned for LSP 3.17, and initial proposal is in
d15eb0671e/protocol/src/common/proposed.diagnostic.ts (L111).
We chose to serve the requests only when clangd has a fresh preamble
available. In case of a stale preamble we just drop the request on the
floor.
This patch doesn't plumb this to LSP layer yet, as pullDiags is still a
proposal with only an implementation in vscode.
Differential Revision: https://reviews.llvm.org/D98623
Summary:
Clangd was using bounds from the stale preamble, which might result in
crashes. For example:
```
#include "a.h"
#include "b.h" // this line is newly inserted
#include "c.h"
```
PreambleBounds for the baseline only contains first two lines, but
ReplayPreamble logic contains an include from the third line. This would
result in a crash as we only lex preamble part of the current file
during ReplayPreamble.
This patch adds a `preambleBounds` method to PreamblePatch, which can be
used to figure out preamble bounds for the current version of the file.
Then uses it when attaching ReplayPreamble, so that it can lex the
up-to-date preamble region.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D81964
Summary: Depends on D79992.
This patch changes locateMacroAt to perform #line directive substitution
for macro identifier locations.
We first check whether a location is inside a file included through
built-in header. If so we check whether line directive maps it back to
the main file, and afterwards use TokenBuffers to find exact location of
the identifier on the line.
Instead of performing the mapping in locateMacroAt, we could also store
a mapping inside the ParsedAST whenever we use a patched preamble. But
that would imply adding more responsibility to ParsedAST and paying for
the mapping even when it is not going to be used.
====
Go-To-Definition:
Later on these locations are used for serving go-to-definition requests,
this enables jumping to definition inside the preamble section in
presence of patched macros.
=====
Go-To-Refs:
Macro references in main file are collected separetely and stored as a
map from macro's symbol id to reference ranges. Those ranges are
computed inside PPCallbacks, hence we don't have access to TokenBuffer.
In presence of preamble patch, any reference to a macro inside the
preamble section will unfortunately have the wrong range. They'll point
into the patch rather than the main file. Hence during findReferences,
we won't get any ranges reported for those.
Fixing those requires:
- Lexing the preamble section to figure out "real range" of a patched
macro definition
- Postponing range/location calculations until a later step in which we
have access to tokenbuffers.
This patch trades some accuracy in favor of code complexity. We don't do
any patching for references inside the preamble patch but get any
reference inside the main file for free.
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D80198
Summary:
Depends on D79930.
This enables more accurate parsing of the AST, by making new macro
definitions in preamble section visible. This is handled by injecting
define directives into preamble patch.
This patch doesn't handle any location mappings yet, so features like go-to-def,
go-to-refs and hover might not work as expected. These will be addressed in a
follow-up patch.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D79992
Summary:
Enables building ASTs with stale preambles by handling additional preamble
includes. Sets the correct location information for those imaginary includes so
that features like gotodef/documentlink keeps functioning propoerly.
Reviewers: sammccall
Subscribers: ilya-biryukov, MaskRay, jkorous, mgrang, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77644
Summary:
This is achieved by calculating newly added includes and implicitly
parsing them as if they were part of the main file.
This also gets rid of the need for consistent preamble reads.
Reviewers: sammccall
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, mgrang, arphaman, jfb, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D77392
Summary:
This is another step for out-of-order preamble builds. To keep the
diagnostic behavior same, we only build ASTs either with "usable" preambles,
the ones that are fully applicable to a given ParseInput, or after building a
new preamble. Which is the same behaviour as what we do today. ASTs
built in the latter is called golden ASTs.
Reviewers: sammccall
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76725
Summary:
First step to enable deferred preamble builds. Not intending to land it
alone, will have follow-ups that will implement full deferred build
functionality and will land after all of them are ready.
Reviewers: sammccall
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, arphaman, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D76125
Summary:
This ties to an LSP feature (diagnostic versioning) but really a lot
of the value is in being able to log what's happening with file versions
and queues more descriptively and clearly.
As such it's fairly invasive, for a logging patch :-\
Key decisions:
- at the LSP layer, we don't reqire the client to provide versions (LSP
makes it mandatory but we never enforced it). If not provided,
versions start at 0 and increment. DraftStore handles this.
- don't propagate magically using contexts, but rather manually:
addDocument -> ParseInputs -> (ParsedAST, Preamble, various callbacks)
Context-propagation would hide the versions from ClangdServer, which
would make producing good log messages hard
- within ClangdServer, treat versions as opaque and unordered.
std::string is a convenient type for this, and allows richer versions
for embedders. They're "mandatory" but "null" is a reasonable default.
Subscribers: ilya-biryukov, javed.absar, MaskRay, jkorous, arphaman, kadircet, usaxena95, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D75582
Summary:
- store all macro references in the ParsedAST;
- unify the two variants of CollectMainFileMacros;
Reviewers: ilya-biryukov
Subscribers: MaskRay, jkorous, mgrang, arphaman, kadircet, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D67496
llvm-svn: 372725