The commit adds serialization and de-serialization implementations for
the stored regions. Basically, the serialized representation of the
regions of a PP is a (ordered) sequence of source location encodings.
For de-serialization, regions from loaded files are stored by their ASTs.
When later one queries if a loaded location L is in an opt-out
region, PP looks up the regions of the loaded AST where L is at.
(Background if helps: a pair of `#pragma clang unsafe_buffer_usage begin/end` pragmas marks a
warning-opt-out region. The begin and end locations (opt-out regions)
are stored in preprocessor instances (PP) and will be queried by the
`-Wunsafe-buffer-usage` analyzer.)
The reported issue at upstream: https://github.com/llvm/llvm-project/issues/90501
rdar://124035402
Add some primitive syntax highlighting to our code snippet output.
This adds "checkpoints" to the Preprocessor, which we can use to start lexing from. When printing a code snippet, we lex from the nearest checkpoint and highlight the tokens based on their token type.
Previous representation used an enumeration combined to a switch to
dispatch to the appropriate lexer.
Use function pointer so that the dispatching is just an indirect call,
which is actually better because lexing is a costly task compared to a
function call.
This also makes the code slightly cleaner, speedup on compile time
tracker are consistent and range form -0.05% to -0.20% for NewPM-O0-g,
see
https://llvm-compile-time-tracker.com/compare.php?from=f9906508bc4f05d3950e2219b4c56f6c078a61ef&to=608c85ec1283638db949d73e062bcc3355001ce4&stat=instructions:u
Considering just the preprocessing task, preprocessing the sqlite
amalgametion takes -0.6% instructions (according to valgrind
--tool=callgrind)
---------
Co-authored-by: serge-sans-paille <sguelton@mozilla.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
`ModuleDeclState` is incorrectly changed to `NamedModuleImplementation`
for `struct module {}; void foo(module a);`. This is mostly benign but
leads to a spurious warning after #69555.
A real world example is:
```
// pybind11.h
class module_ { ... };
using module = module_;
// tensorflow
void DefineMetricsModule(pybind11::module main_module);
// `module main_module);` incorrectly changes `ModuleDeclState` to `NamedModuleImplementation`
#include <algorithm> // spurious warning
```
This fixes many unit tests when trying to enable IncrementalExtensions
by default for testing purposes.
Differential Revision: https://reviews.llvm.org/D158415
This new method repeatedly calls Lex() until end of file is reached
and optionally fills a std::vector of Tokens. Use it in Clang's unit
tests to avoid quite some code duplication.
Differential Revision: https://reviews.llvm.org/D158413
The preprocessor `IncrementalProcessing` option was being used to
control whether or not to teardown the lexer or run the end of
translation unit action. In D127284 this was merged with
`-fincremental-extensions`, which also changes top level parsing.
Split these again so that the former behavior can be achieved without
the latter (ie. to allow managing lifetime without also changing
parsing).
Resolves rdar://113406310.
Setting __FLT_EVAL_METHOD__ to -1 with fast-math will set
__GLIBC_FLT_EVAL_METHOD to 2 and long double ends up being used for
float_t and double_t. This creates some ABI breakage with various C libraries.
See details here: https://github.com/llvm/llvm-project/issues/60781
This reverts commit bbf0d1932a3c1be970ed8a580e51edf571b80fd5.
As the diagnostic message shows, we should remove -fmodules-ts flag in
clang/llvm17. Since clang/llvm16 is already branched. We can remove the
depreacared flag now.
This patch prepares the necessary interfaces in the preprocessor part
for D137527 since we need to recognize if we're in a module unit, the
module kinds and the module declaration and the module we're importing
in the preprocessor.
Differential Revision: https://reviews.llvm.org/D137526
Add a pair of clang pragmas:
- `#pragma clang unsafe_buffer_usage begin` and
- `#pragma clang unsafe_buffer_usage end`,
which specify the start and end of an (unsafe buffer checking) opt-out
region, respectively.
Behaviors of opt-out regions conform to the following rules:
- No nested nor overlapped opt-out regions are allowed. One cannot
start an opt-out region with `... unsafe_buffer_usage begin` but never
close it with `... unsafe_buffer_usage end`. Mis-use of the pragmas
will be warned.
- Warnings raised from unsafe buffer operations inside such an opt-out
region will always be suppressed. This behavior CANNOT be changed by
`clang diagnostic` pragmas or command-line flags.
- Warnings raised from unsafe operations outside of such opt-out
regions may be reported on declarations inside opt-out
regions. These warnings are NOT suppressed.
- An un-suppressed unsafe operation warning may be attached with
notes. These notes are NOT suppressed as well regardless of whether
they are in opt-out regions.
The implementation maintains a separate sequence of location pairs
representing opt-out regions in `Preprocessor`. The `UnsafeBufferUsage`
analyzer reads the region sequence to check if an unsafe operation is
in an opt-out region. If it is, discard the warning raised from the
operation immediately.
This is a re-land after I reverting it at 9aa00c8a306561c4e3ddb09058e66bae322a0769.
The compilation error should be resolved.
Reviewed by: NoQ
Differential revision: https://reviews.llvm.org/D140179
Add a pair of clang pragmas:
- `#pragma clang unsafe_buffer_usage begin` and
- `#pragma clang unsafe_buffer_usage end`,
which specify the start and end of an (unsafe buffer checking) opt-out
region, respectively.
Behaviors of opt-out regions conform to the following rules:
- No nested nor overlapped opt-out regions are allowed. One cannot
start an opt-out region with `... unsafe_buffer_usage begin` but never
close it with `... unsafe_buffer_usage end`. Mis-use of the pragmas
will be warned.
- Warnings raised from unsafe buffer operations inside such an opt-out
region will always be suppressed. This behavior CANNOT be changed by
`clang diagnostic` pragmas or command-line flags.
- Warnings raised from unsafe operations outside of such opt-out
regions may be reported on declarations inside opt-out
regions. These warnings are NOT suppressed.
- An un-suppressed unsafe operation warning may be attached with
notes. These notes are NOT suppressed as well regardless of whether
they are in opt-out regions.
The implementation maintains a separate sequence of location pairs
representing opt-out regions in `Preprocessor`. The `UnsafeBufferUsage`
analyzer reads the region sequence to check if an unsafe operation is
in an opt-out region. If it is, discard the warning raised from the
operation immediately.
Reviewed by: NoQ
Differential revision: https://reviews.llvm.org/D140179
The needed tweaks are mostly trivial, the one nasty bit is Clang's usage
of OptionalStorage. To keep this working old Optional stays around as
clang::CustomizableOptional, with the default Storage removed.
Optional<File/DirectoryEntryRef> is replaced with a typedef.
I tested this with GCC 7.5, the oldest supported GCC I had around.
Differential Revision: https://reviews.llvm.org/D140332
This reverts commit 8f0df9f3bbc6d7f3d5cbfd955c5ee4404c53a75d.
The Optional*RefDegradesTo*EntryPtr types want to keep the same size as
the underlying type, which std::optional doesn't guarantee. For use with
llvm::Optional, they define their own storage class, and there is no way
to do that in std::optional.
On top of that, that commit broke builds with older GCCs, where
std::optional was not trivially copyable (static_assert in the clang
sources was failing).
This list was originally used for to make sure MacroInfo's clever memory
management got called (1f1e4bdbf7815c), but that was
simplified in 73a29662b9bf640a and 1f1e4bdbf7815c, and there's nothing left.
Differential Revision: https://reviews.llvm.org/D136725
This patch fixes compilation failure with explicit modules caused by scanner not reporting the module map describing the module whose implementation is being compiled.
Below is a breakdown of the attached test case. Note the VFS that makes frameworks "A" and "B" share the same header "shared/H.h".
In non-modular build, Clang skips the last import, since the "shared/H.h" header has already been included.
During scan (or implicit build), the compiler handles "tu.m" as follows:
* `@import B` imports module "B", as expected,
* `#import <A/H.h>` is resolved textually (due to `-fmodule-name=A`) to "shared/H.h" (due to the VFS remapping),
* `#import <B/H.h>` is resolved to import module "A_Private", since the header "shared/H.h" is already known to be part of that module, and the import is skipped.
In the end, the only modular dependency of the TU is "B".
In explicit modular build without `-fmodule-name=A`, TU does depend on module "A_Private" properly, not just textually. Clang therefore builds & loads its PCM, and knows to ignore the last import, since "shared/H.h" is known to be part of "A_Private".
But with current scanner behavior and `-fmodule-name=A` present, the last import fails during explicit build. Clang doesn't know about "A_Private" (it's included textually) and tries to import "B_Private" instead, which it doesn't know about either (the scanner correctly didn't report it as dependency). This is fixed by reporting the module map describing "A" and matching the semantics of implicit build.
Reviewed By: Bigcheese
Differential Revision: https://reviews.llvm.org/D134222
The patch diagnoses an identifier as a future keyword if it exists in a
future language mode, such as:
int restrict;
in C modes earlier than C99. We now give a warning to the user that
such an identifier is a future keyword. Handles keywords from C as well
as C++.
Differential Revision: https://reviews.llvm.org/D131683
This addresses [cpp.include]/7
(when encountering #include header-name)
If the header identified by the header-name denotes an importable header, it
is implementation-defined whether the #include preprocessing directive is
instead replaced by an import directive.
In this implementation, include translation is performed _only_ for headers
in the Global Module fragment, so:
```
module;
#include "will-be-translated.h" // IFF the header unit is available.
export module M;
#include "will-not-be-translated.h" // even if the header unit is available
```
The reasoning is that, in general, includes in the module purview would not
be validly translatable (they would have to immediately follow the module
decl and without any other intervening decls). Otherwise that would violate
the rules on contiguous import directives.
This would be quite complex to track in the preprocessor, and for relatively
little gain (the user can 'import "will-not-be-translated.h";' instead.)
TODO: This is one area where it becomes increasingly difficult to disambiguate
clang modules in C++ from C++ standard modules. That needs to be addressed in
both the driver and the FE.
Differential Revision: https://reviews.llvm.org/D128981
When running `clang -E -Ofast` on macOS, the `__FLT_EVAL_METHOD__` macro is `0`, which causes the following typedef to be emitted into the preprocessed source: `typedef float float_t`.
However, when running `clang -c -Ofast`, `__FLT_EVAL_METHOD__` is `-1`, and `typedef long double float_t` is emitted.
This causes build errors for certain projects, which are not reproducible when compiling from preprocessed source.
The issue is that `__FLT_EVAL_METHOD__` is configured in `Sema::Sema` which is not executed when running in `-E` mode.
This change moves that logic into the preprocessor initialization method, which is invoked correctly in `-E` mode.
rdar://96134605
rdar://92748429
Differential Revision: https://reviews.llvm.org/D128814
"Ascii" StringLiteral instances are actually narrow strings
that are UTF-8 encoded and do not have an encoding prefix.
(UTF8 StringLiteral are also UTF-8 encoded strings, but with
the u8 prefix.
To avoid possible confusion both with actuall ASCII strings,
and with future works extending the set of literal encodings
supported by clang, this rename StringLiteral::isAscii() to
isOrdinary(), matching C++ standard terminology.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D128762
This is a commit with the following changes:
* Remove `ExcludedPreprocessorDirectiveSkipMapping` and related functionality
Removes `ExcludedPreprocessorDirectiveSkipMapping`; its intended benefit for fast skipping of excluded directived blocks
will be superseded by a follow-up patch in the series that will use dependency scanning lexing for the same purpose.
* Refactor dependency scanning to produce pre-lexed preprocessor directive tokens, instead of minimized sources
Replaces the "source minimization" mechanism with a mechanism that produces lexed dependency directives tokens.
* Make the special lexing for dependency scanning a first-class feature of the `Preprocessor` and `Lexer`
This is bringing the following benefits:
* Full access to the preprocessor state during dependency scanning. E.g. a component can see what includes were taken and where they were located in the actual sources.
* Improved performance for dependency scanning. Measurements with a release+thin-LTO build shows ~ -11% reduction in wall time.
* Opportunity to use dependency scanning lexing to speed-up skipping of excluded conditional blocks during normal preprocessing (as follow-up, not part of this patch).
For normal preprocessing measurements show differences are below the noise level.
Since, after this change, we don't minimize sources and pass them in place of the real sources, `DependencyScanningFilesystem` is not technically necessary, but it has valuable performance benefits for caching file `stat`s along with the results of scanning the sources. So the setup of using the `DependencyScanningFilesystem` during a dependency scan remains.
Differential Revision: https://reviews.llvm.org/D125486
Differential Revision: https://reviews.llvm.org/D125487
Differential Revision: https://reviews.llvm.org/D125488
This patch replaces the exact include count of each file in `HeaderFileInfo` with a set of included files in `Preprocessor`.
The number of includes isn't a property of a header file but rather a preprocessor state. The exact number of includes is not used anywhere except statistic tracking.
Reviewed By: vsapsai
Differential Revision: https://reviews.llvm.org/D114095
The `{HeaderSearch,Preprocessor}::LookupFile()` functions take an out-parameter `const DirectoryLookup *&`. Most callers end up creating a `const DirectoryLookup *` variable that's otherwise unused.
This patch changes the out-parameter from reference to a pointer, making it possible to simply pass `nullptr` to the function without the ceremony.
Reviewed By: ahoppen
Differential Revision: https://reviews.llvm.org/D117312
This fixes an LLDB build failure where the `ImportLoc` argument is missing: https://lab.llvm.org/buildbot#builders/68/builds/19975
This change also makes it possible to drop `SourceLocation()` in `Preprocessor::getCurrentModule`.
This patch propagates the import `SourceLocation` into `HeaderSearch::lookupModule`. This enables remarks on search path usage (implemented in D102923) to point to the source code that initiated header search.
Reviewed By: dexonsmith
Differential Revision: https://reviews.llvm.org/D111557
This patch adds a new preprocessor extension ``#pragma clang final``
which enables warning on undefinition and re-definition of macros.
The intent of this warning is to extend beyond ``-Wmacro-redefined`` to
warn against any and all alterations to macros that are marked `final`.
This warning is part of the ``-Wpedantic-macros`` diagnostics group.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D108567
This change would treat the token `or` in system headers as an
identifier, and elsewhere as an operator. As reported in
llvm.org/pr42427, many users classify their third party library headers
as "system" headers to suppress warnings. There's no clean way to
separate Windows SDK headers from user headers.
Clang is still able to parse old Windows SDK headers if C++ operator
names are disabled. Traditionally this was controlled by
`-fno-operator-names`, but is now also enabled with `/permissive` since
D103773. This change will prevent `clang-cl` from parsing <query.h> from
the Windows SDK out of the box, but there are multiple ways to work
around that:
- Pass `/clang:-fno-operator-names`
- Pass `/permissive`
- Pass `-DQUERY_H_RESTRICTION_PERMISSIVE`
In all of these modes, the operator names will consistently be available
or not available, instead of depending on whether the code is in a
system header.
I added a release note for this, since it may break straightforward
users of the Windows SDK.
Fixes PR42427
Differential Revision: https://reviews.llvm.org/D108720
This patch adds `#pragma clang restrict_expansion ` to enable flagging
macros as unsafe for header use. This is to allow macros that may have
ABI implications to be avoided in headers that have ABI stability
promises.
Using macros in headers (particularly public headers) can cause a
variety of issues relating to ABI and modules. This new pragma logs
warnings when using annotated macros outside the main source file.
This warning is added under a new diagnostics group -Wpedantic-macros
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D107095
This patch adds `#pragma clang deprecated` to enable deprecation of
preprocessor macros.
The macro must be defined before `#pragma clang deprecated`. When
deprecating a macro a custom message may be optionally provided.
Warnings are emitted at the use site of a deprecated macro, and can be
controlled via the `-Wdeprecated` warning group.
This patch takes some rough inspiration and a few lines of code from
https://reviews.llvm.org/D67935.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D106732
This patch adds the -fminimize-whitespace with the following effects:
* If combined with -E, remove as much non-line-breaking whitespace as
possible.
* If combined with -E -P, removes as much whitespace as possible,
including line-breaks.
The motivation is to reduce the amount of insignificant changes in the
preprocessed output with source files where only whitespace has been
changed (add/remove comments, clang-format, etc.) which is in particular
useful with ccache.
A patch for ccache for using this flag has been proposed to ccache as well:
https://github.com/ccache/ccache/pull/815, which will use
-fnormalize-whitespace when clang-13 has been detected, and additionally
uses -P in "unify_mode". ccache already had a unify_mode in an older
version which was removed because of problems that using the
preprocessor itself does not have (such that the custom tokenizer did
not recognize C++11 raw strings).
This patch slightly reorganizes which part is responsible for adding
newlines that are required for semantics. It is now either
startNewLineIfNeeded() or MoveToLine() but never both; this avoids the
ShouldUpdateCurrentLine workaround and avoids redundant lines being
inserted in some cases. It also fixes a mandatory newline not inserted
after a _Pragma("...") that is expanded into a #pragma.
Reviewed By: aaron.ballman
Differential Revision: https://reviews.llvm.org/D104601
WG14 adopted N2645 and WG21 EWG has accepted P2334 in principle (still
subject to full EWG vote + CWG review + plenary vote), which add
support for #elifdef as shorthand for #elif defined and #elifndef as
shorthand for #elif !defined. This patch adds support for the new
preprocessor directives.