109 Commits

Author SHA1 Message Date
Owen Pan
b2082a9817 Revert "[clang-format][NFC] Delete 100+ redundant #include lines in .cpp files"
This reverts commit b92d6dd704d789240685a336ad8b25a9f381b4cc. See
github.com/llvm/llvm-project/commit/b92d6dd704d7#commitcomment-139992444

We should use a tool like Visual Studio to clean up the headers.
2024-03-19 21:28:22 -07:00
Owen Pan
6f31cf51df Revert "[clang-format][NFC] Eliminate the IsCpp parameter in all functions (#84599)"
This reverts c3a1eb6207d8 (and the related commit f3c5278efa3b) which makes
cleanupAroundReplacements() no longer thread-safe.
2024-03-19 18:06:59 -07:00
Owen Pan
b92d6dd704 [clang-format][NFC] Delete 100+ redundant #include lines in .cpp files 2024-03-16 22:24:11 -07:00
Owen Pan
c3a1eb6207 Reland [clang-format][NFC] Eliminate the IsCpp parameter in all functions (#84599)
Initialize IsCpp in LeftRightQualifierAlignmentFixer ctor.
2024-03-14 19:44:40 -07:00
Mehdi Amini
b0d1e32ca2
Revert "[clang-format][NFC] Eliminate the IsCpp parameter in all functions" (#85353)
Reverts llvm/llvm-project#84599

This broke the presubmit bot.
2024-03-14 19:33:11 -07:00
Owen Pan
0c07102927
[clang-format][NFC] Eliminate the IsCpp parameter in all functions (#84599) 2024-03-14 18:56:24 -07:00
Owen Pan
61c83e9491 Revert "[clang-format][NFC] Make LangOpts global in namespace Format"
This reverts commit 32e65b0b8a743678974c7ca7913c1d6c41bb0772.

It seems to break some PowerPC bots.

See https://github.com/llvm/llvm-project/pull/81390#issuecomment-1941964803.
2024-02-13 21:02:14 -08:00
Hirofumi Nakamura
6a471611a4
[clang-format] Support of TableGen value annotations. (#80299)
This implements the annotation of the values in TableGen.
The main changes are,

- parseTableGenValue(), the simplified parser method for the syntax of
values.
- modified consumeToken() to parseTableGenValue in 'if', 'assert' and
after '='.
- modified parseParens() to call parseTableGenValue inside.
- modified parseSquare() to to call parseTableGenValue inside, with
skipping separator tokens.
- modified parseAngle() to call parseTableGenValue inside, with skipping
separator tokens.
2024-02-12 23:27:09 +09:00
Owen Pan
32e65b0b8a Reland "[clang-format][NFC] Make LangOpts global in namespace Format (#81390)"
Restore getFormattingLangOpts().
2024-02-11 22:01:23 -08:00
Owen Pan
3dc8ef677d Revert "[clang-format][NFC] Make LangOpts global in namespace Format (#81390)"
This reverts commit 03f571995b4f0c260254955afd16ec44d0764794.

We can't hide getFormattingLangOpts() as it's used by other tools.
2024-02-11 13:08:28 -08:00
Owen Pan
03f571995b
[clang-format][NFC] Make LangOpts global in namespace Format (#81390) 2024-02-11 12:59:05 -08:00
Owen Pan
5609bd83c3 Revert "[clang-format] Update FormatToken::isSimpleTypeSpecifier() (#80241)"
This reverts commit 763139afc19ddf2e0f0265dc828ce8e5fbe92530.

It seems that LangOpts is not initialized before use.
2024-02-09 01:52:41 -08:00
Owen Pan
763139afc1
[clang-format] Update FormatToken::isSimpleTypeSpecifier() (#80241)
Now with a8279a8bc541, we can make the update.
2024-02-08 21:42:29 -08:00
Kazu Hirata
b67ce7e349 [clang] Use StringRef::starts_with (NFC) 2024-01-31 23:54:09 -08:00
Hirofumi Nakamura
0058263600
[clang-format] Support of TableGen tokens with unary operator like form, bang operators and numeric literals. (#78996)
Adds the support for tokens that have forms like unary operators.
- bang operators:  `!name`
- cond operator: `!cond`
- numeric literals: `+1`, `-1`
cond operator are one of bang operators but is distinguished because it has very specific syntax.
2024-01-31 00:30:37 +09:00
Hirofumi Nakamura
fcb6737f82
[clang-format] Support of TableGen identifiers beginning with a number. (#78571)
TableGen allows the identifiers beginning with a number.
This patch add the support of the recognition of such identifiers.
2024-01-20 21:15:58 +09:00
Hirofumi Nakamura
e3702f6225
[clang-format] TableGen multi line string support. (#78032)
Support the handling of TableGen's multiline string (code) literal.
That has the form, 
[{ this is the string possibly with multi line... }]
2024-01-17 21:20:35 +09:00
Hirofumi Nakamura
0cc31579e0
[clang-format] TableGen keywords support. (#77477)
Add TableGen keywords to the additional keyword list of the formatter.

This pull request is the splited part from
https://github.com/llvm/llvm-project/pull/76059 .
2024-01-11 20:07:49 +01:00
Kazu Hirata
f3dcc2351c
[clang] Use StringRef::{starts,ends}_with (NFC) (#75149)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 08:54:13 -08:00
Owen Pan
4c17452076
[clang-format][NFC] Extend isProto() to also cover LK_TextProto (#73582) 2023-11-29 12:52:01 -08:00
sstwcw
3af82b3962
[clang-format] Add spaces around the Verilog implication operator (#71352)
The Verilog implication operator `->` is a binary operator meaning
either the left hand side is false or the right hand side is true.
Previously it was treated as the C++ struct member operator.

I didn't even know it existed when I added the operator formatting part.
And I didn't check all the tests for all the operators I added. That is
how the bad test got in.
2023-11-29 15:17:59 +00:00
Owen Pan
91c4db0061 [clang-format][NFC] Replace !is() with isNot()
Differential Revision: https://reviews.llvm.org/D158571
2023-08-24 01:27:24 -07:00
Owen Pan
5c106f7b94 [clang-format] Add TypeNames option to disambiguate types/objects
If a non-keyword identifier is found in TypeNames, then a *, &, or && that
follows it is annotated as TT_PointerOrReference.

Differential Revision: https://reviews.llvm.org/D155273
2023-07-18 14:18:40 -07:00
Owen Pan
682808d9c9 Reland [clang-format] Add a space between an overloaded operator and '>'
The token annotator doesn't annotate the template opener and closer
as such if they enclose an overloaded operator. This causes the
space between the operator and the closer to be removed, resulting
in invalid C++ code.

Fixes #58602.

Differential Revision: https://reviews.llvm.org/D143755
2023-03-20 03:01:22 -07:00
Kadir Cetinkaya
696f8b32d4
Revert "[clang-format] Add a space between an overloaded operator and '>'"
This reverts commit b05dc1b8766a47482cae432011fd2faa04c83a3e.

Makes clang-format crash on `struct Foo { operator enum foo{} };`
2023-03-20 08:07:44 +01:00
Owen Pan
b05dc1b876 [clang-format] Add a space between an overloaded operator and '>'
The token annotator doesn't annotate the template opener and closer
as such if they enclose an overloaded operator. This causes the
space between the operator and the closer to be removed, resulting
in invalid C++ code.

Fixes #58602.

Differential Revision: https://reviews.llvm.org/D143755
2023-02-16 20:25:39 -08:00
Owen Pan
25e2d0f3c8 [clang-format] Support clang-format on/off line comments as prefix
Closes #60264.

Differential Revision: https://reviews.llvm.org/D142804
2023-02-01 13:07:09 -08:00
Sam McCall
882a05afa1 [Format] Fix crash when hitting eof while lexing JS template string
Different loop termination conditions resulted in confusion of whether
*Offset was intended to be inside or outside the token.
This ultimately led to constructing an out-of-range SourceLocation.

Fix by making Offset consistently point *after* the token.

Differential Revision: https://reviews.llvm.org/D135356
2022-10-06 17:00:41 +02:00
owenca
b60e7a7f1a [clang-format] Handle C# interpolated verbatim string prefix @$
Fixes #58062.

Differential Revision: https://reviews.llvm.org/D135026
2022-10-04 18:27:36 -07:00
Fangrui Song
3f18f7c007 [clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC
With C++17 there is no Clang pedantic warning or MSVC C5051.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D131346
2022-08-08 09:12:46 -07:00
Kazu Hirata
71336d03f1 Use llvm::any_of (NFC) 2022-07-31 15:17:08 -07:00
sstwcw
f93182a887 [clang-format] Handle Verilog numbers and operators
Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D126845
2022-07-29 00:38:29 +00:00
owenca
0ffb3dd33e [clang-format] Fix a hang when formatting C# $@ string literals
Fixes #56624.

Differential Revision: https://reviews.llvm.org/D130411
2022-07-25 23:17:54 -07:00
Kevin Cadieux
a9bef0707d [clang-format] Fix incorrect isspace input (NFC)
This change fixes a clang-format unit test failure introduced by [D124748](https://reviews.llvm.org/D124748). The `countLeadingWhitespace` function was calling `isspace` with values that could fall outside the valid input range. The valid input range for `isspace` is unsigned 0-255. Values outside this range produce undefined behavior, which on Windows manifests as an assertion being raised in the debug runtime libraries. `countLeadingWhitespace` was calling `isspace` with a signed char that could produce a negative value if the underlying byte's value was 128 or above, which can happen for non-ASCII encodings. The fix is to use `StringRef`'s `bytes_begin` and `bytes_end` iterators to read the values as unsigned chars instead.

This bug can be reproduced by building the `check-clang-unit` target with a DEBUG configuration under Windows. This change is already covered by existing unit tests.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D128786
2022-06-29 10:20:46 -07:00
sstwcw
141ad3ba05 [clang-format] Fix uninitialized memory problem
The setLength function checks for the token kind which could be
uninitialized in the previous version.

The problem was introduced in 2e32ff106e.

Reviewed By: MyDeveloperDay, owenpan

Differential Revision: https://reviews.llvm.org/D128607
2022-06-26 22:23:50 +00:00
sstwcw
2e32ff106e [clang-format] Handle Verilog preprocessor directives
Verilog uses the backtick instead of the hash.  In this revision
backticks are lexed manually and then get labeled as hashes so the logic
for handling C preprocessor stuff don't have to change.  Hashes get
labeled as identifiers for Verilog-specific stuff like delays.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D124749
2022-06-26 02:02:29 +00:00
sstwcw
370bee4801 [clang-format] Fix whitespace counting stuff
The current way of counting whitespace would count backticks as
whitespace.  For Verilog stuff we need backticks to be handled
correctly.  For JavaScript the current way is to compare the entire
token text to see if it's a backtick.  However, when the backtick is the
first token following an escaped newline, the escaped newline will be
part of the tok::unknown token.  Verilog has macros and escaped newlines
unlike JavaScript.  So we can't regard an entire tok::unknown token as
whitespace.  Previously, the start of every token would be matched for
newlines.  Now, it is all whitespace instead of just newlines.

The column counting problem has already been fixed for JavaScript in
e71b4cbdd140f059667f84464bd0ac0ebc348387 by counting columns elsewhere.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D124748
2022-06-26 01:27:27 +00:00
owenca
bebf7bdf9a [clang-format][NFC] Insert/remove braces in clang/lib/Format/
Differential Revision: https://reviews.llvm.org/D126157
2022-05-24 19:06:04 -07:00
Marek Kurdej
573a5b5800 Revert "[clang-format] Fix WhitespaceSensitiveMacros not being honoured when macro closing parenthesis is followed by a newline."
This reverts commit 50cd52d9357224cce66a9e00c9a0417c658a5655.

It provoked regressions in C++ and ObjectiveC as described in https://reviews.llvm.org/D123676#3515949.

Reproducers:
```
MACRO_BEGIN
#if A
int f();
#else
int f();
#endif
```

```
NS_SWIFT_NAME(A)
@interface B : C
@property(readonly) D value;
@end
```
2022-05-18 07:27:45 +02:00
Marek Kurdej
50cd52d935 [clang-format] Fix WhitespaceSensitiveMacros not being honoured when macro closing parenthesis is followed by a newline.
Fixes https://github.com/llvm/llvm-project/issues/54522.

This fixes regression introduced in 5e5efd8a91.

Before the culprit commit, macros in WhitespaceSensitiveMacros were correctly formatted even if their closing parenthesis weren't followed by semicolon (or, to be precise, when they were followed by a newline).
That commit changed the type of the macro token type from TT_UntouchableMacroFunc to TT_FunctionLikeOrFreestandingMacro.

Correct formatting (with `WhitespaceSensitiveMacros = ['FOO']`):
```
FOO(1+2)
FOO(1+2);
```

Regressed formatting:
```
FOO(1 + 2)
FOO(1+2);
```

Reviewed By: HazardyKnusperkeks, owenpan, ksyx

Differential Revision: https://reviews.llvm.org/D123676
2022-05-09 10:59:33 +02:00
Dawid Jurczak
a64d3c602f [NFC][Lexer] Make Lexer::LangOpts const reference
This change can be seen as code cleanup but motivation is more performance related.
While browsing perf reports captured during Linux build we can notice unusual portion of instructions executed in std::vector<std::string> copy constructor like:

0.59%     0.58%  clang-14    clang-14      [.] std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
                                                                std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector

or even:

1.42%     0.26%  clang    clang-14             [.] clang::LangOptions::LangOptions
       |
        --1.16%--clang::LangOptions::LangOptions
                  |
                   --0.74%--std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >,
                            std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >::vector

After more digging we can see that relevant LangOptions std::vector members (*Files, ModuleFeatures and NoBuiltinFuncs)
are constructed when Lexer::LangOpts field is initialized on list:

Lexer::Lexer(..., const LangOptions &langOpts, ...)
            : ..., LangOpts(langOpts),

Since LangOptions copy constructor is called by Lexer(..., const LangOptions &LangOpts,...) and local Lexer objects are created thousands times
(in Lexer::getRawToken, Preprocessor::EnterSourceFile and more) during single module processing in frontend it makes std::vector copy constructors surprisingly hot.

Unfortunately even though in current Lexer implementation mentioned std::vector members are unused and most of time empty,
no compiler is smart enough to optimize their std::vector copy constructors out (take a look at test assembly): https://godbolt.org/z/hdoxPfMYY even with LTO enabled.
However there is simple way to fix this. Since Lexer doesn't access *Files, ModuleFeatures, NoBuiltinFuncs and any other LangOptions fields (but only LangOptionsBase)
we can simply get rid of redundant copy constructor assembly by changing LangOpts type to more appropriate const LangOptions reference: https://godbolt.org/z/fP7de9176

Additionally we need to store LineComment outside LangOpts because it's written in SkipLineComment function.
Also FormatTokenLexer need to be adjusted a bit to avoid lifetime issues related to passing local LangOpts reference to Lexer.

After this change I can see more than 1% speedup in some of my microbenchmarks when using Clang release binary built with LTO.
For Linux build gains are not so significant but still nice at the level of -0.4%/-0.5% instructions drop.

Differential Revision: https://reviews.llvm.org/D120334
2022-02-28 15:42:19 +01:00
Marek Kurdej
fee4a9712f [clang-format] Use FormatToken::is* functions without passing through Tok. NFC. 2022-02-22 16:41:15 +01:00
Marek Kurdej
7d5062c6ac [clang-format] Remove unnecessary parentheses in return statements. NFC. 2022-02-12 21:25:52 +01:00
Marek Kurdej
d079995dd0 [clang-format] Elide unnecessary braces. NFC. 2022-02-02 15:28:53 +01:00
Marek Kurdej
10243d0dfd [clang-format] Simplify use of StringRef::substr(). NFC. 2022-02-02 14:36:00 +01:00
owenca
c95afac89e [clang-format][NFC] Clean up tryMergeLessLess()
Differential Revision: https://reviews.llvm.org/D117759
2022-01-20 14:35:07 -08:00
Marek Kurdej
82452be5cb [clang-format] Refactor: add FormatToken::hasWhitespaceBefore(). NFC.
This factors out a pattern that comes up from time to time.

Reviewed By: MyDeveloperDay, HazardyKnusperkeks, owenpan

Differential Revision: https://reviews.llvm.org/D117769
2022-01-20 21:16:17 +01:00
Jino Park
560eb2277b [clang-format] Fix bug in parsing operator< with template
Fixes https://github.com/llvm/llvm-project/issues/44601.

This patch handles a bug when parsing a below example code :

```
template <class> class S;

template <class T> bool operator<(S<T> const &x, S<T> const &y) {
  return x.i < y.i;
}

template <class T> class S {
  int i = 42;
  friend bool operator< <>(S const &, S const &);
};

int main() { return S<int>{} < S<int>{}; }
```
which parse `< <>` as `<< >`, not `< <>` in terms of tokens as discussed in discord.

1. Add a condition in `tryMergeLessLess()` considering `operator` keyword and `>`
2. Force to leave a whitespace between `tok::less` and a template opener
3. Add unit test

Reviewed By: MyDeveloperDay, curdeius

Differential Revision: https://reviews.llvm.org/D117398
2022-01-20 08:59:04 +01:00
mydeveloperday
142e79b868 [clang-format] NFC use recently added Style.isJavaScript()
Improve the readability of these if(Style==FormatStyle::LK_JavsScript) clauses
2021-12-21 14:24:12 +00:00
Manuel Klimek
d688b31628 Fix segfault in clang-format.
Fix bug where we'd read past the end of the tokens after merging _T
macro strings.
2021-12-01 11:57:41 +01:00