128 Commits

Author SHA1 Message Date
Owen Pan
eaff083035
[clang-format] Fix more bugs in isStartOfName() (#72336)
Fixed #72264.
2023-11-15 14:31:30 -08:00
Gedare Bloom
a852869398 [clang-format] Fix a bug in parsing function/variable names
Function and variable names are not detected when there is a
__attribute__((x)) preceding the name.

Fixes #64137.

Differential Revision: https://reviews.llvm.org/D156370
2023-11-13 19:35:28 -08:00
Emilia Kond
063e3fe648
[clang-format] Skip PP directives when determining brace kind (#69473)
Pull request #65409 changed the brace kind heuristic to not treat a
preprocessor if directive as a in statement, however, this caused some
regressions.

If the contents of a brace don't immediately determine the brace kind,
the heuristic will look at the characters immediately before and after
the closing brace to determine the brace type.

Unfortunately, if the closing brace was preceded by a preprocessor
directive, for example `#endif`, the preceding token was `endif`, seen
as just an identifier, so the braces were understood as a braced list.

This patch fixes this behaviour by skipping all preprocessor directives
when calculating brace types. Comments were already being skipped, so
now preprocessor lines are skipped alongside comments.

Fixes https://github.com/llvm/llvm-project/issues/68404
2023-11-03 05:51:20 +02:00
Björn Schäpers
c280bed85a
[clang-format] Fix annotating annotations after requires clause
Fixes #69325.
2023-10-31 13:10:46 +01:00
Owen Pan
e00d32afb9
[clang-format][NFC] Delete TT_LambdaArrow (#70519)
It's one type of TT_TrailingReturnArrow.
2023-10-28 12:48:00 -07:00
Owen Pan
88934a82dc [clang-format][NFC] Remove extraneous newlines in some unit test files 2023-10-26 21:43:27 -07:00
Björn Schäpers
a95d4b7950
[clang-format] Annotate do while while
So we can differentiate on the while keyword between a do-while-loop and
a normal while-loop.
2023-10-20 21:46:40 +02:00
Owen Pan
0ae998c4ae
[clang-format] Fix a bug in annotating TrailingReturnArrow (#69249)
Skip TrailingAnnotation when looking for TrailingReturnArrow.

Fixes #69234.
2023-10-18 14:04:51 -07:00
Emilia Kond
5f4ed780d3
[clang-format] Allow default values for template parameters in lambda (#69052)
Previously, upon encountering an equals sign while parsing a lambda in
the UnwrappedLineParser, it would fall through and fail. This caused any
lambda template with a default argument for a template parameter to be
annotated as an ArraySubscriptLSquare.

This patch allows equals signs in the UnwrappedLineParser if we're
currently in a template parameter list. This resolved a FIXME that was
in the lambda parsing function.

This patch seems deceptively easy, it's likely it doesn't solve the
FIXME entirely, or causes other issues (the FIXME itself mentions
something about Objective-C, which I cannot comment about). However this
patch is sufficient to fix the below issue.

Fixes https://github.com/llvm/llvm-project/issues/68913

---------

Co-authored-by: Owen Pan <owenpiano@gmail.com>
2023-10-17 00:38:33 +03:00
Jared Grubb
7f881a2abe [clang-format] Treat AttributeMacro more like __attribute__
There are two parts to this fix:
- Annotate the paren after an AttributeMacro as an AttributeLParen.
- Treat an AttributeMacro-without-paren the same as one with a paren.

I added a new test-case to differentiate a macro that is or is-not an
AttributeMacro; also handled whether ColumnLimit is set to infinite (0) or a
finite value, as part of this patch is in ContinuationIndenter.

Closes #68722.

Differential Revision: https://reviews.llvm.org/D145262
2023-10-15 15:58:24 -07:00
Owen Pan
6c7cf74a75 Revert "[clang-format] Treat AttributeMacro more like __attribute__"
This reverts commit 6f46bcc609f14121e6942763ba9871f98541ea0e.
2023-10-15 15:52:17 -07:00
Jared Grubb
6f46bcc609 [clang-format] Treat AttributeMacro more like __attribute__
There are two parts to this fix:
- Annotate the paren after an AttributeMacro as an AttributeLParen.
- Treat an AttributeMacro-without-paren the same as one with a paren.

I added a new test-case to differentiate a macro that is or is-not an
AttributeMacro; also handled whether ColumnLimit is set to infinite (0) or a
finite value, as part of this patch is in ContinuationIndenter.

Closes #68722.

Differential Revision: https://reviews.llvm.org/D145262
2023-10-15 15:44:57 -07:00
Björn Schäpers
a8896e57f1
[clang-format][NFC] Annotate control statement r_braces (#68621)
Annotating switch braces for the first time.

Also in preparation of #67906.
2023-10-13 21:07:33 +02:00
Björn Schäpers
b7abab2f28
Annotate enum r brace
In preparation of #67906.
2023-10-09 19:53:03 +02:00
Björn Schäpers
38c31d7e31
[clang-format][NFC] Annotate more r_braces
In preparation of #67906.
2023-10-09 19:52:15 +02:00
Owen Pan
7e856d1894 Reland "[clang-format] Annotate ctors/dtors as CtorDtorDeclName instead (#67955)"
Reland 6a621ed8e4cb which failed on Windows but not macOS.

The failures were caused by moving the annotation of the children from the
top to the bottom of TokenAnnotator::annotate(), resulting in invalid lines
including incomplete ones being skipped.
2023-10-03 22:26:48 -07:00
Owen Pan
d08fcc817e Revert "[clang-format] Annotate ctors/dtors as CtorDtorDeclName instead (#67955)"
This reverts commit 6a621ed8e4cb02bd55fe4a4a0254615576b70a55 as it caused
buildbots to fail.
2023-10-03 18:19:23 -07:00
Owen Pan
6a621ed8e4
[clang-format] Annotate ctors/dtors as CtorDtorDeclName instead (#67955)
After annotating constructors/destructors as FunctionDeclarationName in
commit 08630512088, we have seen several issues because ctors/dtors had
been treated differently than functions in aligning, wrapping, and
indenting.

This patch annotates ctors/dtors as CtorDtorDeclName instead and would
effectively revert commit 0468fa07f87f, which is obsolete now.

Fixed #67903.
Fixed #67907.
2023-10-03 18:02:09 -07:00
Owen Pan
c83d64f17a
[clang-format] Fix a bug in mis-annotating arrows (#67780)
Fixed #66923.
2023-10-02 00:38:26 -07:00
Owen Pan
151d0a4db3
[clang-format] Handle __attribute/__declspec/AttributeMacro consistently (#67518) 2023-09-29 12:03:08 -07:00
Emilia Kond
3e87829a8a
[clang-format] Fix requires misannotation with comma (#65908)
clang-format uses a heuristic to determine if a requires() is either a
requires clause or requires expression, based on what is in the
parentheses. Part of this heuristic assumed that a requires clause can
never contain a comma, however this is not the case if said comma is in
the template argument of a type.

This patch allows commas to appear in a requires clause if an angle
bracket `<` has been opened.

Fixes https://github.com/llvm/llvm-project/issues/65904
2023-09-28 01:34:30 +03:00
Owen Pan
67b99fa8ba
[clang-format] Correctly annotate keyword operator function name (#66904)
Fixes #66890.
2023-09-27 13:12:55 -07:00
Owen Pan
cef9f40cd4
[clang-format] Split TT_AttributeParen (#67396)
Replaced TT_AttributeParen with TT_AttributeLParen and
TT_AttributeRParen.
2023-09-26 20:27:15 -07:00
Owen Pan
6e608dc44b
[clang-format] Correctly annotate return type of function pointer (#66893)
Fixes #66857.
2023-09-26 14:17:49 -07:00
Owen Pan
0b2c88eea7
[clang-format] Fix a bug in annotating && followed by * or & (#65933)
Fixes #65877.
2023-09-13 23:57:34 -07:00
sstwcw
5db201fb75
[clang-format] Stop breaking unbreakable strings in JS (#66168)
Dictionary literal keys and strings in TypeScript type declarations can
not be broken.

The problem was pointed out by @alexfh and @e-kud here:

https://reviews.llvm.org/D154093#4644512
2023-09-13 17:04:04 +02:00
Emilia Kond
e9ed1aa9cd
[clang-format] Correctly annotate designated initializer with PP if (#65409)
When encountering braces, such as those of a designated initializer,
clang-format scans ahead to see what is contained within the braces. If
it found a statement, like an if-statement of for-loop, it would deem
the braces as not an initializer, but as a block instead.

However, this heuristic incorrectly included a preprocessor `#if` line
as an if-statement. This manifested in strange results and discrepancies
between `#ifdef` and `#if defined`.

With this patch, `if` is now ignored if it is preceeded by `#`.

Fixes most of https://github.com/llvm/llvm-project/issues/56685
2023-09-07 22:23:05 +03:00
Björn Schäpers
4f7086ce33
[clang-format] Fix misannotation of && before noexcept (#65526)
When we are in an expression, it has to be a binary operator and not
pointer or reference.
2023-09-07 12:44:26 +02:00
sstwcw
ddc80637cc [clang-format] Break long string literals in C#, etc.
Now strings that are too long for one line in C#, Java, JavaScript, and
Verilog get broken into several lines.  C# and JavaScript interpolated
strings are not broken.

A new subclass BreakableStringLiteralUsingOperators is used to handle
the logic for adding plus signs and commas.  The updateAfterBroken
method was added because now parentheses or braces may be required after
the parentheses or commas are added.  In order to decide whether the
added plus sign should be unindented in the BreakableToken object, the
logic for it is taken out into a separate function
shouldUnindentNextOperator.

The logic for finding the continuation indentation when the option
AlignAfterOpenBracket is set to DontAlign is not implemented yet.  So in
that case the new line may have the wrong indentation, and the parts may
have the wrong length if the string needs to be broken more than once
because finding where to break the string depends on where the string
starts.

The preambles for the C# and Java unit tests are changed to the newer
style in order to allow the 3-argument verifyFormat macro.  Some cases
are changed from verifyFormat to verifyImcompleteFormat because those
use incomplete code and the new verifyFormat function checks that the
code is complete.

The line in the doc was changed to being indented by 4 spaces, that is,
the default continuation indentation.  It has always been the case.  It
was probably a mistake that the doc showed 2 spaces previously.

This commit was fist committed as 16ccba51072b.  The tests caused
assertion failures.  Then it was reverted in 547bce36132a.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D154093
2023-09-05 03:19:49 +00:00
Owen Pan
0863051208 Reland "[clang-format] Annotate constructor/destructor names"
(0e63f1aacc00 was reverted by 7590b7657004 due to a crash.)

Annotate constructor/destructor names as FunctionDeclarationName.

Fixes #63046.

Differential Revision: https://reviews.llvm.org/D157963
2023-08-29 13:14:52 -07:00
Kadir Cetinkaya
7590b76570
Revert "[clang-format] Annotate constructor/destructor names"
This reverts commit 0e63f1aacc0040e9a16fa2fab15a140b1686e2ab.

clang-format started to crash with contents like:
a.h:
```
```
$ clang-format a.h
```
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.      Program arguments: ../llvm/build/bin/clang-format a.h
 #0 0x0000560b689fe177 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:723:13
 #1 0x0000560b689fbfbe llvm::sys::RunSignalHandlers() /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Signals.cpp:106:18
 #2 0x0000560b689feaca SignalHandler(int) /usr/local/google/home/kadircet/repos/llvm/llvm/lib/Support/Unix/Signals.inc:413:1
 #3 0x00007f030405a540 (/lib/x86_64-linux-gnu/libc.so.6+0x3c540)
 #4 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/include/clang/Lex/Token.h:98:44
 #5 0x0000560b68a9a980 is /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:562:51
 #6 0x0000560b68a9a980 startsSequenceInternal<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:831:9
 #7 0x0000560b68a9a980 startsSequence<clang::tok::TokenKind, clang::tok::TokenKind> /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/FormatToken.h:600:12
 #8 0x0000560b68a9a980 getFunctionName /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3131:17
 #9 0x0000560b68a9a980 clang::format::TokenAnnotator::annotate(clang::format::AnnotatedLine&) /usr/local/google/home/kadircet/repos/llvm/clang/lib/Format/TokenAnnotator.cpp:3191:17
Segmentation fault
```
2023-08-28 10:31:07 +02:00
Owen Pan
0e63f1aacc [clang-format] Annotate constructor/destructor names
Annotate constructor/destructor names as FunctionDeclarationName.

Fixes #63046.

Differential Revision: https://reviews.llvm.org/D157963
2023-08-24 01:32:52 -07:00
David Spickett
547bce3613 Revert "[clang-format] Break long string literals in C#, etc."
This reverts commit 16ccba51072bbc5ff4c66f91f939163dc91e5d96.

This is failing across Linaro's bots e.g.:
https://lab.llvm.org/buildbot/#/builders/188/builds/34393
2023-08-24 08:15:17 +00:00
sstwcw
16ccba5107 [clang-format] Break long string literals in C#, etc.
Now strings that are too long for one line in C#, Java, JavaScript, and
Verilog get broken into several lines.  C# and JavaScript interpolated
strings are not broken.

A new subclass BreakableStringLiteralUsingOperators is used to handle
the logic for adding plus signs and commas.  The updateAfterBroken
method was added because now parentheses or braces may be required after
the parentheses or commas are added.  In order to decide whether the
added plus sign should be unindented in the BreakableToken object, the
logic for it is taken out into a separate function
shouldUnindentNextOperator.

The logic for finding the continuation indentation when the option
AlignAfterOpenBracket is set to DontAlign is not implemented yet.  So in
that case the new line may have the wrong indentation, and the parts may
have the wrong length if the string needs to be broken more than once
because finding where to break the string depends on where the string
starts.

The preambles for the C# and Java unit tests are changed to the newer
style in order to allow the 3-argument verifyFormat macro.  Some cases
are changed from verifyFormat to verifyImcompleteFormat because those
use incomplete code and the new verifyFormat function checks that the
code is complete.

The line in the doc was changed to being indented by 4 spaces, that is,
the default continuation indentation.  It has always been the case.  It
was probably a mistake that the doc showed 2 spaces previously.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D154093
2023-08-24 03:16:31 +00:00
Owen Pan
5c106f7b94 [clang-format] Add TypeNames option to disambiguate types/objects
If a non-keyword identifier is found in TypeNames, then a *, &, or && that
follows it is annotated as TT_PointerOrReference.

Differential Revision: https://reviews.llvm.org/D155273
2023-07-18 14:18:40 -07:00
Owen Pan
cc2ff02ead [clang-format] Correctly annotate overloaded operator function name
The operator keyword preceded by a template closer should be annotated as
TT_FunctionDeclarationName.

Fixes #63879.

Differential Revision: https://reviews.llvm.org/D155358
2023-07-15 11:02:31 -07:00
Owen Pan
3f3620e5c9 [clang-format] Correctly annotate */&/&& in operator function calls
Reverts 4986f3f2f220 (but keeps its unit tests) and fixes #49973
differently.

Also fixes bugs that incorrectly annotate the operator keyword as
TT_FunctionDeclarationName in function calls and as TT_Unknown in function
declarations and definitions.

Differential Revision: https://reviews.llvm.org/D154184
2023-07-03 17:49:10 -07:00
Emilia Kond
4986f3f2f2
[clang-format] Correctly annotate operator free function call
The annotator correctly annotates an overloaded operator call when
called as a member function, like `x.operator+(y)`, however, when called
as a free function, like `operator+(x, y)`, the annotator assumed it was
an overloaded operator function *declaration*, instead of a call.

This patch allows for a free function call to correctly be annotated as
a call, but only if the current like cannot be a declaration, usually
within the bodies of a function.

Fixes https://github.com/llvm/llvm-project/issues/49973

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay, Nuullll

Differential Revision: https://reviews.llvm.org/D153798
2023-06-29 19:51:27 +03:00
Owen Pan
7a4cdbeafb [clang-format] Fix bugs in annotating r_paren as C-style cast
Don't annotate r_paren as TT_CastRParen if it's followed by an amp/star and
either the line is in a macro definition or a numeric_constant follows the
amp/star.

Fixes #59634.

Differential Revision: https://reviews.llvm.org/D153745
2023-06-27 16:16:36 -07:00
Emilia Kond
15e14f129f
[clang-format] Preserve AmpAmpTokenType in nested parentheses
When parsing a requires clause, the UnwrappedLineParser would delegate to
parseParens with an AmpAmpTokenType set to BinaryOperator. However,
parseParens would not carry this over into any nested parens, meaning it
could assign a different token type to an && in a requires clause.

This patch makes sure parseParens inherits its parameter when performing
a recursive call.

Fixes https://github.com/llvm/llvm-project/issues/63251

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D153641
2023-06-26 12:39:16 +03:00
Galen Elias
d39929b925 [clang-format] Adjust braced list detection (reland 6dcde65)
This is a retry of https://reviews.llvm.org/D114583, which was backed
out for regressions.

Clang Format is detecting a nested scope followed by another open brace
as a braced initializer list due to incorrectly thinking it's matching a
braced initializer at the end of a constructor initializer list which is
followed by the body open brace.

Unfortunately, UnwrappedLineParser isn't doing a very detailed parse, so
it's not super straightforward to distinguish these cases given the
current structure of calculateBraceTypes. My current hypothesis is that
these can be disambiguated by looking at the token preceding the
l_brace, as initializer list parameters will be preceded by an
identifier, but a scope block generally will not (barring the MACRO
wildcard).

To this end, I am adding tracking of the previous token to the LBraceStack
to help scope this particular case.

TokenAnnotatorTests cherry picked from https://reviews.llvm.org/D150452.

Fixes #33891.
Fixes #52911.

Differential Revision: https://reviews.llvm.org/D150403
2023-05-23 18:50:41 -07:00
Owen Pan
dc4ab97085 [clang-format] Revert 6dcde65 due to missing commit message title
This reverts commit 6dcde658b2380d7ca1451ea5d1099af3e294ea16.
2023-05-23 18:48:52 -07:00
Galen Elias
6dcde658b2 This is a retry of https://reviews.llvm.org/D114583, which was backed
out for regressions.

Clang Format is detecting a nested scope followed by another open brace
as a braced initializer list due to incorrectly thinking it's matching a
braced initializer at the end of a constructor initializer list which is
followed by the body open brace.

Unfortunately, UnwrappedLineParser isn't doing a very detailed parse, so
it's not super straightforward to distinguish these cases given the
current structure of calculateBraceTypes. My current hypothesis is that
these can be disambiguated by looking at the token preceding the
l_brace, as initializer list parameters will be preceded by an
identifier, but a scope block generally will not (barring the MACRO
wildcard).

To this end, I am adding tracking of the previous token to the LBraceStack
to help scope this particular case.

TokenAnnotatorTests cherry picked from https://reviews.llvm.org/D150452.

Fixes #33891.
Fixes #52911.

Differential Revision: https://reviews.llvm.org/D150403
2023-05-22 20:25:55 -07:00
Emilia Kond
e4d3e88802
[clang-format] Don't allow template to be preceded by closing brace
This check is similar to the right paren check right below it, but it
doesn't need the overloaded operator check.

This patch prevents brace-initialized objects that are being compared
from being mis-annotated as template parameters.

Fixes https://github.com/llvm/llvm-project/issues/57004

Reviewed By: owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D150629
2023-05-17 01:37:19 +03:00
sstwcw
e12428557a [clang-format] Recognize Verilog edge identifiers
Previously the event expression would be misidentified as a port list.
A line break would be added after the comma.  The events can be
separated with either a comma or the `or` keyword, and a line break
would not be inserted if the `or` keyword was used.  We changed the
behavior of the comma to match the `or` keyword.

Before:
```
always @(posedge x,
         posedge y)
  x <= x;
always @(posedge x or posedge y)
  x <= x;
```

After:
```
always @(posedge x, posedge y)
  x <= x;
always @(posedge x or posedge y)
  x <= x;
```

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D149561
2023-05-07 05:13:04 +00:00
sstwcw
4134f83610 [clang-format] Recognize Verilog type dimension in module header
We had the function `verilogGroupDecl` for that.  However, the type
name would be incorrectly annotated in `isStartOfName` when it was not
a C++ keyword and followed another identifier.

Reviewed By: HazardyKnusperkeks, owenpan, MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D149352
2023-04-30 22:26:53 +00:00
sstwcw
82a90caa88 [clang-format] Correctly format goto labels followed by blocks
There doesn't seem to be an issue on GitHub.  But previously, a space
would be inserted before the goto colon in the code below.

    switch (x) {
    case 0:
    goto_0: {
      action();
      break;
    }
    }

Previously, the colon following a goto label would be annotated as
`TT_InheritanceColon`.  A goto label followed by an opening brace
wasn't recognized.  It is easy to add another line to have
`spaceRequiredBefore` function recognize the case, but I believed it
is more proper to avoid doing the same thing in `UnwrappedLineParser`
and `TokenAnnotator`.  So now the label colons would be labeled in
`UnwrappedLineParser`, and `spaceRequiredBefore` would rely on that.

Previously we had the type `TT_GotoLabelColon` intended for both goto
labels and case labels.  But since handling of goto labels and case
labels differ somewhat, I split it into separate types for goto and
case labels.

This patch doesn't change the behavior for case labels.  I added the
lines annotating case labels because they would previously be
mistakenly annotated as `TT_InheritanceColon` just like goto labels.
And since I added the annotations, the checks for the `case` and
`default` keywords in `spaceRequiredBefore` are not necessary anymore.

Reviewed By: MyDeveloperDay

Differential Revision: https://reviews.llvm.org/D148484
2023-04-30 22:25:48 +00:00
mydeveloperday
799b794d76 [clang-format] C# short ternary operator misinterpreted as a CSharpNullable
Refactor the CSharpNullable assignment code to be a little easier to read (Honestly I don't like it when an if expression get really long and complicated).
Handle the case where '?' is actually a ternary operator.

Fixes: #58067

Reviewed By: owenpan, HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D148473
2023-04-18 10:17:28 +01:00
sstwcw
b6301a018d [clang-format][NFC] Output tokens on test assert
Reviewed By: rymiel

Differential Revision: https://reviews.llvm.org/D148482
2023-04-16 23:48:07 +00:00
sstwcw
0571ba8d1b [clang-format] Handle Verilog assertions and loops
Assert statements in Verilog can optionally have an else part.  We
handle them like for `if` statements, except that an `if` statement in
the else part of an `assert` statement doesn't get merged with the
`else` keyword.  Like this:

    assert (x)
      $info();
    else
      if (y)
        $info();
      else if (z)
        $info();
      else
        $info();

`foreach` and `repeat` are now handled like for or while loops.

We used the type `TT_ConditionLParen` to mark the condition part so
they are handled in the same way as the condition part of an `if`
statement.  When the code being formatted is not in Verilog, it is
only set for `if` statements, not loops.  It's because loop conditions
are currently handled slightly differently, and existing behavior is
not supposed to change.  We formatted all files ending in `.cpp` and
`.h` in the repository with and without this change.  It showed that
setting the type for `if` statements doesn't change existing behavior.

And we noticed that we forgot to make the program print the list of
tokens when the number is not correct in `TokenAnnotatorTest`.  It's
fixed now.

Reviewed By: HazardyKnusperkeks

Differential Revision: https://reviews.llvm.org/D147895
2023-04-16 21:55:50 +00:00