Summary:
This patch changes the structure for raw string formatting options by making it
language based (enumerate delimiters per language) as opposed to delimiter-based
(specify the language for a delimiter). The raw string formatting now uses an
appropriate style from the .clang-format file, if exists.
Reviewers: bkramer
Reviewed By: bkramer
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D42098
llvm-svn: 322634
Adding the new enumerator forced a bunch more changes into this patch than I
would have liked. The -Wtautological-compare warning was extended to properly
check the new comparison operator, clang-format needed updating because it uses
precedence levels as weights for determining where to break lines (and several
operators increased their precedence levels with this change), thread-safety
analysis needed changes to build its own IL properly for the new operator.
All "real" semantic checking for this operator has been deferred to a future
patch. For now, we use the relational comparison rules and arbitrarily give
the builtin form of the operator a return type of 'void'.
llvm-svn: 320707
When we break a long line like:
Column limit: 21
|
// foo foo foo foo foo foo foo foo foo foo foo foo
The local decision when to allow protruding vs. breaking can lead to this
outcome (2 excess characters, 2 breaks):
// foo foo foo foo foo
// foo foo foo foo foo
// foo foo
While strictly staying within the column limit leads to this strictly better
outcome (fully below the column limit, 2 breaks):
// foo foo foo foo
// foo foo foo foo
// foo foo foo foo
To get an optimal solution, we would need to consider all combinations of excess
characters vs. breaking for all lines, but that would lead to a significant
increase in the search space of the algorithm for little gain.
Instead, we blindly try both approches and·select the one that leads to the
overall lower penalty.
Differential Revision: https://reviews.llvm.org/D40605
llvm-svn: 319541
This fixes some bugs in the reflowing logic and splits out the concerns
of reflowing from BreakableToken.
Things to do after this patch:
- Refactor the breakProtrudingToken function possibly into a class, so we
can split it up into methods that operate on the common state.
- Optimize whitespace compression when reflowing by using the next possible
split point instead of the latest possible split point.
- Retry different strategies for reflowing (strictly staying below the
column limit vs. allowing excess characters if possible).
Differential Revision: https://reviews.llvm.org/D40310
llvm-svn: 319314
Summary:
clang-format already removes empty lines at the beginning & end of
blocks:
int x() {
foo(); // lines before and after will be removed.
}
However because lamdas and arrow functions are parsed as expressions,
the existing logic to remove empty lines in UnwrappedLineFormatter
doesn't handle them.
This change special cases arrow functions in ContinuationIndenter to
remove empty lines:
x = []() {
foo(); // lines before and after will now be removed.
};
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D40178
llvm-svn: 318537
For each line that we break in a protruding token, compute whether the
penalty of breaking is actually larger than the penalty of the excess
characters. Only break if that is the case.
llvm-svn: 318515
Create more orthogonal pieces. The restructuring made it easy to try out
several alternatives to D33589, and while none of the alternatives
turned out to be the right solution, the underlying simplification of
the structure is helpful.
Differential Revision: https://reviews.llvm.org/D39900
llvm-svn: 318141
Summary:
This patch enables `BreakableToken` to manage the formatting of non-trailing
block comments. It is a refinement of https://reviews.llvm.org/D37007.
We discovered that the optimizer outsmarts us on cases where breaking the comment
costs considerably less than breaking after the comment. This patch addresses
this by ensuring that a newline is inserted between a block comment and the next
token.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D37695
llvm-svn: 315893
Summary:
This is an implementation for [bug 17362](https://bugs.llvm.org/attachment.cgi?bugid=17362) which adds support for indenting preprocessor statements inside if/ifdef/endif. This takes previous work from fmauch (https://github.com/fmauch/clang/tree/preprocessor_indent) and makes it into a full feature.
The context of this patch is that I'm a VMware intern, and I implemented this because VMware needs the feature. As such, some decisions were made based on what VMware wants, and I would appreciate suggestions on expanding this if necessary to use-cases other people may want.
This adds a new enum config option, `IndentPPDirectives`. Values are:
* `PPDIS_None` (in config: `None`):
```
#if FOO
#if BAR
#include <foo>
#endif
#endif
```
* `PPDIS_AfterHash` (in config: `AfterHash`):
```
#if FOO
# if BAR
# include <foo>
# endif
#endif
```
This is meant to work whether spaces or tabs are used for indentation. Preprocessor indentation is independent of indentation for non-preprocessor lines.
Preprocessor indentation also attempts to ignore include guards with the checks:
1. Include guards cover the entire file
2. Include guards don't have `#else`
3. Include guards begin with
```
#ifndef <var>
#define <var>
```
This patch allows `UnwrappedLineParser::PPBranchLevel` to be decremented to -1 (the initial value is -1) so the variable can be used for indent tracking.
Defects:
* This patch does not handle the case where there's code between the `#ifndef` and `#define` but all other conditions hold. This is because when the #define line is parsed, `UnwrappedLineParser::Lines` doesn't hold the previous code line yet, so we can't detect it. This is out of the scope of this patch.
* This patch does not handle cases where legitimate lines may be outside an include guard. Examples are `#pragma once` and `#pragma GCC diagnostic`, or anything else that does not change the meaning of the file if it's included multiple times.
* This does not detect when there is a single non-preprocessor line in front of an include-guard-like structure where other conditions hold because `ScopedLineState` hides the line.
* Preprocessor indentation throws off `TokenAnnotator::setCommentLineLevels` so the indentation of comments immediately before indented preprocessor lines is toggled on each run. Fixing this issue appears to be a major change and too much complexity for this patch.
Contributed by @euhlmann!
Reviewers: djasper, klimek, krasimir
Reviewed By: djasper, krasimir
Subscribers: krasimir, mzeren-vmw, cfe-commits
Tags: #clang
Differential Revision: https://reviews.llvm.org/D35955
llvm-svn: 312125
Summary:
Previously, clang-format would try to wrap template string substitutions
by indenting relative to the openening `${`. This helped with
indenting structured strings, such as strings containing HTML, as the
substitutions would be aligned according to the structure of the string.
However it turns out that the overwhelming majority of template string +
substitution usages are for substitutions into non-structured strings,
e.g. URLs or just plain messages. For these situations, clang-format
would often produce very ugly indents, in particular for strings
containing no line breaks:
return `<a href='http://google3/${file}?l=${row}'>${file}</a>(${
row
},${
col
}): `;
This change makes clang-format indent template string substitutions as
if they were string concatenation operations. It wraps +4 on overlong
lines and keeps all operands on the same line:
return `<a href='http://google3/${file}?l=${row}'>${file}</a>(${
row},${col}): `;
While this breaks some lexical continuity between the `${` and `row}`
here, the overall effects are still a huge improvement, and users can
still manually break the string using `+` if desired.
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D37142
llvm-svn: 311988
This reverts commit r311457. It reveals some dormant bugs in comment
reflowing, like breaking a single line jsdoc type annotation before a
parameter into multiple lines.
llvm-svn: 311641
Summary:
This patch is an alternative to https://reviews.llvm.org/D36614, by resolving a
non-idempotency issue by breaking non-trailing comments:
Consider formatting the following code with column limit at `V`:
```
V
const /* comment comment */ A = B;
```
The comment is not a trailing comment, breaking before it doesn't bring it under
the column limit. The formatter breaks after it, resulting in:
```
V
const /* comment comment */
A = B;
```
For a next reformat, the formatter considers the comment as a trailing comment,
so it is free to break it further, resulting in:
```
V
const /* comment
comment */
A = B;
```
This patch improves the situation by directly producing the third case.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D37007
llvm-svn: 311457
Summary:
This handles a case where the trailing '*/' of a multiline jsdoc eding in a
comment pragma wouldn't be put on a new line.
Reviewers: mprobst
Reviewed By: mprobst
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D36359
llvm-svn: 310458
Summary:
This patch fixes the indentation of the code pattern `key <...>`and `key {...}` in text protos.
Previously, such line would be alinged depending on the column of the previous
colon, which usually indents too much.
I'm gonna go ahead and commit this since it's a straightforward bugfix.
Reviewers: djasper, klimek
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D36143
llvm-svn: 309941
Summary:
This patch tries to avoid binpacking when initializing lists/arrays, to allow things like:
static int types[] = {
registerType1(),
registerType2(),
registerType3(),
};
std::map<int, std::string> x = {
{ 0, "foo fjakfjaklf kljj" },
{ 1, "bar fjakfjaklf kljj" },
{ 2, "stuff fjakfjaklf kljj" },
};
This is similar to how dictionnaries are formatted, and actually corresponds to the same conditions: when initializing a container (and not just 'calling' a constructor).
Such formatting involves 2 things:
* Line breaks around the content of the block. This can be forced by adding a comma or comment after the last element
* Elements should not be binpacked
This patch considers the block is an initializer list if it either ends with a comma, or follows an assignment, which seems to provide a sensible approximation.
Reviewers: krasimir, djasper
Reviewed By: djasper
Subscribers: malcolm.parsons, klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D34238
llvm-svn: 306868
Summary:
This patch adds support for <>-style proto message fields inside proto options.
Previously these were wrongly treated as binary operators and as such were
working only by chance for a limited number of cases.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D34621
llvm-svn: 306406
Summary:
This fixes the missing space before the designated initializer when `Cpp11BracedListStyle=false` :
const struct A a = { .a = 1, .b = 2 };
^
Also, wrapping between opening brace and designated array initializers used to have an excessive penalty (like breaking between an expression and the subscript operator), leading to unexpected wrapping:
const struct Aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaa =
{[1] = aaaaaaaaaaaaaaaaaaaaaaaaaaa,
[2] = bbbbbbbbbbbbbbbbbbbbbbbbbbb};
instead of:
const struct Aaaaaaaaaaaaaaaa aaaaaaaaaaaaaaaaaa = {
[1] = aaaaaaaaaaaaaaaaaaaaaaaaaaa,
[2] = bbbbbbbbbbbbbbbbbbbbbbbbbbb};
Finally, designated array initializers are not binpacked, just like designated member initializers.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, krasimir, klimek
Differential Revision: https://reviews.llvm.org/D33491
llvm-svn: 305696
c++1z adds the following constructions to the language:
if constexpr (cond)
statement1;
else if constexpr (cond)
statement2;
else if constexpr (cond)
statement3;
else
statement4;
A first version of this was proposed in reviews.llvm.org/D26953 by
Francis Visoiu Mistrih, but never commited. This patch additionally
fixes the behavior when allowing short if statements on a single line
and was authored by Jacob Bandes-Storch. Thank you to both authors.
llvm-svn: 305666
Nested literals are sometimes only indented by 2 spaces, instead of
respecting the IndentWidth option.
There are existing unit tests (FormatTestJS.ArrayLiterals) that only
pass because the style used to test them uses an IndentWidth of 2.
This change removes the magic 2 and always uses the IndentWidth.
I've added 6 tests. The first 4 of these tests fail before this change,
while the last 2 already pass, but were added just to make sure it the
change works with all types of braces.
Patch originally by Jared Neil, thanks!
Differential Revision: https://reviews.llvm.org/D33857
llvm-svn: 304791
Summary:
The previous fix to force build style wrapping if the previous token is a closing parenthesis broke a peculiar pattern where users parenthesize the function declaration in a bind call:
fn((function() { ... }).bind(this));
This restores the previous behaviour by reverting that change, but narrowing the special case for unindenting closing parentheses to those followed by semicolons and opening braces, i.e. immediate calls and function declarations.
Reviewers: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D33640
llvm-svn: 304135
Summary:
This option replaces the BreakConstructorInitializersBeforeComma option with an enum, thus introducing a mode where the colon stays on the same line as constructor declaration:
// When it fits on line:
Constructor() : initializer1(), initializer2() {}
// When it does not fit:
Constructor() :
initializer1(), initializer2()
{}
// When ConstructorInitializerAllOnOneLineOrOnePerLine = true:
Constructor() :
initializer1(),
initializer2()
{}
Reviewers: krasimir, djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D32479
llvm-svn: 303739
The change that enabled wrapping at the previous scope's indentation had
unintended side-effects in that clang-format would prefer to wrap
closing parentheses to the next line if it avoided a wrap on the next
line (assuming very narrow lines):
fooObject
.someCall(barbazbam)
.then(bam);
Would get formatted as:
fooObject.someCall(barbazbam
).then(bam);
Because the ')' is now indented at the parent level (fooObject).
Normally formatting a builder pattern style call sequence like that is
outlawed in clang-format anyway. However for JavaScript this is special
cased to support trailing .bind calls.
This change disallows this special case when following a closing ')' to
avoid the problem.
Included are some random comment fixes.
Reviewers: djasper
Subscribers: klimek, cfe-commits
Differential Revision: https://reviews.llvm.org/D33399
llvm-svn: 303557
Summary:
This patch updates the handling of multiline trailing comment sections in
import statement lines to make it more consistent with the case in general.
This includes updating the parsing logic to collect the trailing comment
sections and the formatting logic to not insert escaped newlines at the end of
comment lines in import statement lines.
Specifically, before this patch this code:
```
#include <a> // line 1
// line 2
```
will be turned into two unwrapped lines, whereas this code:
```
int i; // line 1
// line 2
```
is turned into a single unwrapped line, enabling reflowing across comments.
An example where the old behaviour is bad is when partially formatting the lines
3 to 4 of this code:
```
#include <a> // line 1
// line 2
int i;
```
which gets turned into:
```
#include <a> // line 1
// line 2
int i;
```
because the two comment lines were independent and the indent was copied.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D33351
llvm-svn: 303415
myFunction(param1, param2,);
For symmetry with other parenthesized lists ([...], {...}), clang-format should
wrap parenthesized lists one-per-line if they contain a trailing comma:
myFunction(
param1,
param2,
);
This is particularly useful in function declarations or calls with many
arguments, e.g. commonly in constructors.
Differential Revision: https://reviews.llvm.org/D33023
llvm-svn: 303049
Before:
std::function<
LoooooooooooongTemplatedType<SomeType>*(
LooooooooooooooooooooongType
type)>
function;
After:
std::function<
LoooooooooooongTemplatedType<
SomeType>*(
LooooooooooooooooongType type)>
function;
clang-format generally avoids having lines like "SomeType>*(" as they
lead to parameter lists that don't belong together to be aligned. However, in
case it is better than the alternative, which can even be violating the column
limit.
llvm-svn: 301182
Summary:
This patch makes ContinuationIndenter call breakProtrudingToken only if
NoLineBreak and NoLineBreakInOperand is false.
Previously, clang-format required two runs to converge on the following example with 24 columns:
Note that the second operand shouldn't be splitted according to NoLineBreakInOperand, but the
token breaker doesn't take that into account:
```
func(a, "long long long long", c);
```
After first run:
```
func(a, "long long "
"long long",
c);
```
After second run, where NoLineBreakInOperand is taken into account:
```
func(a,
"long long "
"long long",
c);
```
With the patch, clang-format now obtains in one run:
```
func(a,
"long long long"
"long",
c);
```
which is a better token split overall.
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D30575
llvm-svn: 297274
Summary:
This patch enables comment reflowing of lines not matching the comment pragma regex
in multiline comments containing comment pragma lines. Previously, these comments
were dumped without being reindented to the result.
Reviewers: djasper, mprobst
Reviewed By: mprobst
Subscribers: klimek, mprobst, cfe-commits
Differential Revision: https://reviews.llvm.org/D30697
llvm-svn: 297261
r289428 added a separate language kind for Objective-C, but kept many
"Language == LK_Cpp" checks untouched. This introduced a "IsCpp()"
method that returns true for both C++ and Objective-C++, and replaces
all comparisons of Language with LK_Cpp with calls to this new method.
Also add a lot more test coverge for formatting things in LK_ObjC mode,
by having FormatTest's verifyFormat() test for LK_ObjC everything that's
being tested for LK_Cpp at the moment.
Fixes PR32060 and many other things.
llvm-svn: 296160
Specifically, similar to other blocks, clang-format now wraps both
after "${" and before the corresponding "}", if the contained
expression spans multiple lines.
llvm-svn: 295663
Summary:
The comment reflower wasn't taking comment pragmas as reflow stoppers. This patch fixes that.
source:
```
// long long long long
// IWYU pragma:
```
format with column limit = 20 before:
```
// long long long
// long IWYU pragma:
```
format with column limit = 20 after:
```
// long long long
// long
// IWYU pragma:
```
Reviewers: djasper
Reviewed By: djasper
Subscribers: cfe-commits, klimek
Differential Revision: https://reviews.llvm.org/D29450
llvm-svn: 293898
Without alignment, there is no clean separation between the arguments, even if
there are only two.
Before:
aaaaaaaaaaaaaa(
aaaaaaaaaaaa, aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
After:
aaaaaaaaaaaaaa(aaaaaaaaaaaa,
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa +
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa);
llvm-svn: 293875
This rows back on r288120, r291801 and r292110. I apologize in advance
for the churn. All of those revisions where meant to make the wrapping
of RHS expressions more consistent. However, now that they are
consistent, we seem to be a bit too eager.
The reasoning here is that I think it is generally correct that we want
to line-wrap before multiline RHS expressions (or multiline arguments to
a function call). However, if there are only two of such operands or
arguments, there is always a clear vertical separation between them and
the additional line break seems much less desirable.
Somewhat good examples are expressions like:
EXPECT_EQ(2, someLongExpression(
orCall));
llvm-svn: 293752
This only affects expressions inside ${} scopes of template strings.
Here, we want to indent relative to the surrounding template string and
not the surrounding expression. Otherwise, this can create quite a mess.
Before:
var f = `
aaaaaaaaaaaaaaaaaa: ${someFunction(
aaaaa + //
bbbb)}`;
After:
var f = `
aaaaaaaaaaaaaaaaaa: ${someFunction(
aaaaa + //
bbbb)}`;
llvm-svn: 293636
The main motivation behind this is to cleanup the WhitespaceManager and
make it more extensible for future alignment etc. features.
Specifically, WhitespaceManager has started to copy more and more code
that is already present in FormatToken. Instead, I think it makes more
sense to actually store a reference to each FormatToken for each change.
This has as a consequence led to a change in the calculation of indent
levels. Now, we actually compute them for each Token ahead of time,
which should be more efficient as it removes an unsigned value for the
ParenState, which is used during the combinatorial exploration of the
solution space.
No functional changes intended.
Review: https://reviews.llvm.org/D29300
llvm-svn: 293616
This had significant negative consequences and I don't have a good
solution for it yet.
Before:
var string =
[
'aaaaaa',
'bbbbbb',
]
.join('+');
After:
var string = [
'aaaaaa',
'bbbbbb',
].join('+');
llvm-svn: 293465
Summary:
This presents a version of the comment reflowing with less mutable state inside
the comment breakable token subclasses. The state has been pushed into the
driving breakProtrudingToken method. For this, the API of BreakableToken is enriched
by the methods getSplitBefore and getLineLengthAfterSplitBefore.
Reviewers: klimek
Reviewed By: klimek
Subscribers: djasper, klimek, mgorny, cfe-commits, ioeric
Differential Revision: https://reviews.llvm.org/D28764
llvm-svn: 293055