7 Commits

Author SHA1 Message Date
Dmitri Gribenko
f26054f0fb Enable comment parsing and semantic analysis to emit diagnostics. A few
diagnostics implemented -- see testcases.

I created a new TableGen file for comment diagnostics,
DiagnosticCommentKinds.td, because comment diagnostics don't logically
fit into AST diagnostics file.  But I don't feel strongly about it.

This also implements support for self-closing HTML tags in comment
lexer and parser (for example, <br />).

In order to issue precise diagnostics CommentSema needs to know the
declaration the comment is attached to.  There is no easy way to find a decl by 
comment, so we match comments and decls in lockstep: after parsing one
declgroup we check if we have any new, not yet attached comments.  If we do --
then we do the usual comment-finding process.

It is interesting that this automatically handles trailing comments.
We pick up not only comments that precede the declaration, but also
comments that *follow* the declaration -- thanks to the lookahead in
the lexer: after parsing the declgroup we've consumed the semicolon
and looked ahead through comments.

Added -Wdocumentation-html flag for semantic HTML errors to allow the user to 
disable only HTML warnings (but not HTML parse errors, which we emit as
warnings in -Wdocumentation).

llvm-svn: 160078
2012-07-11 21:38:39 +00:00
Dmitri Gribenko
17709ae8d9 Comment lexing: fix lexing to actually work in non-error cases.
llvm-svn: 159963
2012-07-09 21:32:40 +00:00
Dmitri Gribenko
ec92531c29 Implement AST classes for comments, a real parser for Doxygen comments and a
very simple semantic analysis that just builds the AST; minor changes for lexer
to pick up source locations I didn't think about before.

Comments AST is modelled along the ideas of HTML AST: block and inline content.

* Block content is a paragraph or a command that has a paragraph as an argument
  or verbatim command.
* Inline content is placed within some block.  Inline content includes plain
  text, inline commands and HTML as tag soup.

llvm-svn: 159790
2012-07-06 00:28:32 +00:00
Dmitri Gribenko
632d58afab Fix an infinite loop in comment lexer: we were not advancing in the input character stream when we saw a '<' that is not a start of an HTML tag.
llvm-svn: 159303
2012-06-27 23:28:29 +00:00
Dmitri Gribenko
1669f70273 Remove unsigned and a pointer from a comment token (so that each token can have only one semantic string value attached to it), at a cost of adding an additional token.
llvm-svn: 159270
2012-06-27 16:53:58 +00:00
Dmitri Gribenko
60ddd8a1b1 Comment lexer: counting backwards from token end is thought to be confusing. We already have a pointer to the beginning of the token, so use it to extract the text instead.
llvm-svn: 159269
2012-06-27 16:30:35 +00:00
Dmitri Gribenko
5188c4b9cc Implement a lexer for structured comments.
llvm-svn: 159223
2012-06-26 20:39:18 +00:00