A Javadoc comment is a multiline comment that starts with the * character and is placed above a class, interface, enum, method, or field definition. The comment should be written in XHTML to be processed correctly by Checkstyle. This means that every HTML tag must either have a matching closing tag or be self-closing (a singleton tag). The only exceptions are <p>, <li>, <tr>, <td>, <th>, <body>, <colgroup>, <dd>, <dt>, <head>, <html>, <option>, <tbody>, <thead>, <tfoot>: Checkstyle won't report an error about a missing closing tag for these. However, omitting the closing tag still breaks the XHTML structure and therefore produces an incorrect Abstract Syntax Tree for the Javadoc comment. See the examples in the "HTML Code In Javadoc Comments" chapter.
To start implementing your own Check, create a new class that extends AbstractJavadocCheck. It has two abstract methods:
The Javadoc parser requires XHTML in Javadoc comments, i.e. if there is an opening tag (for example <div>), there has to be a matching closing tag </div>. If a Javadoc comment has an incorrect XHTML structure, the Javadoc parser fails to process the comment, so your new Check cannot get its parse tree and cannot process anything from that Javadoc comment. For more details and examples, see the "HTML code in Javadoc comments" section.
The Java grammar parses a Java file according to the language specification, so it recognizes single-line comments and multiline (block) comments. The Java compiler knows nothing about Javadoc, because to it a Javadoc comment is just a multiline comment. To parse a multiline comment as a Javadoc comment, Checkstyle has a second grammar, the Javadoc grammar, which processes block comments and parses them into an Abstract Syntax Tree. The problem is that the Java grammar is older and uses ANTLR v2, while the Javadoc grammar uses ANTLR v4. Because of that, the two grammars and their trees are not compatible: the Java AST consists of DetailAST objects, while the Javadoc AST consists of DetailNode objects.
Checkstyle can print the Abstract Syntax Tree, including Javadoc trees. Run the Checkstyle jar file with the -J argument and pass it a Java file.
For example, here is a Java file:
/**
 * My <b>class</b>.
 * @see AbstractClass
 */
public class MyClass {

}
Command:
java -jar checkstyle-6.18-all.jar -J MyClass.java
Output:
CLASS_DEF -> CLASS_DEF [5:0]
|--MODIFIERS -> MODIFIERS [5:0]
|   |--JAVADOC -> \r\n * My <b>class</b>.\r\n * @see AbstractClass\r\n <EOF> [1:0]
|   |   |--NEWLINE -> \r\n [1:0]
|   |   |--LEADING_ASTERISK -> * [2:0]
|   |   |--TEXT -> My [2:2]
|   |   |   |--WS -> [2:2]
|   |   |   |--CHAR -> M [2:3]
|   |   |   |--CHAR -> y [2:4]
|   |   |   `--WS -> [2:5]
|   |   |--HTML_ELEMENT -> <b>class</b> [2:6]
|   |   |   `--HTML_TAG -> <b>class</b> [2:6]
|   |   |       |--HTML_ELEMENT_OPEN -> <b> [2:6]
|   |   |       |   |--OPEN -> < [2:6]
|   |   |       |   |--HTML_TAG_NAME -> b [2:7]
|   |   |       |   `--CLOSE -> > [2:8]
|   |   |       |--TEXT -> class [2:9]
|   |   |       |   |--CHAR -> c [2:9]
|   |   |       |   |--CHAR -> l [2:10]
|   |   |       |   |--CHAR -> a [2:11]
|   |   |       |   |--CHAR -> s [2:12]
|   |   |       |   `--CHAR -> s [2:13]
|   |   |       `--HTML_ELEMENT_CLOSE -> </b> [2:14]
|   |   |           |--OPEN -> < [2:14]
|   |   |           |--SLASH -> / [2:15]
|   |   |           |--HTML_TAG_NAME -> b [2:16]
|   |   |           `--CLOSE -> > [2:17]
|   |   |--TEXT -> . [2:18]
|   |   |   `--CHAR -> . [2:18]
|   |   |--NEWLINE -> \r\n [2:19]
|   |   |--LEADING_ASTERISK -> * [3:0]
|   |   |--WS -> [3:2]
|   |   |--JAVADOC_TAG -> @see AbstractClass\r\n [3:3]
|   |   |   |--SEE_LITERAL -> @see [3:3]
|   |   |   |--WS -> [3:7]
|   |   |   |--REFERENCE -> AbstractClass [3:8]
|   |   |   |   `--CLASS -> AbstractClass [3:8]
|   |   |   |--NEWLINE -> \r\n [3:21]
|   |   |   `--WS -> [4:0]
|   |   `--EOF -> <EOF> [4:1]
|   `--LITERAL_PUBLIC -> public [5:0]
|--LITERAL_CLASS -> class [5:7]
|--IDENT -> MyClass [5:13]
`--OBJBLOCK -> OBJBLOCK [5:21]
    |--LCURLY -> { [5:21]
    `--RCURLY -> } [7:0]
As you can see, a very small Java file turns into a huge Abstract Syntax Tree, because this is the most detailed tree and includes every component of the Java file: classes, methods, comments, etc. In most cases, while developing a Javadoc Check you need only the parse tree of one particular Javadoc comment. To get it, copy the Javadoc comment to a separate file and remove /** at the beginning and */ at the end. Then run Checkstyle with the -j argument.
File:
 * My <b>class</b>.
 * @see AbstractClass
Command:
java -jar checkstyle-6.18-SNAPSHOT-all.jar -j MyJavadocComment.javadoc
Output:
JAVADOC -> * My <b>class</b>.\r\n * @see AbstractClass<EOF> [0:0]
|--LEADING_ASTERISK -> * [0:0]
|--TEXT -> My [0:2]
|   |--WS -> [0:2]
|   |--CHAR -> M [0:3]
|   |--CHAR -> y [0:4]
|   `--WS -> [0:5]
|--HTML_ELEMENT -> <b>class</b> [0:6]
|   `--HTML_TAG -> <b>class</b> [0:6]
|       |--HTML_ELEMENT_OPEN -> <b> [0:6]
|       |   |--OPEN -> < [0:6]
|       |   |--HTML_TAG_NAME -> b [0:7]
|       |   `--CLOSE -> > [0:8]
|       |--TEXT -> class [0:9]
|       |   |--CHAR -> c [0:9]
|       |   |--CHAR -> l [0:10]
|       |   |--CHAR -> a [0:11]
|       |   |--CHAR -> s [0:12]
|       |   `--CHAR -> s [0:13]
|       `--HTML_ELEMENT_CLOSE -> </b> [0:14]
|           |--OPEN -> < [0:14]
|           |--SLASH -> / [0:15]
|           |--HTML_TAG_NAME -> b [0:16]
|           `--CLOSE -> > [0:17]
|--TEXT -> . [0:18]
|   `--CHAR -> . [0:18]
|--NEWLINE -> \r\n [0:19]
|--LEADING_ASTERISK -> * [1:0]
|--WS -> [1:2]
|--JAVADOC_TAG -> @see AbstractClass [1:3]
|   |--SEE_LITERAL -> @see [1:3]
|   |--WS -> [1:7]
|   `--REFERENCE -> AbstractClass [1:8]
|       `--CLASS -> AbstractClass [1:8]
`--EOF -> <EOF> [1:21]
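The manual step described above (copy the comment, remove /** at the beginning and */ at the end) can be sketched in a few lines of Java. This is only an illustration, not part of Checkstyle: the class name JavadocBodyExtractor and the method stripDelimiters are hypothetical, and dropping the line break after /** and the whitespace before */ is an assumption made so that the result matches the example file shown earlier.

```java
public class JavadocBodyExtractor {

    /**
     * Removes the opening and closing delimiters from a Javadoc comment.
     * Assumption: the line break right after the opening delimiter and the
     * whitespace before the closing delimiter are dropped as well.
     */
    public static String stripDelimiters(String comment) {
        String trimmed = comment.trim();
        // Length check rejects "/**/", where the two delimiters overlap.
        if (trimmed.length() < 5 || !trimmed.startsWith("/**") || !trimmed.endsWith("*/")) {
            throw new IllegalArgumentException("Not a Javadoc comment");
        }
        String body = trimmed.substring(3, trimmed.length() - 2);
        body = body.replaceFirst("^\\r?\\n", ""); // line break after the opening delimiter
        body = body.replaceFirst("\\s+$", "");    // whitespace before the closing delimiter
        return body;
    }

    public static void main(String[] args) {
        String comment = "/**\n * My <b>class</b>.\n * @see AbstractClass\n */";
        // Prints the two comment lines without the surrounding delimiters.
        System.out.println(stripDelimiters(comment));
    }
}
```

The returned text can then be saved to a file such as MyJavadocComment.javadoc and passed to Checkstyle with the -j argument.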
Examples:
1) Unclosed paragraph HTML tag. As you can see in the tree, the content of each paragraph is not nested under its tag. That is because the <p> tags are not closed by a matching </p> tag, and Checkstyle requires XHTML to parse Javadoc comments correctly.
<p> First
<p> Second
JAVADOC -> <p> First\r\n<p> Second<EOF> [0:0]
|--HTML_ELEMENT -> <p> [0:0]
|   `--P_TAG_OPEN -> <p> [0:0]
|       |--OPEN -> < [0:0]
|       |--P_HTML_TAG_NAME -> p [0:1]
|       `--CLOSE -> > [0:2]
|--TEXT -> First [0:3]
|   |--WS -> [0:3]
|   |--CHAR -> F [0:4]
|   |--CHAR -> i [0:5]
|   |--CHAR -> r [0:6]
|   |--CHAR -> s [0:7]
|   `--CHAR -> t [0:8]
|--NEWLINE -> \r\n [0:9]
|--HTML_ELEMENT -> <p> [1:0]
|   `--P_TAG_OPEN -> <p> [1:0]
|       |--OPEN -> < [1:0]
|       |--P_HTML_TAG_NAME -> p [1:1]
|       `--CLOSE -> > [1:2]
|--TEXT -> Second [1:3]
|   |--WS -> [1:3]
|   |--CHAR -> S [1:4]
|   |--CHAR -> e [1:5]
|   |--CHAR -> c [1:6]
|   |--CHAR -> o [1:7]
|   |--CHAR -> n [1:8]
|   `--CHAR -> d [1:9]
`--EOF -> <EOF> [1:10]
2) Here is the correct version, with both opening and closing HTML tags.
<p> First </p>
<p> Second </p>
JAVADOC -> <p> First </p>\r\n<p> Second </p><EOF> [0:0]
|--HTML_ELEMENT -> <p> First </p> [0:0]
|   `--PARAGRAPH -> <p> First </p> [0:0]
|       |--P_TAG_OPEN -> <p> [0:0]
|       |   |--OPEN -> < [0:0]
|       |   |--P_HTML_TAG_NAME -> p [0:1]
|       |   `--CLOSE -> > [0:2]
|       |--TEXT -> First [0:3]
|       |   |--WS -> [0:3]
|       |   |--CHAR -> F [0:4]
|       |   |--CHAR -> i [0:5]
|       |   |--CHAR -> r [0:6]
|       |   |--CHAR -> s [0:7]
|       |   |--CHAR -> t [0:8]
|       |   `--WS -> [0:9]
|       `--P_TAG_CLOSE -> </p> [0:10]
|           |--OPEN -> < [0:10]
|           |--SLASH -> / [0:11]
|           |--P_HTML_TAG_NAME -> p [0:12]
|           `--CLOSE -> > [0:13]
|--NEWLINE -> \r\n [0:14]
|--HTML_ELEMENT -> <p> Second </p> [1:0]
|   `--PARAGRAPH -> <p> Second </p> [1:0]
|       |--P_TAG_OPEN -> <p> [1:0]
|       |   |--OPEN -> < [1:0]
|       |   |--P_HTML_TAG_NAME -> p [1:1]
|       |   `--CLOSE -> > [1:2]
|       |--TEXT -> Second [1:3]
|       |   |--WS -> [1:3]
|       |   |--CHAR -> S [1:4]
|       |   |--CHAR -> e [1:5]
|       |   |--CHAR -> c [1:6]
|       |   |--CHAR -> o [1:7]
|       |   |--CHAR -> n [1:8]
|       |   |--CHAR -> d [1:9]
|       |   `--WS -> [1:10]
|       `--P_TAG_CLOSE -> </p> [1:11]
|           |--OPEN -> < [1:11]
|           |--SLASH -> / [1:12]
|           |--P_HTML_TAG_NAME -> p [1:13]
|           `--CLOSE -> > [1:14]
`--EOF -> <EOF> [1:15]
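To make the XHTML rule behind these two examples concrete, here is a minimal sketch of a tag balance check. It is not Checkstyle's parser and does not use the Checkstyle API. The class TagBalanceSketch and its methods are hypothetical names, the tag regex is a simplification that does not validate attributes, and the sketch does not handle Javadoc inline tags such as {@code ...}. It only models the rule stated in this document: every opening tag needs a closing tag unless it is self-closed, and a missing closing tag is not reported for the listed exceptions.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagBalanceSketch {

    // Tags listed in this document as exceptions: a missing closing tag is not reported.
    static final Set<String> OPTIONAL_CLOSE = new HashSet<>(Arrays.asList(
            "p", "li", "tr", "td", "th", "body", "colgroup", "dd", "dt", "head",
            "html", "option", "tbody", "thead", "tfoot"));

    // Matches <name ...>, </name> and <name ... />; attributes are not validated.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>");

    /** Returns the names of all opening tags that never get a matching closing tag. */
    public static List<String> unclosedTags(String text) {
        Deque<String> open = new ArrayDeque<>();
        List<String> unclosed = new ArrayList<>();
        Matcher m = TAG.matcher(text);
        while (m.find()) {
            String name = m.group(2).toLowerCase();
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosed = !m.group(3).isEmpty();
            if (closing) {
                // A stray closing tag with no matching opening tag is ignored here.
                if (open.contains(name)) {
                    // Tags opened after the matching one were left unclosed.
                    while (!open.peek().equals(name)) {
                        unclosed.add(open.pop());
                    }
                    open.pop();
                }
            } else if (!selfClosed) {
                open.push(name);
            }
        }
        // Whatever is still open at the end of the comment was never closed.
        while (!open.isEmpty()) {
            unclosed.add(open.pop());
        }
        return unclosed;
    }

    /** True if at least one unclosed tag is not in the exception list. */
    public static boolean hasReportableError(String text) {
        for (String name : unclosedTags(text)) {
            if (!OPTIONAL_CLOSE.contains(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(unclosedTags("<p> First\n<p> Second"));
        System.out.println(unclosedTags("<p> First </p>\n<p> Second </p>"));
        System.out.println(hasReportableError("<div> text"));
    }
}
```

For the first example the sketch finds two unclosed <p> tags but no reportable error, which mirrors the situation above: nothing is flagged, yet the paragraph content ends up outside the tag in the tree. For the second example it finds no unclosed tags.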