What is a Javadoc comment

A Javadoc comment is a multiline comment /* */ that starts with the * character and is placed above a class definition, interface definition, enum definition, method definition or field definition.

For example, here is a Java file:

/**
 * My <b>class</b>.
 * @see AbstractClass
 */
public class MyClass {

}
      
Javadoc content is:
 * My <b>class</b>.
 * @see AbstractClass
      
Note that a Java block comment starts with /*, followed by the identifier of the comment type. The Javadoc identifier is *. All symbols after the Javadoc identifier up to */ are part of the Javadoc comment.

On the internet you can find other documentation generation tools similar to javadoc. Such tools rely on their own identifiers: "!", "#", "$". Their comments look like "/*! some comment */", "/*# some comment */", "/*$ some comment */". Such multiline comments are not Javadoc comments.
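
The rule above can be expressed as a small sketch. This is an illustration only, not Checkstyle's implementation: the class JavadocCommentSketch and its methods are hypothetical names, and the sketch assumes the whole block comment is available as a string.

```java
/** Illustrative helper, not part of Checkstyle. */
final class JavadocCommentSketch {

    private JavadocCommentSketch() {
    }

    /** Returns true if the block comment starts with the Javadoc identifier. */
    static boolean isJavadoc(String blockComment) {
        // "/**/" is an empty block comment, so at least 5 characters are required.
        return blockComment.startsWith("/**")
                && blockComment.endsWith("*/")
                && blockComment.length() >= 5;
    }

    /** Returns everything between the Javadoc identifier and the closing delimiter. */
    static String javadocContent(String blockComment) {
        if (!isJavadoc(blockComment)) {
            throw new IllegalArgumentException("Not a Javadoc comment");
        }
        // Skip the leading "/**" (3 characters) and the trailing "*/" (2 characters).
        return blockComment.substring(3, blockComment.length() - 2);
    }
}
```

With this sketch, isJavadoc("/** My class. */") is true, while isJavadoc("/*! some comment */") is false because "!" is not the Javadoc identifier.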

Limitations

By specification, a Javadoc comment may contain any HTML tags, which lets the user generate whatever content is needed. Checkstyle cannot parse arbitrary HTML-like content, so some limitations apply. The comment should be written in XHTML so that Checkstyle can build the nested AST that most Checks expect. This means that every HTML tag should either have a matching closing tag or be a self-closed tag (singleton tag). The only exceptions are HTML 4 tags that do not require a closing tag and HTML 4 singleton tags: Checkstyle will not report an error about a missing closing tag for them. However, leaving them unclosed breaks the XHTML structure, and the content of such tags is not nested in the Abstract Syntax Tree of the Javadoc comment. For more details about HTML in the AST, read the HTML Code In Javadoc Comments section.

The Javadoc grammar requires XHTML, but it can also parse some HTML constructs (such as certain unclosed tags). If HTML tags are not closed, the Javadoc grammar cannot determine the content of these tags, so the parse tree will not be nested as it is with XHTML. This is done only to avoid failing on every Javadoc comment, because unclosed tags are very common in existing code.

How to create Javadoc Check

The principle of writing Javadoc Checks is similar to writing regular Checks. You just extend a different class and use different token types.

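To start implementing a new Check, create a new class and extend AbstractJavadocCheck. It has two abstract methods you should implement:

getDefaultJavadocTokens() - returns the Javadoc token types the Check is interested in;
visitJavadocToken(DetailNode) - is called for each node of those token types in the Javadoc tree.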

Difference between Java Grammar and Javadoc comments Grammar

The Java grammar parses a Java file based on the Java language specification. So it has single-line comments and multiline/block comments. The Java compiler does not know about Javadoc, because to it a Javadoc comment is just a multiline comment. To parse a multiline comment as a Javadoc comment, Checkstyle has a special parser that is based on an ANTLR Javadoc grammar. It processes block comments that start with the Javadoc identifier and parses them to an Abstract Syntax Tree (AST).

The difference is that the Java grammar uses ANTLR v2, while the Javadoc grammar uses ANTLR v4. Because of that, these two grammars and their trees are not compatible. The Java AST consists of DetailAST objects, while the Javadoc AST consists of DetailNode objects.

The main Java grammar skips whitespace and newlines, so the Java Abstract Syntax Tree contains no whitespace/newline nodes. In a Javadoc comment every whitespace matters, and Javadoc Checks need all those whitespace and newline nodes to verify the format and content of the Javadoc comment. Because of that, the Javadoc grammar includes all whitespace and newlines in the parse tree (WS, NEWLINE). As you may notice, there is also a CHAR Javadoc token that represents a single character. It has little use on its own and serves only to build the TEXT node, which consists of CHAR and WS nodes; this is an implementation nuance. (In the future we will try to resolve this. See GitHub Issue #3170.)
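
To make the composition of a TEXT node more concrete, here is a minimal sketch that splits a piece of text into WS and CHAR tokens. It is not Checkstyle's lexer: the class TextTokenSketch is a hypothetical name, tokens are shown as plain strings, and the grouping of consecutive spaces or tabs into one WS token is an assumption made for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch, not Checkstyle's lexer. */
final class TextTokenSketch {

    private TextTokenSketch() {
    }

    /** Splits text into tokens written as "WS(...)" or "CHAR(...)". */
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            if (isBlank(c)) {
                // Assumption: a run of spaces or tabs forms a single WS token.
                int start = i;
                while (i < text.length() && isBlank(text.charAt(i))) {
                    i++;
                }
                tokens.add("WS(" + text.substring(start, i) + ")");
            } else {
                // Every other character becomes its own CHAR token.
                tokens.add("CHAR(" + c + ")");
                i++;
            }
        }
        return tokens;
    }

    private static boolean isBlank(char c) {
        return c == ' ' || c == '\t';
    }
}
```

For the text " My " the sketch yields four tokens: WS, CHAR for "M", CHAR for "y", and WS.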

Tool to print Javadoc tree structure

Checkstyle can print the Abstract Syntax Tree for both Java and Javadoc trees. Run the checkstyle jar file with the -J argument, providing a Java file.

For example, here is the MyClass.java file:

/**
 * My <b>class</b>.
 * @see AbstractClass
 */
public class MyClass {

}
      

Command:

java -jar checkstyle-X.XX-all.jar -J MyClass.java

Output:

CLASS_DEF -> CLASS_DEF [5:0]
|--MODIFIERS -> MODIFIERS [5:0]
|   |--JAVADOC -> \r\n * My <b>class</b>.\r\n * @see AbstractClass\r\n <EOF> [1:0]
|   |   |--NEWLINE -> \r\n [1:0]
|   |   |--LEADING_ASTERISK ->  * [2:0]
|   |   |--TEXT ->  My  [2:2]
|   |   |   |--WS ->   [2:2]
|   |   |   |--CHAR -> M [2:3]
|   |   |   |--CHAR -> y [2:4]
|   |   |   `--WS ->   [2:5]
|   |   |--HTML_ELEMENT -> <b>class</b> [2:6]
|   |   |   `--HTML_TAG -> <b>class</b> [2:6]
|   |   |       |--HTML_ELEMENT_OPEN -> <b> [2:6]
|   |   |       |   |--OPEN -> < [2:6]
|   |   |       |   |--HTML_TAG_NAME -> b [2:7]
|   |   |       |   `--CLOSE -> > [2:8]
|   |   |       |--TEXT -> class [2:9]
|   |   |       |   |--CHAR -> c [2:9]
|   |   |       |   |--CHAR -> l [2:10]
|   |   |       |   |--CHAR -> a [2:11]
|   |   |       |   |--CHAR -> s [2:12]
|   |   |       |   `--CHAR -> s [2:13]
|   |   |       `--HTML_ELEMENT_CLOSE -> </b> [2:14]
|   |   |           |--OPEN -> < [2:14]
|   |   |           |--SLASH -> / [2:15]
|   |   |           |--HTML_TAG_NAME -> b [2:16]
|   |   |           `--CLOSE -> > [2:17]
|   |   |--TEXT -> . [2:18]
|   |   |   `--CHAR -> . [2:18]
|   |   |--NEWLINE -> \r\n [2:19]
|   |   |--LEADING_ASTERISK ->  * [3:0]
|   |   |--WS ->   [3:2]
|   |   |--JAVADOC_TAG -> @see AbstractClass\r\n  [3:3]
|   |   |   |--SEE_LITERAL -> @see [3:3]
|   |   |   |--WS ->   [3:7]
|   |   |   |--REFERENCE -> AbstractClass [3:8]
|   |   |   |   `--CLASS -> AbstractClass [3:8]
|   |   |   |--NEWLINE -> \r\n [3:21]
|   |   |   `--WS ->   [4:0]
|   |   `--EOF -> <EOF> [4:1]
|   `--LITERAL_PUBLIC -> public [5:0]
|--LITERAL_CLASS -> class [5:7]
|--IDENT -> MyClass [5:13]
`--OBJBLOCK -> OBJBLOCK [5:21]
    |--LCURLY -> { [5:21]
    `--RCURLY -> } [7:0]
      

As you can see, a very small Java file is transformed into a huge Abstract Syntax Tree, because this is the most detailed tree, including all components of the Java file: classes, methods, comments, etc.
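
To help with reading the tree notation in the output above, here is a minimal sketch of how such a listing could be rendered. It is not Checkstyle's printer: TreePrinterSketch and its Node class are hypothetical names, and the [line:column] positions are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative tree printer; Node is a hypothetical class, not Checkstyle's DetailNode. */
final class TreePrinterSketch {

    private TreePrinterSketch() {
    }

    static final class Node {
        final String type;
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Renders the root on its own line, followed by all descendants. */
    static String print(Node root) {
        StringBuilder out = new StringBuilder();
        out.append(root.type).append(" -> ").append(root.text).append('\n');
        printChildren(root, "", out);
        return out.toString();
    }

    private static void printChildren(Node parent, String indent, StringBuilder out) {
        for (int i = 0; i < parent.children.size(); i++) {
            Node child = parent.children.get(i);
            boolean last = i == parent.children.size() - 1;
            // "`--" marks the last child of a node, "|--" marks any other child.
            out.append(indent).append(last ? "`--" : "|--")
                    .append(child.type).append(" -> ").append(child.text).append('\n');
            // Below a non-last child the vertical bar continues; below the last child it stops.
            printChildren(child, indent + (last ? "    " : "|   "), out);
        }
    }
}
```

In other words, nodes that share the same indentation are siblings, and a node's children are the lines indented one level deeper directly below it.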

In most cases, while developing a Javadoc Check you need only the parse tree of the Javadoc comment itself. To get it, copy the Javadoc comment to a separate file and remove /** at the beginning and */ at the end. After that, run checkstyle with the -j argument.

MyJavadocComment.javadoc file:

 * My <b>class</b>.
 * @see AbstractClass
      

Command:

java -jar checkstyle-X.XX-all.jar -j MyJavadocComment.javadoc

Output:

JAVADOC ->  * My <b>class</b>.\r\n * @see AbstractClass<EOF> [0:0]
|--LEADING_ASTERISK ->  * [0:0]
|--TEXT ->  My  [0:2]
|   |--WS ->   [0:2]
|   |--CHAR -> M [0:3]
|   |--CHAR -> y [0:4]
|   `--WS ->   [0:5]
|--HTML_ELEMENT -> <b>class</b> [0:6]
|   `--HTML_TAG -> <b>class</b> [0:6]
|       |--HTML_ELEMENT_OPEN -> <b> [0:6]
|       |   |--OPEN -> < [0:6]
|       |   |--HTML_TAG_NAME -> b [0:7]
|       |   `--CLOSE -> > [0:8]
|       |--TEXT -> class [0:9]
|       |   |--CHAR -> c [0:9]
|       |   |--CHAR -> l [0:10]
|       |   |--CHAR -> a [0:11]
|       |   |--CHAR -> s [0:12]
|       |   `--CHAR -> s [0:13]
|       `--HTML_ELEMENT_CLOSE -> </b> [0:14]
|           |--OPEN -> < [0:14]
|           |--SLASH -> / [0:15]
|           |--HTML_TAG_NAME -> b [0:16]
|           `--CLOSE -> > [0:17]
|--TEXT -> . [0:18]
|   `--CHAR -> . [0:18]
|--NEWLINE -> \r\n [0:19]
|--LEADING_ASTERISK ->  * [1:0]
|--WS ->   [1:2]
|--JAVADOC_TAG -> @see AbstractClass [1:3]
|   |--SEE_LITERAL -> @see [1:3]
|   |--WS ->   [1:7]
|   `--REFERENCE -> AbstractClass [1:8]
|       `--CLASS -> AbstractClass [1:8]
`--EOF -> <EOF> [1:21]
      

Access Java AST from Javadoc Check

As you already know, the Javadoc parse tree is the result of parsing a block comment. There is a method to get the original block comment from a Javadoc Check. You may need this block comment to check its position, or anything else, in the Java DetailAST tree.

For example, to write a Javadoc Check that verifies @param tags in the Javadoc comment of a method definition, you also need all of the method's parameter names. To get the method definition AST, you should access the Java DetailAST tree from the Javadoc Check. For this purpose use the getBlockCommentAst() method, which returns a DetailAST node.

Example:

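import com.puppycrawl.tools.checkstyle.api.DetailAST;
import com.puppycrawl.tools.checkstyle.api.DetailNode;
import com.puppycrawl.tools.checkstyle.api.JavadocTokenTypes;
import com.puppycrawl.tools.checkstyle.checks.javadoc.AbstractJavadocCheck;
import com.puppycrawl.tools.checkstyle.utils.BlockCommentPosition;

class MyCheck extends AbstractJavadocCheck {

    @Override
    public int[] getDefaultJavadocTokens() {
        // Visit only the parameter names of @param tags.
        return new int[]{JavadocTokenTypes.PARAMETER_NAME};
    }

    @Override
    public void visitJavadocToken(DetailNode paramNameNode) {
        String javadocParamName = paramNameNode.getText();
        // The Java AST node of the block comment that contains this Javadoc.
        DetailAST blockCommentAst = getBlockCommentAst();

        if (BlockCommentPosition.isOnMethod(blockCommentAst)) {

            DetailAST methodDef = blockCommentAst.getParent();
            // findMethodParameter is a helper of this Check that is not shown here.
            DetailAST methodParam = findMethodParameter(methodDef);
            String methodParamName = methodParam.getText();

            if (!javadocParamName.equals(methodParamName)) {
                // "params.dont.match" is a message key of this Check.
                log(methodParam, "params.dont.match");
            }

        }
    }
}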
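
The comparison at the heart of the example above can also be tried without the Checkstyle API. The following sketch works on plain strings and uses a regular expression instead of the Javadoc parse tree, so it only illustrates the idea. ParamTagSketch and its methods are hypothetical names, and type parameters such as @param <T> are deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch using plain strings; it does not use the Checkstyle API. */
final class ParamTagSketch {

    // Matches "@param name" and captures the name; "@param <T>" does not match.
    private static final Pattern PARAM_TAG = Pattern.compile("@param\\s+(\\w+)");

    private ParamTagSketch() {
    }

    /** Collects the parameter names documented by @param tags. */
    static List<String> javadocParamNames(String javadocContent) {
        List<String> names = new ArrayList<>();
        Matcher matcher = PARAM_TAG.matcher(javadocContent);
        while (matcher.find()) {
            names.add(matcher.group(1));
        }
        return names;
    }

    /** Returns the documented names that do not match any method parameter. */
    static List<String> unknownParams(String javadocContent, List<String> methodParamNames) {
        List<String> unknown = new ArrayList<>();
        for (String name : javadocParamNames(javadocContent)) {
            if (!methodParamNames.contains(name)) {
                unknown.add(name);
            }
        }
        return unknown;
    }
}
```

A real Javadoc Check would take the documented names from PARAMETER_NAME nodes and the method parameter names from the Java DetailAST tree, as in the example above.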
      

HTML Code In Javadoc Comments

Checkstyle supports all HTML 4 elements in Javadoc comments.

HTML 4 was picked just to have a list of tags with an optional end tag and a list of tags whose end tag is forbidden (also known as empty HTML tags, for example the BR tag).

HTML tags with optional end tag: <P>, <LI>, <TR>, <TD>, <TH>, <BODY>, <COLGROUP>, <DD>, <DT>, <HEAD>, <HTML>, <OPTION>, <TBODY>, <THEAD>, <TFOOT>.

Empty HTML tags: <AREA>, <BASE>, <BASEFONT>, <BR>, <COL>, <FRAME>, <HR>, <IMG>, <INPUT>, <ISINDEX>, <LINK>, <META>, <PARAM>.

If Checkstyle meets an unknown tag (for example, an HTML5 tag), it does not fail and parses this tag as the HTML_TAG Javadoc token type. Just follow XHTML rules so that the Checkstyle Javadoc parser builds a nested AST, even when tags are unknown.

<audio><source src="horse.ogg" type="audio/ogg"/></audio>
        
JAVADOC -> <audio><source src="horse.ogg" type="audio/ogg"/></audio><EOF> [0:0]
|--HTML_ELEMENT -> <audio><source src="horse.ogg" type="audio/ogg"/></audio> [0:0]
|   `--HTML_TAG -> <audio><source src="horse.ogg" type="audio/ogg"/></audio> [0:0]
|       |--HTML_ELEMENT_OPEN -> <audio> [0:0]
|       |   |--OPEN -> < [0:0]
|       |   |--HTML_TAG_NAME -> audio [0:1]
|       |   `--CLOSE -> > [0:6]
|       |--HTML_ELEMENT -> <source src="horse.ogg" type="audio/ogg"/> [0:7]
|       |   `--SINGLETON_ELEMENT -> <source src="horse.ogg" type="audio/ogg"/> [0:7]
|       |       `--SINGLETON_TAG -> <source src="horse.ogg" type="audio/ogg"/> [0:7]
|       |           |--OPEN -> < [0:7]
|       |           |--HTML_TAG_NAME -> source [0:8]
|       |           |--WS ->   [0:14]
|       |           |--ATTRIBUTE -> src="horse.ogg" [0:15]
|       |           |   |--HTML_TAG_NAME -> src [0:15]
|       |           |   |--EQUALS -> = [0:18]
|       |           |   `--ATTR_VALUE -> "horse.ogg" [0:19]
|       |           |--WS ->   [0:31]
|       |           |--ATTRIBUTE -> type="audio/ogg" [0:32]
|       |           |   |--HTML_TAG_NAME -> type [0:32]
|       |           |   |--EQUALS -> = [0:36]
|       |           |   `--ATTR_VALUE -> "audio/ogg" [0:37]
|       |           `--SLASH_CLOSE -> /> [0:49]
|       `--HTML_ELEMENT_CLOSE -> </audio> [0:51]
|           |--OPEN -> < [0:51]
|           |--SLASH -> / [0:52]
|           |--HTML_TAG_NAME -> audio [0:53]
|           `--CLOSE -> > [0:58]
`--EOF -> <EOF> [0:59]
        

Here is what you get if an unknown tag does not have a matching end tag (for example, the HTML5 tag <audio>).

Input:

<audio>test

Output:

[ERROR:0] Javadoc comment at column 1 has parse error. Missed HTML close tag 'audio'. Sometimes it means that close tag missed for one of previous tags.

As you can see, the Javadoc parser prints an error and does not build an AST if an unknown HTML tag does not have a matching end tag.
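
The closing rules described in this section can be approximated with a short stack-based sketch. This is not how Checkstyle's ANTLR grammar works internally. HtmlNestingSketch and findUnclosedTag are hypothetical names, the regular expression is a simplification that ignores Javadoc inline tags and HTML comments, and the two tag sets are copied from the lists above.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch of the tag closing rules; not Checkstyle's parser. */
final class HtmlNestingSketch {

    private static final Set<String> OPTIONAL_END = new HashSet<>(Arrays.asList(
            "p", "li", "tr", "td", "th", "body", "colgroup", "dd", "dt",
            "head", "html", "option", "tbody", "thead", "tfoot"));

    private static final Set<String> EMPTY = new HashSet<>(Arrays.asList(
            "area", "base", "basefont", "br", "col", "frame", "hr", "img",
            "input", "isindex", "link", "meta", "param"));

    // Group 1: "/" for a closing tag; group 2: tag name; group 3: "/" for a self-closed tag.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z][A-Za-z0-9]*)[^<>]*?(/?)>");

    private HtmlNestingSketch() {
    }

    /** Returns the name of a tag that requires a closing tag but has none, if any. */
    static Optional<String> findUnclosedTag(String html) {
        Deque<String> open = new ArrayDeque<>();
        Matcher matcher = TAG.matcher(html);
        while (matcher.find()) {
            String name = matcher.group(2).toLowerCase(Locale.ROOT);
            boolean closing = !matcher.group(1).isEmpty();
            boolean selfClosed = !matcher.group(3).isEmpty();
            if (selfClosed || EMPTY.contains(name) || OPTIONAL_END.contains(name)) {
                // These tags never cause a "missing close tag" error.
                continue;
            }
            if (!closing) {
                open.push(name);
            } else if (!open.isEmpty()) {
                if (open.peek().equals(name)) {
                    open.pop();
                } else {
                    // A different tag is closed first, so the innermost open tag is unclosed.
                    return Optional.of(open.peek());
                }
            }
        }
        // Report the outermost tag that was never closed, if there is one.
        return open.isEmpty() ? Optional.empty() : Optional.of(open.peekLast());
    }
}
```

For the input <audio>test the sketch reports audio, which corresponds to the parse error shown above. For two unclosed <p> tags it reports nothing, because <p> has an optional end tag.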

More examples:

1) Unclosed paragraph HTML tags. In the first tree below, the content of each paragraph is not nested inside its tag. That is because the <p> tags are not closed by a matching </p>, and Checkstyle requires XHTML to parse Javadoc comments predictably.

<p> First
<p> Second

2) The correct version, with both opening and closing HTML tags. In the second tree below, the content of each paragraph is nested inside a PARAGRAPH node.

<p> First </p>
<p> Second </p>
          
JAVADOC -> <p> First\r\n<p> Second<EOF> [0:0]
|--HTML_ELEMENT -> <p> [0:0]
|   `--P_TAG_OPEN -> <p> [0:0]
|       |--OPEN -> < [0:0]
|       |--P_HTML_TAG_NAME -> p [0:1]
|       `--CLOSE -> > [0:2]
|--TEXT ->  First [0:3]
|   |--WS ->   [0:3]
|   |--CHAR -> F [0:4]
|   |--CHAR -> i [0:5]
|   |--CHAR -> r [0:6]
|   |--CHAR -> s [0:7]
|   `--CHAR -> t [0:8]
|--NEWLINE -> \r\n [0:9]
|--HTML_ELEMENT -> <p> [1:0]
|   `--P_TAG_OPEN -> <p> [1:0]
|       |--OPEN -> < [1:0]
|       |--P_HTML_TAG_NAME -> p [1:1]
|       `--CLOSE -> > [1:2]
|--TEXT ->  Second [1:3]
|   |--WS ->   [1:3]
|   |--CHAR -> S [1:4]
|   |--CHAR -> e [1:5]
|   |--CHAR -> c [1:6]
|   |--CHAR -> o [1:7]
|   |--CHAR -> n [1:8]
|   `--CHAR -> d [1:9]
`--EOF -> <EOF> [1:10]
          
JAVADOC -> <p> First </p>\r\n<p> Second </p><EOF> [0:0]
|--HTML_ELEMENT -> <p> First </p> [0:0]
|   `--PARAGRAPH -> <p> First </p> [0:0]
|       |--P_TAG_OPEN -> <p> [0:0]
|       |   |--OPEN -> < [0:0]
|       |   |--P_HTML_TAG_NAME -> p [0:1]
|       |   `--CLOSE -> > [0:2]
|       |--TEXT ->  First  [0:3]
|       |   |--WS ->   [0:3]
|       |   |--CHAR -> F [0:4]
|       |   |--CHAR -> i [0:5]
|       |   |--CHAR -> r [0:6]
|       |   |--CHAR -> s [0:7]
|       |   |--CHAR -> t [0:8]
|       |   `--WS ->   [0:9]
|       `--P_TAG_CLOSE -> </p> [0:10]
|           |--OPEN -> < [0:10]
|           |--SLASH -> / [0:11]
|           |--P_HTML_TAG_NAME -> p [0:12]
|           `--CLOSE -> > [0:13]
|--NEWLINE -> \r\n [0:14]
|--HTML_ELEMENT -> <p> Second </p> [1:0]
|   `--PARAGRAPH -> <p> Second </p> [1:0]
|       |--P_TAG_OPEN -> <p> [1:0]
|       |   |--OPEN -> < [1:0]
|       |   |--P_HTML_TAG_NAME -> p [1:1]
|       |   `--CLOSE -> > [1:2]
|       |--TEXT ->  Second  [1:3]
|       |   |--WS ->   [1:3]
|       |   |--CHAR -> S [1:4]
|       |   |--CHAR -> e [1:5]
|       |   |--CHAR -> c [1:6]
|       |   |--CHAR -> o [1:7]
|       |   |--CHAR -> n [1:8]
|       |   |--CHAR -> d [1:9]
|       |   `--WS ->   [1:10]
|       `--P_TAG_CLOSE -> </p> [1:11]
|           |--OPEN -> < [1:11]
|           |--SLASH -> / [1:12]
|           |--P_HTML_TAG_NAME -> p [1:13]
|           `--CLOSE -> > [1:14]
`--EOF -> <EOF> [1:15]
          

Checkstyle SDK GUI

Not implemented yet. See GitHub Issue #408.

Customize token types in Javadoc Checks

Not implemented yet. See GitHub Issue #2427.

Integrating new Javadoc Check

Javadoc Checks, like regular Checks, extend the AbstractCheck class. So integrating a new Javadoc Check is similar to integrating other Checks.

Declare check's external resource locations

See Declare check's external resource locations.

Examples of Javadoc Checks

The best source of knowledge about how to write Javadoc Checks is the existing Checks.