Class JavadocMetadataScraper
- java.lang.Object
- com.puppycrawl.tools.checkstyle.AbstractAutomaticBean
- com.puppycrawl.tools.checkstyle.api.AbstractViolationReporter
- com.puppycrawl.tools.checkstyle.api.AbstractCheck
- com.puppycrawl.tools.checkstyle.checks.javadoc.AbstractJavadocCheck
- com.puppycrawl.tools.checkstyle.meta.JavadocMetadataScraper
-
All Implemented Interfaces:
Configurable, Contextualizable
public class JavadocMetadataScraper extends AbstractJavadocCheck
Class for scraping module metadata from the corresponding class' class-level javadoc.
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class com.puppycrawl.tools.checkstyle.AbstractAutomaticBean
AbstractAutomaticBean.OutputStreamOptions
-
-
Field Summary
Fields (modifier and type, field, description)
private static Pattern DEFAULT_VALUE_TAG
Regular expression for property default value location in class-level javadocs.
private static Pattern DESC_CLEAN
Regular expression for removal of @code{-} present at the beginning of texts.
private static Pattern EXAMPLES_TAG
Regular expression for check example location in class-level javadocs.
private int exampleSectionStartIdx
Child number of the example section node, where parent is the class level javadoc root node.
private static Pattern FILE_SEPARATOR_PATTERN
Regular expression for file separator corresponding to the host OS.
private static String JAVA_FILE_EXTENSION
Java file extension.
private static Map<String,ModuleDetails> MODULE_DETAILS_STORE
Module details store used for testing.
private ModuleDetails moduleDetails
ModuleDetails instance for each module AST traversal.
static String MSG_DESC_MISSING
A key pointing to the warning message text in the "messages.properties" file.
private static Pattern PARENT_TAG
Regular expression for module parent location in class-level javadocs.
private int parentSectionStartIdx
Child number of the parent section node, where parent is the class level javadoc root node.
private static String PROP_DEFAULT_VALUE_MISSING
Format for exception message for missing default value for check property.
private static String PROP_TYPE_MISSING
Format for exception message for missing type for check property.
private static Set<String> PROPERTIES_TO_NOT_WRITE
Set of faulty property default values that should not be written to the XML metadata files.
private static Pattern PROPERTY_TAG
Regular expression for property location in class-level javadocs.
private int propertySectionStartIdx
Child number of the property section node, where parent is the class level javadoc root node.
private static Pattern QUOTE_PATTERN
Regular expression for quotes.
private DetailNode rootNode
DetailNode pointing to the root node of the class level javadoc of the class.
private boolean scrapingViolationMessageList
Flag indicating whether the violation message section is currently being scraped.
private static Pattern TOKEN_TEXT_PATTERN
Regular expression for detecting ANTLR tokens (e.g. CLASS_DEF).
private boolean toScan
Flag indicating whether the current javadoc should be scanned and scraped.
private static Pattern TYPE_TAG
Regular expression for property type location in class-level javadocs.
private static Pattern VALIDATION_TYPE_TAG
Regular expression for property validation type location in class-level javadocs.
private static Pattern VIOLATION_MESSAGES_TAG
Regular expression for module violation messages location in class-level javadocs.
private boolean writeXmlOutput
Control whether to write XML output or not.
-
Fields inherited from class com.puppycrawl.tools.checkstyle.checks.javadoc.AbstractJavadocCheck
MSG_JAVADOC_MISSED_HTML_CLOSE, MSG_JAVADOC_PARSE_RULE_ERROR, MSG_JAVADOC_WRONG_SINGLETON_TAG, MSG_KEY_UNCLOSED_HTML_TAG
-
-
Constructor Summary
Constructors
JavadocMetadataScraper()
-
Method Summary
Methods (modifier and type, method, description)
void beginJavadocTree(DetailNode rootAst)
Called before starting to process a tree.
private static String cleanDefaultTokensText(String initialText)
Cleans up the default token text by removing hyperlinks and keeping only the token type text.
private static String constructSubTreeText(DetailNode node, int childLeftLimit, int childRightLimit)
Performs a DFS of the subtree with a node as the root and constructs the text of that tree, ignoring JavadocToken texts.
private static ModulePropertyDetails createProperties(DetailNode nodeLi)
Creates the modulePropertyDetails content.
void finishJavadocTree(DetailNode rootAst)
Called after processing of a tree has finished.
int[] getDefaultJavadocTokens()
Returns the default javadoc token types a check is interested in.
private String getDescriptionText()
Creates the description text, starting at index 0 and ending at the first valid non-zero index among propertySectionStartIdx, exampleSectionStartIdx and parentSectionStartIdx, checked in that order.
private static Optional<DetailNode> getFirstChildOfMatchingText(DetailNode node, Pattern pattern)
Gets the first child of the parent node matching the provided pattern.
private static Optional<DetailNode> getFirstChildOfType(DetailNode node, int tokenType, int offset)
Returns the first child node that matches the provided TokenType and whose child index comes after the offset value.
static Map<String,ModuleDetails> getModuleDetailsStore()
Getter method for moduleDetailsStore.
private String getModuleSimpleName()
Extracts the simple file name from the whole file path name.
private ModuleType getModuleType()
Gets the module type (check/filter/filefilter) based on file name.
private static String getPackageName(String filePath)
Retrieves the package name of the module from the absolute file path.
private static DetailAST getParent(DetailAST commentBlock)
Returns parent node, removing modifier/annotation nodes.
private static int getParentIndexOf(DetailNode node)
Traverses parents until reaching a child of the root node (JavadocTokenTypes.JAVADOC) and returns that child's index.
private static String getParentText(DetailNode nodeParagraph)
Gets the module parent text from the paragraph javadoc node.
private static String getPropertyDefaultText(DetailNode nodeLi, DetailNode defaultValueNode)
Creates the property default text, which is either a normal property value or a list of tokens.
int[] getRequiredJavadocTokens()
The javadoc tokens that this check must be registered for.
private static String getTagTextFromProperty(DetailNode nodeLi, DetailNode propertyMeta)
Gets the tag text from property data.
private static String getText(DetailNode parentNode)
Gets the joined text from all text children nodes.
private static String getTextFromTag(DetailNode nodeTag)
Gets the text from JavadocTokenTypes.JAVADOC_INLINE_TAG.
private static String getViolationMessages(DetailNode nodeLi)
Gets the violation message text for a specific key from the list item.
private static boolean isChildNodeTextMatches(DetailNode ast, Pattern pattern)
Checks whether the first child JavadocTokenType.TEXT node matches the given pattern.
private static boolean isExamplesText(DetailNode ast)
Checks whether the paragraph node corresponds to the example section.
private static boolean isParentText(DetailNode nodeParagraph)
Checks whether the JavadocTokenType.PARAGRAPH node is referring to the parent javadoc segment.
private static boolean isPropertyList(DetailNode nodeLi)
Checks whether the list item node is part of a property list.
private boolean isTopLevelClassJavadoc()
Checks whether the current javadoc block comment AST corresponds to the top-level class, as only top-level class javadoc is scraped.
private static boolean isViolationMessagesText(DetailNode nodeParagraph)
Checks whether the JavadocTokenType.PARAGRAPH node is referring to the violation message keys javadoc segment.
static void resetModuleDetailsStore()
Resets the module detail store, clearing any previous information.
private void scrapeContent(DetailNode ast)
Method containing the core logic of scraping.
void setWriteXmlOutput(boolean writeXmlOutput)
Setter to control whether to write XML output or not.
void visitJavadocToken(DetailNode ast)
Called to process a Javadoc token.
-
Methods inherited from class com.puppycrawl.tools.checkstyle.checks.javadoc.AbstractJavadocCheck
acceptJavadocWithNonTightHtml, beginTree, destroy, finishTree, getAcceptableJavadocTokens, getAcceptableTokens, getBlockCommentAst, getDefaultTokens, getRequiredTokens, init, isCommentNodesRequired, leaveJavadocToken, setJavadocTokens, setViolateExecutionOnNonTightHtml, visitToken
-
Methods inherited from class com.puppycrawl.tools.checkstyle.api.AbstractCheck
clearViolations, getFileContents, getFilePath, getLine, getLineCodePoints, getLines, getTabWidth, getTokenNames, getViolations, leaveToken, log, log, log, setFileContents, setTabWidth, setTokens
-
Methods inherited from class com.puppycrawl.tools.checkstyle.api.AbstractViolationReporter
finishLocalSetup, getCustomMessages, getId, getMessageBundle, getSeverity, getSeverityLevel, setId, setSeverity
-
Methods inherited from class com.puppycrawl.tools.checkstyle.AbstractAutomaticBean
configure, contextualize, getConfiguration, setupChild
-
-
-
-
Field Detail
-
MSG_DESC_MISSING
public static final String MSG_DESC_MISSING
A key pointing to the warning message text in the "messages.properties" file.
- See Also:
- Constant Field Values
-
MODULE_DETAILS_STORE
private static final Map<String,ModuleDetails> MODULE_DETAILS_STORE
Module details store used for testing.
-
PROPERTY_TAG
private static final Pattern PROPERTY_TAG
Regular expression for property location in class-level javadocs.
-
TYPE_TAG
private static final Pattern TYPE_TAG
Regular expression for property type location in class-level javadocs.
-
VALIDATION_TYPE_TAG
private static final Pattern VALIDATION_TYPE_TAG
Regular expression for property validation type location in class-level javadocs.
-
DEFAULT_VALUE_TAG
private static final Pattern DEFAULT_VALUE_TAG
Regular expression for property default value location in class-level javadocs.
-
EXAMPLES_TAG
private static final Pattern EXAMPLES_TAG
Regular expression for check example location in class-level javadocs.
-
PARENT_TAG
private static final Pattern PARENT_TAG
Regular expression for module parent location in class-level javadocs.
-
VIOLATION_MESSAGES_TAG
private static final Pattern VIOLATION_MESSAGES_TAG
Regular expression for module violation messages location in class-level javadocs.
-
TOKEN_TEXT_PATTERN
private static final Pattern TOKEN_TEXT_PATTERN
Regular expression for detecting ANTLR tokens (e.g. CLASS_DEF).
-
DESC_CLEAN
private static final Pattern DESC_CLEAN
Regular expression for removal of @code{-} present at the beginning of texts.
-
FILE_SEPARATOR_PATTERN
private static final Pattern FILE_SEPARATOR_PATTERN
Regular expression for file separator corresponding to the host OS.
-
QUOTE_PATTERN
private static final Pattern QUOTE_PATTERN
Regular expression for quotes.
-
JAVA_FILE_EXTENSION
private static final String JAVA_FILE_EXTENSION
Java file extension.
- See Also:
- Constant Field Values
-
PROPERTIES_TO_NOT_WRITE
private static final Set<String> PROPERTIES_TO_NOT_WRITE
Set of faulty property default values that should not be written to the XML metadata files.
-
PROP_TYPE_MISSING
private static final String PROP_TYPE_MISSING
Format for exception message for missing type for check property.
- See Also:
- Constant Field Values
-
PROP_DEFAULT_VALUE_MISSING
private static final String PROP_DEFAULT_VALUE_MISSING
Format for exception message for missing default value for check property.
- See Also:
- Constant Field Values
-
moduleDetails
private ModuleDetails moduleDetails
ModuleDetails instance for each module AST traversal.
-
scrapingViolationMessageList
private boolean scrapingViolationMessageList
Flag indicating whether the violation message section is currently being scraped.
-
toScan
private boolean toScan
Flag indicating whether the current javadoc should be scanned and scraped. Since only the class-level javadoc is needed, it becomes true at the javadoc root and false after encountering JavadocTokenTypes.SINCE_LITERAL.
-
rootNode
private DetailNode rootNode
DetailNode pointing to the root node of the class level javadoc of the class.
-
propertySectionStartIdx
private int propertySectionStartIdx
Child number of the property section node, where parent is the class level javadoc root node.
-
exampleSectionStartIdx
private int exampleSectionStartIdx
Child number of the example section node, where parent is the class level javadoc root node.
-
parentSectionStartIdx
private int parentSectionStartIdx
Child number of the parent section node, where parent is the class level javadoc root node.
-
writeXmlOutput
private boolean writeXmlOutput
Control whether to write XML output or not.
-
-
Constructor Detail
-
JavadocMetadataScraper
public JavadocMetadataScraper()
-
-
Method Detail
-
setWriteXmlOutput
public final void setWriteXmlOutput(boolean writeXmlOutput)
Setter to control whether to write XML output or not.
- Parameters:
writeXmlOutput - whether to write XML output or not.
-
getDefaultJavadocTokens
public int[] getDefaultJavadocTokens()
Description copied from class: AbstractJavadocCheck
Returns the default javadoc token types a check is interested in.
- Specified by:
getDefaultJavadocTokens in class AbstractJavadocCheck
- Returns:
- the default javadoc token types
- See Also:
JavadocTokenTypes
-
getRequiredJavadocTokens
public int[] getRequiredJavadocTokens()
Description copied from class: AbstractJavadocCheck
The javadoc tokens that this check must be registered for.
- Overrides:
getRequiredJavadocTokens in class AbstractJavadocCheck
- Returns:
- the javadoc token set this must be registered for.
- See Also:
JavadocTokenTypes
-
beginJavadocTree
public void beginJavadocTree(DetailNode rootAst)
Description copied from class: AbstractJavadocCheck
Called before starting to process a tree.
- Overrides:
beginJavadocTree in class AbstractJavadocCheck
- Parameters:
rootAst - the root of the tree
-
visitJavadocToken
public void visitJavadocToken(DetailNode ast)
Description copied from class: AbstractJavadocCheck
Called to process a Javadoc token.
- Specified by:
visitJavadocToken in class AbstractJavadocCheck
- Parameters:
ast - the token to process
-
finishJavadocTree
public void finishJavadocTree(DetailNode rootAst)
Description copied from class: AbstractJavadocCheck
Called after processing of a tree has finished.
- Overrides:
finishJavadocTree in class AbstractJavadocCheck
- Parameters:
rootAst - the root of the tree
-
scrapeContent
private void scrapeContent(DetailNode ast)
Method containing the core logic of scraping. It keeps track of which phase of scraping is in progress and calls the other subroutines accordingly.
- Parameters:
ast - javadoc ast
-
createProperties
private static ModulePropertyDetails createProperties(DetailNode nodeLi)
Create the modulePropertyDetails content.
- Parameters:
nodeLi - list item javadoc node
- Returns:
- modulePropertyDetail object for the corresponding property
-
getTagTextFromProperty
private static String getTagTextFromProperty(DetailNode nodeLi, DetailNode propertyMeta)
Get tag text from property data.
- Parameters:
nodeLi - javadoc li item node
propertyMeta - property javadoc node
- Returns:
- property metadata text
-
cleanDefaultTokensText
private static String cleanDefaultTokensText(String initialText)
Clean up the default token text by removing hyperlinks and keeping only the token type text.
- Parameters:
initialText - unclean text
- Returns:
- clean text
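
The cleanup described above can be approximated with a regular expression that keeps only token-like names. This is a minimal sketch, not the class's actual code: the pattern `[A-Z][A-Z_]+`, the class name `DefaultTokensTextSketch`, the comma separator and the de-duplication through a `LinkedHashSet` are all assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class DefaultTokensTextSketch {

    /** Assumed token shape: an upper-case letter followed by upper-case letters or underscores. */
    private static final Pattern TOKEN_NAME = Pattern.compile("[A-Z][A-Z_]+");

    /** Keeps every distinct token name found in the text, in order of first appearance. */
    static String cleanDefaultTokensText(String initialText) {
        Set<String> tokens = new LinkedHashSet<>();
        Matcher matcher = TOKEN_NAME.matcher(initialText);
        while (matcher.find()) {
            // The same name may occur in the link target and the link text; the set keeps one copy.
            tokens.add(matcher.group());
        }
        return String.join(",", tokens);
    }

    public static void main(String[] args) {
        String raw = "<a href=\"TokenTypes.html#CLASS_DEF\">CLASS_DEF</a>, "
            + "<a href=\"TokenTypes.html#METHOD_DEF\">METHOD_DEF</a>";
        System.out.println(cleanDefaultTokensText(raw)); // prints "CLASS_DEF,METHOD_DEF"
    }
}
```

An insertion-ordered set is used so that the result follows the order in which tokens are listed in the javadoc.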
-
constructSubTreeText
private static String constructSubTreeText(DetailNode node, int childLeftLimit, int childRightLimit)
Performs a DFS of the subtree with a node as the root and constructs the text of that tree, ignoring JavadocToken texts.
- Parameters:
node - root node of subtree
childLeftLimit - the left index of root children from where to scan
childRightLimit - the right index of root children till where to scan
- Returns:
- constructed text of subtree
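
A depth-first text construction restricted to a range of the root's children might look like the following sketch. The `Node` type here is a hypothetical stand-in for `DetailNode`, and treating the limits as inclusive, skipping non-text leaves and trimming the result are assumptions rather than documented behavior.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SubTreeTextSketch {

    /** Hypothetical stand-in for a javadoc tree node; not Checkstyle's DetailNode. */
    static final class Node {
        final String text;
        final boolean isText;
        final List<Node> children;

        private Node(String text, boolean isText, List<Node> children) {
            this.text = text;
            this.isText = isText;
            this.children = children;
        }

        static Node leaf(String text) {
            return new Node(text, true, Collections.<Node>emptyList());
        }

        static Node markup(String text) {
            return new Node(text, false, Collections.<Node>emptyList());
        }

        static Node parent(Node... children) {
            return new Node("", false, Arrays.asList(children));
        }
    }

    /** Walks root children left..right (inclusive) depth-first and joins the text leaves. */
    static String constructSubTreeText(Node root, int left, int right) {
        StringBuilder result = new StringBuilder();
        for (int i = Math.max(left, 0); i <= right && i < root.children.size(); i++) {
            append(root.children.get(i), result);
        }
        return result.toString().trim();
    }

    private static void append(Node node, StringBuilder result) {
        if (node.children.isEmpty()) {
            // Only text leaves contribute; markup leaves are skipped.
            if (node.isText) {
                result.append(node.text);
            }
        } else {
            for (Node child : node.children) {
                append(child, result);
            }
        }
    }

    public static void main(String[] args) {
        Node root = Node.parent(
            Node.leaf("Checks "),
            Node.parent(Node.markup("<b>"), Node.leaf("names"), Node.markup("</b>")),
            Node.leaf(" of classes."),
            Node.leaf(" Trailing section."));
        System.out.println(constructSubTreeText(root, 0, 2)); // prints "Checks names of classes."
    }
}
```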
-
getDescriptionText
private String getDescriptionText()
Creates the description text, starting at index 0 and ending at the first valid non-zero index among propertySectionStartIdx, exampleSectionStartIdx and parentSectionStartIdx, checked in that order.
- Returns:
- description text
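
The choice of the end index can be illustrated with a small sketch. It follows the wording above and assumes that 0 means "section not found"; the fallback to the total child count when no section is present, and the names `DescriptionEndSketch` and `descriptionEndIndex`, are hypothetical.

```java
final class DescriptionEndSketch {

    /**
     * Returns the first positive index among the property, example and parent
     * section start indexes, checked in that order. Falls back to childCount
     * (an assumed default) when none of the sections was found.
     */
    static int descriptionEndIndex(int propertyIdx, int exampleIdx, int parentIdx, int childCount) {
        for (int candidate : new int[] {propertyIdx, exampleIdx, parentIdx}) {
            if (candidate > 0) {
                return candidate;
            }
        }
        return childCount;
    }

    public static void main(String[] args) {
        // No property section, examples start at child 7, parent section at child 12.
        System.out.println(descriptionEndIndex(0, 7, 12, 20)); // prints 7
    }
}
```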
-
getPropertyDefaultText
private static String getPropertyDefaultText(DetailNode nodeLi, DetailNode defaultValueNode)
Create property default text, which is either a normal property value or a list of tokens.
- Parameters:
nodeLi - list item javadoc node
defaultValueNode - default value node
- Returns:
- default property text
-
getViolationMessages
private static String getViolationMessages(DetailNode nodeLi)
Get the violation message text for a specific key from the list item.
- Parameters:
nodeLi - list item javadoc node
- Returns:
- violation message key text
-
getTextFromTag
private static String getTextFromTag(DetailNode nodeTag)
Get text from JavadocTokenTypes.JAVADOC_INLINE_TAG.
- Parameters:
nodeTag - target javadoc tag
- Returns:
- text contained by the tag
-
getFirstChildOfType
private static Optional<DetailNode> getFirstChildOfType(DetailNode node, int tokenType, int offset)
Returns the first child node that matches the provided TokenType and whose child index comes after the offset value.
- Parameters:
node - parent node
tokenType - token type to match
offset - children array index offset
- Returns:
- the first child satisfying the conditions
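
A lookup of this kind can be sketched as a bounded linear search over the children. The `Node` class below is hypothetical, and treating the offset as inclusive (index >= offset) is an assumption, since the description does not say whether the child at the offset itself qualifies.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class FirstChildOfTypeSketch {

    /** Hypothetical node: a numeric type plus ordered children. */
    static final class Node {
        final int type;
        final List<Node> children;

        Node(int type, Node... children) {
            this.type = type;
            this.children = Arrays.asList(children);
        }
    }

    /** Returns the first child at index >= offset whose type equals tokenType. */
    static Optional<Node> firstChildOfType(Node parent, int tokenType, int offset) {
        for (int i = Math.max(offset, 0); i < parent.children.size(); i++) {
            Node child = parent.children.get(i);
            if (child.type == tokenType) {
                return Optional.of(child);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Node first = new Node(7);
        Node second = new Node(7);
        Node parent = new Node(1, first, new Node(3), second);
        // Starting at offset 1 skips the first type-7 child.
        System.out.println(firstChildOfType(parent, 7, 1).get() == second); // prints true
    }
}
```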
-
getText
private static String getText(DetailNode parentNode)
Get joined text from all text children nodes.
- Parameters:
parentNode - parent node
- Returns:
- the joined text of node
-
getFirstChildOfMatchingText
private static Optional<DetailNode> getFirstChildOfMatchingText(DetailNode node, Pattern pattern)
Get first child of parent node matching the provided pattern.
- Parameters:
node - parent node
pattern - pattern to match against
- Returns:
- the first child node matching the condition
-
getParent
private static DetailAST getParent(DetailAST commentBlock)
Returns parent node, removing modifier/annotation nodes.
- Parameters:
commentBlock - child node.
- Returns:
- parent node.
-
getParentIndexOf
private static int getParentIndexOf(DetailNode node)
Traverses parents until reaching a child of the root node (JavadocTokenTypes.JAVADOC) and returns that child's index.
- Parameters:
node - subtree child node
- Returns:
- root node child index
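
The upward traversal can be sketched with parent pointers. In this illustration the root is simply the node without a parent, whereas the real method identifies it by token type; the `Node` class and the `rootChildIndexOf` name are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ParentIndexSketch {

    /** Hypothetical tree node with a parent pointer. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node parent;

        Node(String name) {
            this.name = name;
        }

        /** Adds a child and returns this node so that calls can be chained. */
        Node add(Node child) {
            child.parent = this;
            children.add(child);
            return this;
        }
    }

    /** Climbs until the current node is a direct child of the root, then returns its index. */
    static int rootChildIndexOf(Node node) {
        Node current = node;
        while (current.parent != null && current.parent.parent != null) {
            current = current.parent;
        }
        // A root node has no parent and therefore no index.
        return current.parent == null ? -1 : current.parent.children.indexOf(current);
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node paragraph = new Node("paragraph");
        Node list = new Node("list");
        Node item = new Node("item");
        root.add(paragraph).add(list);
        list.add(item);
        System.out.println(rootChildIndexOf(item)); // prints 1
    }
}
```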
-
getParentText
private static String getParentText(DetailNode nodeParagraph)
Get module parent text from paragraph javadoc node.
- Parameters:
nodeParagraph - paragraph javadoc node
- Returns:
- parent text
-
getModuleType
private ModuleType getModuleType()
Get module type (check/filter/filefilter) based on file name.
- Returns:
- module type
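
One plausible way to derive the type from a file name is a suffix test. The sketch below assumes the naming convention in which file filters end in "FileFilter" and filters end in "Filter", with everything else treated as a check; the local `ModuleType` enum is a hypothetical stand-in, not the library's type.

```java
final class ModuleTypeSketch {

    /** Hypothetical local enum mirroring the three kinds named in the description. */
    enum ModuleType { CHECK, FILTER, FILEFILTER }

    /** Classifies a simple module name by its suffix (assumed convention). */
    static ModuleType typeOf(String simpleName) {
        ModuleType result;
        // "FileFilter" must be tested first because it also ends with "Filter".
        if (simpleName.endsWith("FileFilter")) {
            result = ModuleType.FILEFILTER;
        } else if (simpleName.endsWith("Filter")) {
            result = ModuleType.FILTER;
        } else {
            result = ModuleType.CHECK;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(typeOf("ExampleFileFilter")); // prints FILEFILTER
        System.out.println(typeOf("ExampleFilter"));     // prints FILTER
        System.out.println(typeOf("ExampleCheck"));      // prints CHECK
    }
}
```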
-
getModuleSimpleName
private String getModuleSimpleName()
Extract simple file name from the whole file path name.
- Returns:
- simple module name
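
Extracting a simple name from a path usually amounts to taking the last path segment and dropping the extension. The following is a sketch under that assumption; the separator is passed in explicitly so the example behaves the same on every OS, and the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

final class ModuleNameSketch {

    private static final String JAVA_FILE_EXTENSION = ".java";

    /** Returns the last path segment without a trailing ".java". */
    static String simpleName(String filePath, String separator) {
        // Pattern.quote keeps a backslash separator from being read as a regex escape.
        String[] tokens = Pattern.compile(Pattern.quote(separator)).split(filePath);
        if (tokens.length == 0) {
            return "";
        }
        String fileName = tokens[tokens.length - 1];
        if (fileName.endsWith(JAVA_FILE_EXTENSION)) {
            fileName = fileName.substring(0, fileName.length() - JAVA_FILE_EXTENSION.length());
        }
        return fileName;
    }

    public static void main(String[] args) {
        System.out.println(simpleName("/src/checks/naming/TypeNameCheck.java", "/")); // prints TypeNameCheck
    }
}
```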
-
getPackageName
private static String getPackageName(String filePath)
Retrieve package name of module from the absolute file path.
- Parameters:
filePath - absolute file path
- Returns:
- package name
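
A package name can be recovered from a path by walking backwards from the file to a source-root directory. This sketch assumes a layout in which sources live under a directory named "java" (as in src/main/java); that marker, the explicit separator parameter and the names used are assumptions for illustration only.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.regex.Pattern;

final class PackageNameSketch {

    /** Assumed name of the source-root directory that ends the backward walk. */
    private static final String SOURCE_ROOT = "java";

    /** Joins the directories between the source root and the file with dots. */
    static String packageName(String filePath, String separator) {
        String[] tokens = filePath.split(Pattern.quote(separator));
        Deque<String> parts = new ArrayDeque<>();
        // Start at the directory that contains the file and move towards the root.
        for (int i = tokens.length - 2; i >= 0; i--) {
            if (SOURCE_ROOT.equals(tokens[i])) {
                break;
            }
            if (!tokens[i].isEmpty()) {
                parts.addFirst(tokens[i]);
            }
        }
        return String.join(".", parts);
    }

    public static void main(String[] args) {
        String path = "/repo/src/main/java/com/example/checks/FooCheck.java";
        System.out.println(packageName(path, "/")); // prints com.example.checks
    }
}
```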
-
getModuleDetailsStore
public static Map<String,ModuleDetails> getModuleDetailsStore()
Getter method for moduleDetailsStore.
- Returns:
- map containing module details of supplied checks.
-
resetModuleDetailsStore
public static void resetModuleDetailsStore()
Reset the module detail store of any previous information.
-
isTopLevelClassJavadoc
private boolean isTopLevelClassJavadoc()
Check if the current javadoc block comment AST corresponds to the top-level class, as only top-level class javadoc is scraped.
- Returns:
- true if the current AST corresponds to top level class
-
isExamplesText
private static boolean isExamplesText(DetailNode ast)
Checks whether the paragraph node corresponds to the example section.
- Parameters:
ast - javadoc paragraph node
- Returns:
- true if the section matches the example section marker
-
isPropertyList
private static boolean isPropertyList(DetailNode nodeLi)
Checks whether the list item node is part of a property list.
- Parameters:
nodeLi - JavadocTokenType.LI node
- Returns:
- true if the node is part of a property list
-
isViolationMessagesText
private static boolean isViolationMessagesText(DetailNode nodeParagraph)
Checks whether the JavadocTokenType.PARAGRAPH node is referring to the violation message keys javadoc segment.
- Parameters:
nodeParagraph - paragraph javadoc node
- Returns:
- true if paragraph node contains the violation message keys text
-
isParentText
private static boolean isParentText(DetailNode nodeParagraph)
Checks whether the JavadocTokenType.PARAGRAPH node is referring to the parent javadoc segment.
- Parameters:
nodeParagraph - paragraph javadoc node
- Returns:
- true if paragraph node contains the parent text
-
isChildNodeTextMatches
private static boolean isChildNodeTextMatches(DetailNode ast, Pattern pattern)
Checks whether the first child JavadocTokenType.TEXT node matches the given pattern.
- Parameters:
ast - parent javadoc node
pattern - pattern to match
- Returns:
- true if one of child text nodes matches pattern
-
-