Class JavadocMetadataScraper

All Implemented Interfaces:
Configurable, Contextualizable

Class for scraping module metadata from the corresponding class' class-level javadoc.
  • Field Details

    • MSG_DESC_MISSING

      public static final String MSG_DESC_MISSING
      Key pointing to the warning message text in the "messages.properties" file.
    • MODULE_DETAILS_STORE

      Module details store used for testing.
    • PROPERTY_TAG

      private static final Pattern PROPERTY_TAG
      Regular expression for property location in class-level javadocs.
    • TYPE_TAG

      private static final Pattern TYPE_TAG
      Regular expression for property type location in class-level javadocs.
    • VALIDATION_TYPE_TAG

      private static final Pattern VALIDATION_TYPE_TAG
      Regular expression for property validation type location in class-level javadocs.
    • DEFAULT_VALUE_TAG

      private static final Pattern DEFAULT_VALUE_TAG
      Regular expression for property default value location in class-level javadocs.
    • EXAMPLES_TAG

      private static final Pattern EXAMPLES_TAG
      Regular expression for check example location in class-level javadocs.
    • PARENT_TAG

      private static final Pattern PARENT_TAG
      Regular expression for module parent location in class-level javadocs.
    • VIOLATION_MESSAGES_TAG

      private static final Pattern VIOLATION_MESSAGES_TAG
      Regular expression for module violation messages location in class-level javadocs.
    • TOKEN_TEXT_PATTERN

      private static final Pattern TOKEN_TEXT_PATTERN
      Regular expression for detecting ANTLR tokens (e.g. CLASS_DEF).
    • DESC_CLEAN

      private static final Pattern DESC_CLEAN
      Regular expression for removing the "-" present at the beginning of texts.
    • FILE_SEPARATOR_PATTERN

      private static final Pattern FILE_SEPARATOR_PATTERN
      Regular expression for file separator corresponding to the host OS.
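A pattern for the host file separator is typically built with quoting, since the Windows separator is a backslash and would otherwise be read as a regex escape. The following is a minimal sketch of that idea; the class and method names are hypothetical and this is not necessarily how the scraper defines FILE_SEPARATOR_PATTERN.

```java
import java.io.File;
import java.util.regex.Pattern;

final class SeparatorPatternSketch {
    // Pattern.quote treats the separator literally, so "\" on Windows is safe.
    private static final Pattern SEPARATOR =
        Pattern.compile(Pattern.quote(File.separator));

    /** Replaces every host file separator in the path with a dot. */
    static String toDotted(String relativePath) {
        return SEPARATOR.matcher(relativePath).replaceAll(".");
    }

    public static void main(String[] args) {
        String path = "com" + File.separator + "example" + File.separator + "checks";
        System.out.println(toDotted(path)); // prints "com.example.checks"
    }
}
```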
    • QUOTE_PATTERN

      private static final Pattern QUOTE_PATTERN
      Regular expression for quotes.
    • JAVA_FILE_EXTENSION

      private static final String JAVA_FILE_EXTENSION
      Java file extension.
    • PROPERTIES_TO_NOT_WRITE

      private static final Set<String> PROPERTIES_TO_NOT_WRITE
      Set of faulty property default values that should not be written to the XML metadata files.
    • PROP_TYPE_MISSING

      private static final String PROP_TYPE_MISSING
      Format for exception message for missing type for check property.
    • PROP_DEFAULT_VALUE_MISSING

      private static final String PROP_DEFAULT_VALUE_MISSING
      Format for exception message for missing default value for check property.
    • moduleDetails

      ModuleDetails instance for each module AST traversal.
    • scrapingViolationMessageList

      Boolean variable which lets us know whether violation message section is being scraped currently.
    • toScan

      private boolean toScan
      Boolean variable which lets us know whether we should scan and scrape the current javadoc or not. Since we need only class level javadoc, it becomes true at its root and false after encountering JavadocTokenTypes.SINCE_LITERAL.
    • rootNode

      DetailNode pointing to the root node of the class level javadoc of the class.
    • propertySectionStartIdx

      Child number of the property section node, where parent is the class level javadoc root node.
    • exampleSectionStartIdx

      Child number of the example section node, where parent is the class level javadoc root node.
    • parentSectionStartIdx

      private int parentSectionStartIdx
      Child number of the parent section node, where parent is the class level javadoc root node.
    • writeXmlOutput

      private boolean writeXmlOutput
      Control whether to write XML output or not.
  • Constructor Details

  • Method Details

    • setWriteXmlOutput

      public final void setWriteXmlOutput(boolean writeXmlOutput)
      Setter to control whether to write XML output or not.
      Parameters:
      writeXmlOutput - whether to write XML output or not.
    • getDefaultJavadocTokens

      public int[] getDefaultJavadocTokens()
      Description copied from class: AbstractJavadocCheck
      Returns the default javadoc token types a check is interested in.
      Specified by:
      getDefaultJavadocTokens in class AbstractJavadocCheck
      Returns:
      the default javadoc token types
    • getRequiredJavadocTokens

      public int[] getRequiredJavadocTokens()
      Description copied from class: AbstractJavadocCheck
      The javadoc tokens that this check must be registered for.
      Overrides:
      getRequiredJavadocTokens in class AbstractJavadocCheck
      Returns:
      the javadoc token set this must be registered for.
    • beginJavadocTree

      public void beginJavadocTree(DetailNode rootAst)
      Description copied from class: AbstractJavadocCheck
      Called before starting to process a tree.
      Overrides:
      beginJavadocTree in class AbstractJavadocCheck
      Parameters:
      rootAst - the root of the tree
    • visitJavadocToken

      public void visitJavadocToken(DetailNode ast)
      Description copied from class: AbstractJavadocCheck
      Called to process a Javadoc token.
      Specified by:
      visitJavadocToken in class AbstractJavadocCheck
      Parameters:
      ast - the token to process
    • finishJavadocTree

      public void finishJavadocTree(DetailNode rootAst)
      Description copied from class: AbstractJavadocCheck
      Called after processing of a tree has finished.
      Overrides:
      finishJavadocTree in class AbstractJavadocCheck
      Parameters:
      rootAst - the root of the tree
    • scrapeContent

      private void scrapeContent(DetailNode ast)
      Method containing the core logic of scraping. It tracks which phase of scraping we are in and calls the corresponding subroutines.
      Parameters:
      ast - javadoc ast
    • createProperties

      Create the modulePropertyDetails content.
      Parameters:
      nodeLi - list item javadoc node
      Returns:
      modulePropertyDetail object for the corresponding property
    • getTagTextFromProperty

      private static String getTagTextFromProperty(DetailNode nodeLi, DetailNode propertyMeta)
      Get tag text from property data.
      Parameters:
      nodeLi - javadoc li item node
      propertyMeta - property javadoc node
      Returns:
      property metadata text
    • cleanDefaultTokensText

      private static String cleanDefaultTokensText(String initialText)
      Clean up the default token text by removing hyperlinks, and only keeping token type text.
      Parameters:
      initialText - unclean text
      Returns:
      clean text
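As an illustration of this kind of clean-up, here is a minimal sketch that strips hyperlink markup and keeps only token-like names. It assumes the raw text is a comma-separated list of HTML anchors labelled with upper-case token names; the class name, both regular expressions, and the sample input are hypothetical and are not the scraper's actual patterns.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TokenTextCleaner {
    // Hypothetical: any HTML tag, e.g. <a href="..."> or </a>.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    // Hypothetical: upper-case names such as CLASS_DEF.
    private static final Pattern TOKEN_NAME = Pattern.compile("\\b[A-Z][A-Z0-9_]+\\b");

    /** Removes markup, then joins the remaining token names with commas. */
    static String cleanTokens(String initialText) {
        String withoutTags = TAG.matcher(initialText).replaceAll("");
        List<String> tokens = new ArrayList<>();
        Matcher matcher = TOKEN_NAME.matcher(withoutTags);
        while (matcher.find()) {
            tokens.add(matcher.group());
        }
        return String.join(",", tokens);
    }

    public static void main(String[] args) {
        String raw = "<a href=\"TokenTypes.html#CLASS_DEF\">CLASS_DEF</a>, "
            + "<a href=\"TokenTypes.html#ENUM_DEF\">ENUM_DEF</a>";
        System.out.println(cleanTokens(raw)); // prints "CLASS_DEF,ENUM_DEF"
    }
}
```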
    • constructSubTreeText

      private static String constructSubTreeText(DetailNode node, int childLeftLimit, int childRightLimit)
      Performs a DFS of the subtree with a node as the root and constructs the text of that tree, ignoring JavadocToken texts.
      Parameters:
      node - root node of subtree
      childLeftLimit - the left index of root children from where to scan
      childRightLimit - the right index of root children till where to scan
      Returns:
      constructed text of subtree
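The traversal described above can be sketched roughly as follows. The Node class is a hypothetical stand-in for a javadoc tree node (it is not Checkstyle's DetailNode), and treating both child limits as inclusive is an assumption made for the example.

```java
import java.util.Arrays;
import java.util.List;

final class SubTreeTextSketch {
    /** Hypothetical stand-in for a javadoc tree node. */
    static final class Node {
        final String text;          // null for structural (non-text) nodes
        final List<Node> children;

        Node(String text, Node... children) {
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    /** Joins the text under root's children in [left, right], both inclusive (assumed). */
    static String subTreeText(Node root, int left, int right) {
        StringBuilder out = new StringBuilder();
        int last = Math.min(right, root.children.size() - 1);
        for (int i = Math.max(left, 0); i <= last; i++) {
            collect(root.children.get(i), out);
        }
        return out.toString().trim();
    }

    /** Pre-order depth-first traversal that appends only text-bearing nodes. */
    private static void collect(Node node, StringBuilder out) {
        if (node.text != null) {
            out.append(node.text);
        }
        for (Node child : node.children) {
            collect(child, out);
        }
    }

    public static void main(String[] args) {
        Node root = new Node(null,
            new Node("Checks "),
            new Node(null, new Node("that "), new Node("names ")),
            new Node("match."),
            new Node(" Skipped."));
        System.out.println(subTreeText(root, 0, 2)); // prints "Checks that names match."
    }
}
```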
    • getDescriptionText

      Creates the description text, starting at index 0 and ending at the first valid non-zero index among propertySectionStartIdx, exampleSectionStartIdx and parentSectionStartIdx, checked in that order.
      Returns:
      description text
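Choosing the end of the description can be sketched as picking the first usable index from an ordered list of candidates. The sketch below assumes that an index counts as valid when it is greater than zero; that rule, and the class and method names, are assumptions for illustration and not taken from the scraper's source.

```java
import java.util.OptionalInt;
import java.util.stream.IntStream;

final class DescriptionEndSketch {
    /**
     * Returns the first index greater than zero, checking the property,
     * example and parent section indices in that order.
     */
    static OptionalInt descriptionEnd(int propertyIdx, int exampleIdx, int parentIdx) {
        return IntStream.of(propertyIdx, exampleIdx, parentIdx)
            .filter(idx -> idx > 0)
            .findFirst();
    }

    public static void main(String[] args) {
        // No property section found (-1), so the example section ends the description.
        System.out.println(descriptionEnd(-1, 7, 12).getAsInt()); // prints 7
    }
}
```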
    • getPropertyDefaultText

      private static String getPropertyDefaultText(DetailNode nodeLi, DetailNode defaultValueNode)
      Create property default text, which is either normal property value or list of tokens.
      Parameters:
      nodeLi - list item javadoc node
      defaultValueNode - default value node
      Returns:
      default property text
    • getViolationMessages

      private static String getViolationMessages(DetailNode nodeLi)
      Get the violation message text for a specific key from the list item.
      Parameters:
      nodeLi - list item javadoc node
      Returns:
      violation message key text
    • getTextFromTag

      private static String getTextFromTag(DetailNode nodeTag)
      Get text from JavadocTokenTypes.JAVADOC_INLINE_TAG.
      Parameters:
      nodeTag - target javadoc tag
      Returns:
      text contained by the tag
    • getFirstChildOfType

      private static Optional<DetailNode> getFirstChildOfType(DetailNode node, int tokenType, int offset)
      Returns the first child node that matches the provided token type and whose child index comes after the offset value.
      Parameters:
      node - parent node
      tokenType - token type to match
      offset - children array index offset
      Returns:
      the first child satisfying the conditions
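A lookup of this shape maps naturally onto a stream with Optional as the result. In the minimal sketch below, the Child class is hypothetical, and treating the offset as the first index examined (rather than the index just before it) is an assumption; the documented method may draw that boundary differently.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

final class FirstChildSketch {
    /** Hypothetical child node carrying only a numeric token type. */
    static final class Child {
        final int type;

        Child(int type) {
            this.type = type;
        }
    }

    /** Finds the first child of the given type, starting the search at offset (assumed inclusive). */
    static Optional<Child> firstChildOfType(List<Child> children, int tokenType, int offset) {
        return children.stream()
            .skip(Math.max(offset, 0))
            .filter(child -> child.type == tokenType)
            .findFirst();
    }

    public static void main(String[] args) {
        List<Child> children = Arrays.asList(
            new Child(1), new Child(2), new Child(1), new Child(2));
        // Skips index 0, so the match is the child at index 2.
        Optional<Child> found = firstChildOfType(children, 1, 1);
        System.out.println(found.isPresent() && found.get() == children.get(2)); // prints true
    }
}
```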
    • getText

      private static String getText(DetailNode parentNode)
      Get joined text from all text children nodes.
      Parameters:
      parentNode - parent node
      Returns:
      the joined text of node
    • getFirstChildOfMatchingText

      Get first child of parent node matching the provided pattern.
      Parameters:
      node - parent node
      pattern - pattern to match against
      Returns:
      the first child node matching the condition
    • getParent

      private static DetailAST getParent(DetailAST commentBlock)
      Returns parent node, removing modifier/annotation nodes.
      Parameters:
      commentBlock - child node.
      Returns:
      parent node.
    • getParentIndexOf

      private static int getParentIndexOf(DetailNode node)
      Traverses the parents until reaching the child of the root node (JavadocTokenTypes.JAVADOC) and returns that child's index.
      Parameters:
      node - subtree child node
      Returns:
      root node child index
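The upward walk can be sketched as below. The Node class is a hypothetical stand-in, and the root is recognised here by having no parent, whereas the documented method identifies it by token type; both are simplifications for the example.

```java
final class ParentIndexSketch {
    /** Hypothetical node that knows its parent and its position among siblings. */
    static final class Node {
        final Node parent;
        final int index;

        Node(Node parent, int index) {
            this.parent = parent;
            this.index = index;
        }
    }

    /** Climbs until the current node is a direct child of the root, then returns its index. */
    static int rootChildIndex(Node node) {
        Node current = node;
        while (current.parent != null && current.parent.parent != null) {
            current = current.parent;
        }
        return current.index;
    }

    public static void main(String[] args) {
        Node root = new Node(null, 0);
        Node section = new Node(root, 3);   // fourth child of the root
        Node item = new Node(section, 1);
        Node leaf = new Node(item, 0);
        System.out.println(rootChildIndex(leaf)); // prints 3
    }
}
```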
    • getParentText

      private static String getParentText(DetailNode nodeParagraph)
      Get module parent text from paragraph javadoc node.
      Parameters:
      nodeParagraph - paragraph javadoc node
      Returns:
      parent text
    • getModuleType

      Get module type (check/filter/filefilter) based on file name.
      Returns:
      module type
    • getModuleSimpleName

      Extract simple file name from the whole file path name.
      Returns:
      simple module name
    • getPackageName

      private static String getPackageName(String filePath)
      Retrieve package name of module from the absolute file path.
      Parameters:
      filePath - absolute file path
      Returns:
      package name
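The three path-based helpers above (module type, simple name, package name) can be illustrated with the sketch below. It assumes a Maven-style "src/main/java/" source root and infers the module type from the class-name suffix; both rules, along with all names and the sample path, are assumptions for illustration and may differ from the scraper's actual logic.

```java
final class ModulePathSketch {
    private static final String JAVA_EXT = ".java";
    private static final String SOURCE_ROOT = "src/main/java/"; // assumed layout

    /** Strips directories and the ".java" extension from the path. */
    static String simpleName(String filePath) {
        String normalized = filePath.replace('\\', '/');
        String fileName = normalized.substring(normalized.lastIndexOf('/') + 1);
        return fileName.endsWith(JAVA_EXT)
            ? fileName.substring(0, fileName.length() - JAVA_EXT.length())
            : fileName;
    }

    /** Guesses the module type from the class-name suffix (assumed convention). */
    static String moduleType(String filePath) {
        String name = simpleName(filePath);
        if (name.endsWith("FileFilter")) {
            return "filefilter";
        }
        if (name.endsWith("Filter")) {
            return "filter";
        }
        return "check";
    }

    /** Converts the directories below the assumed source root into a dotted package name. */
    static String packageName(String filePath) {
        String normalized = filePath.replace('\\', '/');
        int start = normalized.indexOf(SOURCE_ROOT);
        if (start < 0) {
            return "";
        }
        int from = start + SOURCE_ROOT.length();
        int end = normalized.lastIndexOf('/');
        return end < from ? "" : normalized.substring(from, end).replace('/', '.');
    }

    public static void main(String[] args) {
        String path = "/repo/src/main/java/com/example/checks/naming/ConstantNameCheck.java";
        System.out.println(simpleName(path));   // prints "ConstantNameCheck"
        System.out.println(moduleType(path));   // prints "check"
        System.out.println(packageName(path));  // prints "com.example.checks.naming"
    }
}
```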
    • getModuleDetailsStore

      Getter method for moduleDetailsStore.
      Returns:
      map containing module details of supplied checks.
    • resetModuleDetailsStore

      public static void resetModuleDetailsStore()
      Reset the module detail store of any previous information.
    • isTopLevelClassJavadoc

      private boolean isTopLevelClassJavadoc()
      Check if the current javadoc block comment AST corresponds to the top-level class as we only want to scrape top-level class javadoc.
      Returns:
      true if the current AST corresponds to top level class
    • isExamplesText

      private static boolean isExamplesText(DetailNode ast)
      Checks whether the paragraph node corresponds to the example section.
      Parameters:
      ast - javadoc paragraph node
      Returns:
      true if the section matches the example section marker
    • isPropertyList

      private static boolean isPropertyList(DetailNode nodeLi)
      Checks whether the list item node is part of a property list.
      Parameters:
      nodeLi - JavadocTokenTypes.LI node
      Returns:
      true if the node is part of a property list
    • isViolationMessagesText

      private static boolean isViolationMessagesText(DetailNode nodeParagraph)
      Checks whether the JavadocTokenTypes.PARAGRAPH node refers to the violation message keys javadoc segment.
      Parameters:
      nodeParagraph - paragraph javadoc node
      Returns:
      true if paragraph node contains the violation message keys text
    • isParentText

      private static boolean isParentText(DetailNode nodeParagraph)
      Checks whether the JavadocTokenTypes.PARAGRAPH node refers to the parent javadoc segment.
      Parameters:
      nodeParagraph - paragraph javadoc node
      Returns:
      true if paragraph node contains the parent text
    • isChildNodeTextMatches

      private static boolean isChildNodeTextMatches(DetailNode ast, Pattern pattern)
      Checks whether the first child JavadocTokenTypes.TEXT node matches the given pattern.
      Parameters:
      ast - parent javadoc node
      pattern - pattern to match
      Returns:
      true if one of the child text nodes matches the pattern