Find Comments with StringTokenizer


I used the following code to count the number of comments in a piece of source code:

StringTokenizer stringTokenizer = new StringTokenizer(str);
boolean exists = false; // true while inside an open comment
int x = 0;

while (stringTokenizer.hasMoreTokens()) {
    if (exists == false && stringTokenizer.nextToken().contains("/*")) {
        exists = true;

    } else if (exists == true && stringTokenizer.nextToken().contains("*/")) {

        x++;
        exists = false;

    }
}

System.out.println(x);

It works if comments have spaces:

e.g.: "/* fgdfgfgf */ /* fgdfgfgf */ /* fgdfgfgf */".

But it does not work for comments without spaces:

e.g.: "/*fgdfgfgf *//* fgdfgfgf*//* fgdfgfgf */".


Solution

  • new StringTokenizer(str, "\n") splits str into lines instead of using the default delimiter set " \t\n\r\f" (space, tab, newline, carriage return, and form feed).

    StringTokenizer stringTokenizer = new StringTokenizer(str,"\n");
    

    This makes newline the only delimiter used for tokenizing.
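
To make the effect of the delimiter argument concrete, here is a minimal sketch that counts the tokens produced with the default delimiters and with newline only. The class and method names (DelimiterDemo, defaultTokenCount, lineTokenCount) are made up for illustration and are not part of the original code.

```java
import java.util.StringTokenizer;

// Hypothetical demo class, not part of the original question or answer.
class DelimiterDemo {
    // Token count with the default delimiters: space, tab, newline, CR, form feed.
    static int defaultTokenCount(String source) {
        return new StringTokenizer(source).countTokens();
    }

    // Token count with newline as the only delimiter. StringTokenizer skips
    // empty tokens, so this is the number of lines with at least one character.
    static int lineTokenCount(String source) {
        return new StringTokenizer(source, "\n").countTokens();
    }

    public static void main(String[] args) {
        String source = "/* a */ /* b */\nint x;";
        System.out.println(defaultTokenCount(source)); // whitespace-separated pieces
        System.out.println(lineTokenCount(source));    // lines
    }
}
```

With the default delimiters, "/*" and "*/" only become separate tokens when whitespace surrounds them, which is why the original loop depends on spaces.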

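    Using your current approach (note that this counts at most one comment per line):

    String line;
    while (stringTokenizer.hasMoreTokens()) {
        // Read each token (here: one line) exactly once per iteration
        line = stringTokenizer.nextToken();

        if (!exists && line.contains("/*")) {
            exists = true; // a comment opens on this line
        }
        // Deliberately not an else: the same line may also close the comment
        if (exists && line.contains("*/")) {
            x++;
            exists = false;
        }
    }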
    

    For multiple comments on one line, I tried using /\\* and \\*/ as patterns in split() and taking the lengths of the resulting arrays as the number of occurrences. Unfortunately, the lengths were not exact because the string gets split unevenly.
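
One possible way around split() is to scan the whole string with indexOf and count complete /* ... */ pairs. This is only a sketch of a different technique, not the approach used above. The names CommentScanner and countBlockComments are hypothetical, and the code assumes the input contains no string literals or // comments that could hide a delimiter.

```java
// Hypothetical helper class; a sketch, not the original author's method.
class CommentScanner {
    // Counts complete /* ... */ pairs by scanning left to right.
    // Assumption: "/*" and "*/" never occur inside string literals or // comments.
    static int countBlockComments(String source) {
        int count = 0;
        int pos = 0;
        while (true) {
            int open = source.indexOf("/*", pos);
            if (open < 0) {
                break; // no further comment starts
            }
            // Search after the opener so that "/*/" is not treated as closed
            int close = source.indexOf("*/", open + 2);
            if (close < 0) {
                break; // unterminated comment: not counted
            }
            count++;
            pos = close + 2; // continue right after the closing delimiter
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countBlockComments("/*fgdfgfgf *//* fgdfgfgf*//* fgdfgfgf */"));
    }
}
```

Because the scan does not depend on whitespace or line breaks, it handles the example without spaces from the question as well as several comments on one line.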

    In my opinion, input with one or more comments can be described as:

    COMMENT = /* .* */
    A = COMMENT
    B = CODE
    C = AB | BA | ABA | BAB | AAB | BAA | A
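
The COMMENT rule above can also be written as a regular expression and counted with a Matcher instead of split(). The sketch below assumes a reluctant quantifier (.*?) so that each match ends at the nearest "*/", and DOTALL so that comments may span lines. The class name CommentPattern is hypothetical, and string literals containing comment delimiters are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class illustrating the COMMENT rule as a regex.
class CommentPattern {
    // "/*", then as few characters as possible (including newlines), then "*/".
    private static final Pattern BLOCK_COMMENT =
            Pattern.compile("/\\*.*?\\*/", Pattern.DOTALL);

    // Counts non-overlapping matches of the block-comment pattern.
    static int countBlockComments(String source) {
        Matcher matcher = BLOCK_COMMENT.matcher(source);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String source = "int a; /* one */ int b; /* two\nlines */";
        System.out.println(countBlockComments(source));
    }
}
```

Counting matches avoids the array-length problem of split(), since code before, between, or after the comments (the B parts) is simply skipped.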