
How to exclude the " character in a JavaCC token


Hello, I'm working with JavaCC and writing a token that matches a string enclosed in double quotes (" "). Context:

void literalString(): {} { """ (characteresString())? """ }
void characteresString(): {} { <characterString> | characteresString() <characterString> }

So I made this token to match the characters of the string:

TOKEN : {<characterString : ~["\", "] >}

The problem is that I don't know how to exclude the " symbol in the token. If I write """ I get an error, and if I write a single " I get an error again.

Thank you in advance


Solution

  • Instead of

    void literalString(): {} { """ (characteresString())? """ }
    

    use a token definition

    TOKEN : { <STRING : "\"" (<CHAR>)* "\"" >
            | <#CHAR : ~["\""] > // Any character that is not "
    }
    

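    This defines a string as a ", followed by zero or more characters that are not ", followed by another ". Inside a JavaCC string literal the " character has to be escaped as \", which is why the quote is written "\"" rather than """.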

    However, some languages have further restrictions, such as only allowing characters in a certain range. For example, if only printable ASCII characters excluding " were allowed, you would use

    TOKEN : { <STRING : "\"" (<CHAR>)* "\"" >
            | <#CHAR: [" ","!","#"-"~"]> // Printable ASCII characters excluding "
    }
    

    But say you want to allow " characters if they are preceded by a \, and you want to ban \ characters unless they are followed by a ", another \, or an n. Then you could use

    TOKEN : { <STRING : "\"" (<CHAR> | <ESCAPESEQ>)* "\"" >
            | <#CHAR: [" ","!","#"-"[","]"-"~"] > // Printable ASCII characters excluding \ and "
            | <#ESCAPESEQ: "\\" ["\"","\\","n"] > // 2-character sequences \\, \", and \n
    }
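
Once a STRING token with escape sequences has matched, its image still contains the surrounding quotes and the raw backslashes, so the parser usually needs to decode it. Below is a minimal sketch of that step in plain Java, assuming the last token definition above (only the escapes \", \\ and \n). The class name `StringLiterals` and the method `unescape` are hypothetical helpers for illustration, not part of JavaCC.

```java
// Hypothetical helper, not part of JavaCC: decodes the image of a STRING token
// as defined by the last grammar above (escapes \", \\ and \n only).
final class StringLiterals {
    private StringLiterals() {}

    static String unescape(String image) {
        // Drop the opening and closing quote that the STRING token includes.
        String body = image.substring(1, image.length() - 1);
        StringBuilder out = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            // The token definition guarantees a character follows every backslash.
            char next = body.charAt(++i);
            switch (next) {
                case 'n':  out.append('\n'); break;
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                default:
                    throw new IllegalArgumentException("Unsupported escape: \\" + next);
            }
        }
        return out.toString();
    }
}
```

In a grammar, a production could then capture the token with something like `t = <STRING>` and pass `t.image` to this helper in its Java action. Keeping the decoding in ordinary Java rather than in the token definition keeps the lexer rule simple, at the cost of scanning the string a second time.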