I am new to javacc. I am trying to define a token which can match any string. I am following the regex syntax <ANY: (~[])+>
which is not working. I want to achieve something very simple, define an expression having the following BNF:
<exp> ::= "path(" <string> "," <number> ")"
My current .jj file is as follows, any help on how I can parse the string:
options
{
}
PARSER_BEGIN(SimpleAdd)
package SimpleAddTest;
public class SimpleAdd
{
}
PARSER_END(SimpleAdd)
SKIP :
{
" "
| "\r"
| "\t"
| "\n"
}
TOKEN:
{
< NUMBER: (["0"-"9"])+ > |
<PATH: "path"> |
<RPAR: "("> |
<LPAR: ")"> |
<QUOTE: "'"> |
<COMMA: ","> |
<ANY: (~[])+>
}
int expr():
{
String leftValue ;
int rightValue ;
}
{
<PATH> <RPAR> <QUOTE> leftValue = str() <QUOTE> <COMMA> rightValue = num() <LPAR>
{ return 0; }
}
String str():
{
Token t;
}
{
t = <ANY> { return t.toString(); }
}
int num():
{
Token t;
}
{
t = <NUMBER> { return Integer.parseInt(t.toString()); }
}
The error I am getting with the above javacc file is:
Exception in thread "main" SimpleAddTest.ParseException: Encountered " <ANY> "path(\'5\',1) "" at line 1, column 1.
Was expecting:
"path" ...
The pattern <ANY: (~[])+>
will indeed match any nonempty string. The issue is that this is not what you really want. If you have a rule <ANY: (~[])+>
, it will match the whole file, unless the file is empty. In most cases, because of the longest match rule, the whole file will be parsed as [ANY, EOF]
. Is that really what you want? Probably not.
So I'm going to guess at what you really want. I'll guess you want any string that doesn't include a double quote character. Maybe there are other restrictions, such as no nonprinting characters. Maybe you want to allow double quotes if the are preceded by a backslash. Who knows? Adjust as needed.
Here is what you can do. First, replace the token definitions with
TOKEN:
{
< NUMBER: (["0"-"9"])+ > |
<PATH: "path"> |
<RPAR: "("> |
<LPAR: ")"> |
<COMMA: ","> |
<STRING: "\"" (~["\""])* "\"" >
}
Then change your grammar to
int expr():
{
String leftValue ;
int rightValue ;
}
{
<PATH> <RPAR> leftValue=str() <COMMA> rightValue = num() <LPAR>
{ return 0; }
}
String str():
{
Token t;
int len ;
}
{
t = <String>
{ len = t.image.length() ; }
{ return t.image.substring(1,len-1); }
}