I have the following grammar:
: '<' NAME '>' TEXT '</' NAME '>'
| '<' NAME S* attribute* '>';
dl : '<' NAME '><' TEXT '>' dt* '</' NAME '><' TEXT '>';
dt : '<' NAME '><' NAME S* attribute* S* '>' TEXT '</' NAME '>';
attribute : attributeName '=' attributeValue;
attributeName : NAME;
attributeValue : VAL;
NAME : [A-Z0-9_-]+;
VAL : '"'.*?'"';
TEXT : [A-Za-z0-9:\/\.@\-;\s*]+;
S : [ \t\r\n]+ -> skip;
The string is
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
<DT><H3 ADD_DATE="1481473849" LAST_MODIFIED="1481473992" PERSONAL_XYZ_FOLDER="true">Foo bar</H3>
I am getting the following error:
ParseError extraneous input 'bar' expecting '</' clj-antlr.common/parse-error (common.clj:146)
The problem is that whitespace is skipped, so the space in Foo bar causes an error. But if I do not skip whitespace (the S* is not required when skipping spaces), I get a different error when parsing META:
ParseError extraneous input ' ' expecting {'>', NAME}
mismatched input '>' expecting '><'
mismatched input '<' expecting {<EOF>, COMMENT, S} clj-antlr.common/parse-error (common.clj:146)
Here is my tokens file generated by antlr:
When I run it using grun I get the following, but I don't see any errors in the reported tokens; they look consistent with the grammar I defined. How can I accept spaces in tag values?
$ grun MyGrammer r -tokens
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
[@2,5:5=' ',<11>,1:5]
[@6,31:31=' ',<11>,1:31]
[@9,40:65='"text/html; charset=UTF-8"',<9>,1:40]
No method for rule r or it has arguments
If you put a space between Foo and bar, the lexer produces two tokens (of type TEXT), but the grammar states that only one TEXT token is allowed there. To solve your problem, you simply have to allow several TEXT tokens in a sequence via the plus operator:
dt : '<' NAME '><' NAME S* attribute* S* '>' TEXT+ '</' NAME '>';
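Also note that you might run into problems because the lexer will tokenize quite a few inputs as NAME rather than TEXT: both rules can match the pattern [A-Z0-9]+, and when two lexer rules match the same length of input, ANTLR picks the one defined first (here NAME).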
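One possible way to guard against that overlap, as a rough sketch rather than a tested fix: let the tag content accept either token type, so an all-uppercase word such as FOO (which would be tokenized as NAME) is still allowed. The rule name content is a hypothetical helper that does not appear in the original grammar, and the sketch assumes S is still skipped by the lexer.

```antlr
// Sketch only: 'content' is a hypothetical helper rule, not part of the
// original grammar. S is skipped by the lexer, so the S* references in the
// parser rule are dropped here.
dt      : '<' NAME '><' NAME attribute* '>' content '</' NAME '>';

// An all-uppercase word is tokenized as NAME (NAME is defined before TEXT
// and wins ties), so accept either token type, one or more times.
content : (TEXT | NAME)+;
```

Because whitespace is skipped, the parse tree will contain the individual words without the spaces between them; if the exact spacing matters, it would have to be recovered from the original input text rather than from the tokens.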