I have the following piece of code:
case 1
of 2
3
of 3
4
5
That my custom tokenizer translates to:
Tokens: [{'case',1},
{integer,1,1},
{eol,1},
{'of',1},
{integer,1,2},
{block,[{integer,1,3}]},
{eol,1},
{'of',1},
{integer,1,3},
{block,[{integer,1,4}]},
{eol,1},
{integer,1,5}]
But then I'm not being able to parse it with the following Yecc:
Nonterminals
grammar
statements statement
case_def case_conditions condition.
Terminals
eol block
integer
case of.
Rootsymbol grammar.
grammar -> statements : '$1'.
statements -> statement eol statements : ['$1'|'$3'].
statements -> statement : ['$1'].
statement -> case_def : '$1'.
statement -> integer : '$1'.
case_def -> 'case' integer case_conditions : ''.
case_conditions -> case_condition case_conditions : ['$1'|'$2'].
case_conditions -> case_condition : ['$1'].
case_condition -> eol 'of' integer block : ''.
It gives me the following output:
["syntax error before: ","5"]
Any help is really welcome, thanks.
I think in nonterminal list, you should have case_condition
instead of condition
.
Your custom scanner ignores the indentation. It has to emit tokens for INDENT
and DEDENT
. I've found example in yacc. Then you can change your grammar to use those tokens.
Your example generates shift/reduce conflict. Documentation says, that:
Shift/reduce conflicts are resolved in favor of shifting if there are no operator precedence declarations.
It means, that when your parser is at the place denoted by |
of 3
4|
5
and sees new line, there might be two options:
case_condition
so parser needs to keep going "shift" to read moreBecause those conflicts are always resolved to shift, you can't put next statement after you started case! You need to change your grammar. Work with it until yecc:file/1
does not generate any warnings.
HINT: Indent everything inside case like this:
case 1
of 2
3
of 3
4
5
This way, you are clearly saying, that case
is one statement and 5
is another. My knowledge about parsers got a little rusty, but I believe, that you wouldn't be able to write grammar that is able to differentiate between case_condition
and statement
with left to right parser, if you don't add indentation or some kind of terminator.