Search code examples
rubyparsingtokenoption-typeyacc

How to encode a series of optional tokens in YACC (/RACC)


I'm trying to write a RACC parser, a part of which can be represented by the regular expression a[b][c][d].

I came up with the following productions (each token represents the lower case character of the token name):

  expr
    : A
    | A b
    ;

  b
    : B
    | B c
    | c
    ;

  c
    : C
    | C D
    | D

Is this the simplest form, or am I missing something?


Solution

  • If B. C, and D can be distinguished from each other by their first token (which is certainly the case if they are tokens rather than being the simplification of a more complicated grammar), then an alternative is to define non-terminals representing optionality.

    optional_B: /* empty */ | B
    optional_C: /* empty */ | C
    optional_D: /* empty */ | D 
    expression: A optional_B optional_C optional_D
    

    This gets more complicated when the optional subexpressions cannot be immediately distinguished, because the parser needs to correctly identify that the optional non-terminal matches the empty string based only on the following token. But it sounds like that's not the case with your grammar.