Search code examples
ocamlocamlyacc

ocamlyacc with empty string


So I have a grammar that includes the empty string. The grammar is something like this:

S->ε

S->expression ;; S

I'm getting the error "No more states to discard" when I run my parser so I believe I'm not representing the empty string correctly. So how would I go about representing it, specifically in the lexer .mll file?

I know I need to make a rule for it so I think I have that down. This is what I think it should look like for the parser .mly file, excluding the stuff for expression.

s:
| EMPTY_STRING            { [] }
| expression SEMICOLON s  { $1::$3 }

Solution

  • You're thinking of epsilon as a token, but it's not a token. It's a 0-length sequence of tokens. Since there are no tokens there, it's not something your scanner needs to know about. Just the parser needs to know about it.

    Here's a grammar something like what I think you want:

    %token X
    %token SEMICOLON
    %token EOF
    %start main
    %type <char list> main
    %%
    main :
      s EOF           { $1 }
    
    s :
      | epsilon         { $1 }
      | X SEMICOLON s { 'x' :: $3 }
    
    epsilon :
                      { [] }
    

    Note that epsilon is a non-terminal (not a token). Its definition is an empty sequence of symbols.