Search code examples
parsingscalacontext-free-grammarparser-combinators

Parser Combinators - Simple grammar


I am trying to use parser combinators in Scala on a simple grammar that I have copied from a book. When I run the following code it stops immediately after the first token has been parsed with the error

[1.3] failure: string matching regex '\z' expected but '+' found

I can see why things go wrong. The first token is an expression and therefor it is the only thing that needs to be parsed according to the grammar. However I do not know what is a good way to fix it.

object SimpleParser extends RegexParsers 
{
    def Name = """[a-zA-Z]+""".r
    def Int = """[0-9]+""".r

    def Main:Parser[Any] = Expr
    def Expr:Parser[Any] = 
    (
          Term
        | Term <~ "+" ~> Expr
        | Term <~ "-" ~> Expr
    )

    def Term:Parser[Any] = 
    (
          Factor
        | Factor <~ "*" ~> Term
    )

    def Factor:Parser[Any] =
    (
          Name
        | Int 
        | "-" ~> Int 
        | "(" ~> Expr <~ ")" 
        | "let" ~> Name <~ "=" ~> Expr <~ "in" ~> Expr <~ "end" 
    )

    def main(args: Array[String]) 
    {
        var input = "2 + 2"
        println(input)
        println(parseAll(Main, input))
    }
}

Solution

  • Factor <~ "*" ~> Term means Factor.<~("*" ~> Term), so the whole right part is dropped. Use Factor ~ "*" ~ Term ^^ { case f ~ _ ~ t => ??? } or rep1sep:

    scala> :paste
    // Entering paste mode (ctrl-D to finish)
    
    import scala.util.parsing.combinator.RegexParsers
    
    object SimpleParser extends RegexParsers
    {
        def Name = """[a-zA-Z]+""".r
        def Int = """[0-9]+""".r
    
        def Main:Parser[Any] = Expr
        def Expr:Parser[Any] = rep1sep(Term, "+" | "-")
    
        def Term:Parser[Any] = rep1sep(Factor, "*")
    
        def Factor:Parser[Any] =
        (
              "let" ~> Name ~ "=" ~ Expr ~ "in" ~ Expr <~ "end" ^^ { case n ~ _ ~ e1 ~ _ ~ e2 => (n, e1, e2)
            | Int
            | "-" ~> Int
            | "(" ~> Expr <~ ")"
            | Name }
        )
    }
    
    SimpleParser.parseAll(SimpleParser.Main, "2 + 2")
    
    // Exiting paste mode, now interpreting.
    
    import scala.util.parsing.combinator.RegexParsers
    defined module SimpleParser
    res1: SimpleParser.ParseResult[Any] = [1.6] parsed: List(List(2), List(2))
    

    The second part of parser def Term:Parser[Any] = Factor | Factor <~ "*" ~> Term is useless. The first part, Factor, can parse (with non-empty next) any Input that the second part, Factor <~ "*" ~> Term, is able to parse.