Search code examples
scalaparser-combinators

How to get this simple regex-parser to catch a Boolean expression?


I am trying to understand parser combinators with Scala, and have written the following:

import scala.util.parsing.combinator._

class SimpleParser extends RegexParsers {

  def operand: Parser[String] = "[A-Za-z]+".r ^^ {_.toString}

  def operator: Parser[String] = "OR|AND".r ^^ {_.toString}

  def term: Parser[String] = (("(" ~> (operand ~ operator ~ operand) <~ ")")) ^^ {
    case o1 ~ operator ~ o2 => o1 + " " + operator + " " + o2
    case _ => " "
  }


  def expression: Parser[String] = (operand | term | (operand ~ operator ~ term))^^ {
    case str:String => str
    case operand ~ operator ~ term => operand + " " + operator + " " + term
  }
}

object ParserTest extends SimpleParser{
  def main(args: Array[String]): Unit = {
     println(parseAll(expression, "A").get)
     println(parseAll(expression, "(A OR C)").get)
     println(parseAll(expression, "A AND (A OR C)").get)
  }
}

The first two prints works find, while the last one causes:

Exception in thread "main" java.lang.RuntimeException: No result when parsing failed at scala.sys.package$.error(package.scala:27)
at scala.util.parsing.combinator.Parsers$NoSuccess.get(Parsers.scala:181)
at scala.util.parsing.combinator.Parsers$NoSuccess.get(Parsers.scala:167)
at ParserTest$.main(ParserTest.scala:31)
at ParserTest.main(ParserTest.scala)

I thought the last sentence would match the (operand ~ operator ~ term) pattern in "expression". Can someone explain me why my pattern is wrong, and maybe show the write one to match the last print statement?


Solution

  • First, you're not properly handling the result of parseAll. If you were, you'd see that on the last example, it was returning a Failure with the message

    [1.3] failure: end of input expected
    
    A AND (B OR C)
      ^
    

    The problem here is you have your parsers inside expression in the wrong order.

    When creating a disjunction of parsers (uisng |), you always have to start with the "greediest" parser. In other words, what's happening here is that operand by itself is succeeding to parse the "A" and parsing ends. However, parseAll sees that parsing has succeeded but there is still input left, so it returns the above error.

    If you reverse the order of the 3 parsers, so it looks like:

    def expression: Parser[String] = ((operand ~ operator ~ term) | term | operand)^^
    

    they are now properly ordered and all 3 examples work.