Search code examples
scalaparsingbacktrackingfastparse

Fastparse doesn't backtrack


Edit (to generalize the problem):

I'd like to parse a grammar, where

<prefix> ::= [a-z]*
<middle> ::= xxx
<suffix> ::= b+
<grammar> ::= <prefix><middle><suffix>

I expect (for example) the following words to pass: aaaaxxxbb, axxxaaxxxbbb, xxxxxxbb

Original post:

I expected the following parser to backtrack and find a solution in the end:

val before = P(AnyChar.rep.!)
val content = P("xxx".!)
val after = P("b".rep.!)
val all = P(before ~ content ~ after ~ End)
def test() = {
  val r = all.parse("aaaaxxxbbb")
  println(r)
}

Instead it looks like the before part greedily parses all the text, and the parser fails without backtracking.

Am I missing something?


Solution

  • I was able to solve the issue in the end.

    It's reasonable, that the parser won't backtrack within the regex, so I thought I should rewrite the AnyChar.rep part as a recursive rule, like this:

    val before: P[Any] = P(AnyChar | (AnyChar ~ before))
    

    But that wasn't enough, fastparse still don't seem to backtrack.

    I stumbled upon this question about parsing ambiguous grammar. So I tried using GLL combinators instead of Fastparse, and that made it work.

    object TestParser1 extends Parsers with RegexParsers {
    
      lazy val before: Parser[String] = (".".r | (".".r ~ before)) ^^ {
        case a ~ b => a.toString + b.toString
        case x => x.toString
      }
      lazy val content: Parser[String] = "xxx"
      lazy val after: Parser[String] = "b+".r
      lazy val all: Parser[String] = before ~ content ~ after ^^ {
        case (b, c, a) => s"$b $c $a"
      }
    
      def test() = {
        val r = all("aaaaxxxbbb")
        r.toList foreach println
      }
    
    }