Parboiled2: How to process dependent fields?


I'm trying to parse a file format, using the excellent parboiled2 library, in which the presence of some fields is dependent upon the value of one or more fields already processed.

For example, say I have two fields, the first of which is a flag indicating whether the second is present. That is, if the first field is true, then the second field (which is an integer value, in this example) is present and must be processed - but if it's false, then the second field isn't present at all. Note that this second field isn't optional - it either must be processed (if the first field is true) or must not be processed (if the first field is false).

So, if a third field (which we'll assume is always present) is a quoted string, both of the following lines are valid:

true 52 "Some quoted string"
false "Some other quoted string"

But this would be invalid:

false 25 "Yet another quoted string"

Ignoring the third field, how do I write a rule to parse the first two? (I can't tell from the documentation, and Googling hasn't helped so far...)

UPDATE: I should clarify that I can't use rules like the following, because the format I'm parsing is actually a lot more complicated than my example:

import org.parboiled2._

class MyParser(override val input: ParserInput)
extends Parser {

  def ws = // whitespace rule, puts nothing on the stack.

  def intField = // parse integer field, pushes Int onto stack...

  def dependentFields = rule {
    ("true" ~ ws ~ intField) | "false" ~> //etc.
  }
}

UPDATE 2: I've revised the following to make my intent clearer:

What I'm looking for is a valid equivalent to the following (non-existent) rule that performs a match only if a condition is satisfied:

import org.parboiled2._

class MyParser(val input: ParserInput)
extends Parser {

  def ws = // whitespace rule, puts nothing on the stack.

  def intField = // parse integer field, pushes Int onto stack...

  def boolField = // parse boolean field, pushes Boolean onto stack...

  def dependentFields = rule {
    boolField ~> {b =>

      // Match "ws ~ intField" only if b is true. If match succeeds, push Some(Int); if match
      // fails, the rule fails. If b is false, pushes None without attempting the match.
      conditional(b, ws ~ intField)
    }
  }
}

That is, ws ~ intField is only matched if boolField results in a true value. Is something like this possible?


Solution

  • Yes, you can implement such a function with the help of the test parser action:

    def conditional[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
      test(bool) ~ parse() ~> (Some(_)) | push(None)
    }
    

    According to the Meta-Rules section of the documentation, this works only if you pass a function that produces the rule, rather than the rule itself. You would then define the dependentFields rule as follows:

    def dependentFields = rule {
      boolField ~> (conditional(_, () => rule { ws ~ intField }))
    }
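
To see how this version behaves on malformed input, here is a minimal sketch that wires the two definitions above into a complete parser. The class name BacktrackDemo and the bodies of ws, intField and boolField are assumptions made up for illustration (the question leaves them unspecified), and parboiled2 is assumed to be on the classpath.

```scala
import org.parboiled2._

// Hypothetical parser, used only to exercise `conditional` as defined above.
class BacktrackDemo(val input: ParserInput) extends Parser {

  // Assumed field rules; the original question does not spell these out.
  def ws: Rule0 = rule { oneOrMore(anyOf(" \t")) }

  def intField: Rule1[Int] = rule {
    capture(oneOrMore(CharPredicate.Digit)) ~> ((s: String) => s.toInt)
  }

  def boolField: Rule1[Boolean] = rule {
    "true" ~ push(true) | "false" ~ push(false)
  }

  // Same definition as in the answer: the second alternative is tried
  // whenever the first one fails, for whatever reason.
  def conditional[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
    test(bool) ~ parse() ~> (Some(_)) | push(None)
  }

  def dependentFields: Rule1[Option[Int]] = rule {
    boolField ~> (conditional(_, () => rule { ws ~ intField }))
  }
}

object BacktrackDemoApp extends App {
  // Flag is true and an integer follows.
  println(new BacktrackDemo("true 52").dependentFields.run())
  // Flag is false, so no integer is attempted.
  println(new BacktrackDemo("false").dependentFields.run())
  // Flag is true but the integer is malformed.
  println(new BacktrackDemo("true abc").dependentFields.run())
}
```

With these assumed field rules, the third call should come back as a success carrying None rather than as a parse failure, because the failed ws ~ intField match falls through to push(None). Whether that matters depends on what follows dependentFields in the enclosing grammar.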
    

    Update:

    While test(pred) ~ opt1 | opt2 is a common technique, it backtracks and tries opt2 whenever test succeeds but opt1 fails. Here are two possible ways to prevent such backtracking.

    You can use the ~!~ rule combinator, which has "cut" semantics and prohibits backtracking over itself:

    def conditional2[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
      test(bool) ~!~ parse() ~> (Some(_)) | push(None)
    }
    

    Or you can use a plain if outside of a rule to check the boolean argument and return one of two possible rules:

    def conditional3[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] =
      if (bool) rule { parse() ~> (Some(_: U)) } 
      else rule { push(None) }
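
Putting the pieces together, the following is a minimal sketch of a complete parser for the three sample lines in the question, built on conditional3. The names LineParser, Record, quoted and line, as well as the bodies of ws, intField and boolField, are assumptions introduced for illustration and do not come from the original post; parboiled2 is assumed to be on the classpath.

```scala
import org.parboiled2._

// Hypothetical result type for one parsed line.
case class Record(count: Option[Int], text: String)

class LineParser(val input: ParserInput) extends Parser {

  // Assumed field rules; the question leaves their bodies unspecified.
  def ws: Rule0 = rule { oneOrMore(anyOf(" \t")) }

  def intField: Rule1[Int] = rule {
    capture(oneOrMore(CharPredicate.Digit)) ~> ((s: String) => s.toInt)
  }

  def boolField: Rule1[Boolean] = rule {
    "true" ~ push(true) | "false" ~ push(false)
  }

  // Simplified quoted string: no escape sequences are handled.
  def quoted: Rule1[String] = rule {
    '"' ~ capture(zeroOrMore(noneOf("\""))) ~ '"'
  }

  // conditional3 from the answer: the choice is made in plain Scala,
  // so there is no second alternative to backtrack into.
  def conditional3[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] =
    if (bool) rule { parse() ~> (Some(_: U)) }
    else rule { push(None) }

  def dependentFields: Rule1[Option[Int]] = rule {
    boolField ~> (conditional3(_, () => rule { ws ~ intField }))
  }

  // flag [int] "string", then end of input.
  def line: Rule1[Record] = rule {
    dependentFields ~ ws ~ quoted ~ EOI ~> ((n: Option[Int], s: String) => Record(n, s))
  }
}

object LineParserApp extends App {
  val samples = Seq(
    "true 52 \"Some quoted string\"",
    "false \"Some other quoted string\"",
    "false 25 \"Yet another quoted string\""
  )
  // Each run() returns a scala.util.Try wrapping either a Record or a ParseError.
  samples.foreach(s => println(new LineParser(s).line.run()))
}
```

Under these assumptions the first two samples should parse to a Record, and the third should fail, since after a false flag the parser goes straight to the quoted string and finds a digit instead.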