I'm trying to parse a file format, using the excellent parboiled2 library, in which the presence of some fields is dependent upon the value of one or more fields already processed.
For example, say I have two fields, the first of which is a flag indicating whether the second is present. That is, if the first field is true, then the second field (which is an integer value, in this example) is present and must be processed - but if it's false, then the second field isn't present at all. Note that this second field isn't optional - it either must be processed (if the first field is true) or must not be processed (if the first field is false).
So, if a third field (which we'll assume is always present) is a quoted string, both of the following lines are valid:
true 52 "Some quoted string"
false "Some other quoted string"
But this would be invalid:
false 25 "Yet another quoted string"
Ignoring the third field, how do I write a rule to parse the first two? (I can't tell from the documentation, and Googling hasn't helped so far...)
UPDATE: I should clarify that I can't use rules like the following, because the format I'm parsing is actually a lot more complicated than my example:
import org.parboiled2._
class MyParser(override val input: ParserInput)
extends Parser {
  def ws = // whitespace rule, puts nothing on the stack.
  def intField = // parse integer field, pushes Int onto stack...
  def dependentFields = rule {
    ("true" ~ ws ~ intField) | "false" ~> //etc.
  }
}
UPDATE 2: I've revised the following to make my intent clearer:
What I'm looking for is a valid equivalent to the following (non-existent) rule that performs a match only if a condition is satisfied:
import org.parboiled2._
class MyParser(val input: ParserInput)
extends Parser {
  def ws = // whitespace rule, puts nothing on the stack.
  def intField = // parse integer field, pushes Int onto stack...
  def boolField = // parse boolean field, pushes Boolean onto stack...
  def dependentFields = rule {
    boolField ~> {b =>
      // Match "ws ~ intField" only if b is true. If match succeeds, push Some(Int); if match
      // fails, the rule fails. If b is false, pushes None without attempting the match.
      conditional(b, ws ~ intField)
    }
  }
}
That is, ws ~ intField is only matched if boolField results in a true value. Is something like this possible?
Yes, you can implement such a function with the help of the test parser action:
def conditional[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
  test(bool) ~ parse() ~> (Some(_)) | push(None)
}
According to the Meta-Rules section of the documentation, this works only if you pass a function that produces the rule, rather than the rule itself. You'd have to define the dependentFields rule as follows:
def dependentFields = rule {
  boolField ~> (conditional(_, () => rule { ws ~ intField }))
}
Update:
While test(pred) ~ opt1 | opt2 is a common technique, it does backtrack and try to apply opt2 if test succeeds but opt1 fails. Here are two possible solutions to prevent such backtracking.
You can use the ~!~ rule combinator, which has "cut" semantics and prohibits backtracking over itself:
def conditional2[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] = rule {
  test(bool) ~!~ parse() ~> (Some(_)) | push(None)
}
Or you can use an actual if outside of a rule to check the boolean argument and return one of two possible rules:
def conditional3[U](bool: Boolean, parse: () => Rule1[U]): Rule1[Option[U]] =
  if (bool) rule { parse() ~> (Some(_: U)) }
  else rule { push(None) }
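To make the backtracking difference concrete, here is a minimal sketch that models the first and third variants with plain Scala functions. It deliberately does not use parboiled2: the P type, the wsInt parser and the ConditionalDemo object are hypothetical stand-ins invented for this illustration, and the toy model only approximates how ordered choice behaves in a PEG parser.

```scala
// Toy model of PEG-style parsing, NOT the parboiled2 API.
// A parser takes the remaining input and returns the parsed value
// together with the unconsumed rest, or None if it fails.
object ConditionalDemo {
  type P[A] = String => Option[(A, String)]

  // Hypothetical stand-in for `ws ~ intField`:
  // at least one space, followed by at least one digit.
  val wsInt: P[Int] = in => {
    val rest = in.dropWhile(_ == ' ')
    val digits = rest.takeWhile(_.isDigit)
    if (rest.length < in.length && digits.nonEmpty)
      Some((digits.toInt, rest.drop(digits.length)))
    else None
  }

  // Models `test(bool) ~ parse() ~> (Some(_)) | push(None)`:
  // whenever the first alternative fails, for whatever reason,
  // the second alternative is tried and succeeds with None.
  def conditional[A](bool: Boolean, parse: P[A]): P[Option[A]] = in => {
    val first: Option[(Option[A], String)] =
      if (bool) parse(in).map { case (a, rest) => (Some(a), rest) }
      else None
    val fallback: Option[(Option[A], String)] = Some((None, in))
    first.orElse(fallback)
  }

  // Models the `if` outside the rule: the boolean alone selects the
  // branch, so a failing `parse` makes the whole parser fail.
  def conditional3[A](bool: Boolean, parse: P[A]): P[Option[A]] = in =>
    if (bool) parse(in).map { case (a, rest) => (Some(a), rest) }
    else Some((None, in))

  def main(args: Array[String]): Unit = {
    // Flag is true and the integer is present: both variants agree.
    println(conditional(true, wsInt)(" 52 rest"))
    println(conditional3(true, wsInt)(" 52 rest"))
    // Flag is true but the integer is missing: the first variant falls
    // back to None and succeeds, while the third variant fails.
    println(conditional(true, wsInt)(" \"text\""))
    println(conditional3(true, wsInt)(" \"text\""))
  }
}
```

In this model, an input where the flag is true but no integer follows is silently accepted by the first variant and rejected by the third, which is the behaviour the original question asks for.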