Search code examples
idrisparser-combinatorslightyear

Lightyear requireFailure does not do backtracking


I want to parse a series of any 4 chars. However, these chars shouldn't form a specific string ( "bb" in an example below). So "aaaa" and "abcd" are okay, but neither "bbcd" nor "abbc" should not match.

I composed a following parser:

ntimes 4 (requireFailure (string "bb") *> anyChar)

However, I noticed, that it "eats" single b chars. E.g.

parse (ntimes 4 (requireFailure (string "bb") *> anyToken)) "abcde"

results in ['a', 'c', 'd', 'e'] (it fails, however, on "bbcd" and "abbc" as expected).

As a workaround I used my own implementation of requireFailure:

requireFailure' : Parser a -> Parser ()
requireFailure' p = do
    isP <- p *> pure True <|> pure False
    if isP then fail "argument parser to fail"
           else pure ()

So

parse (ntimes 4 (requireFailure' (string "bb") *> anyToken)) "abcde"

gives ['a', 'b', 'c', 'd'] as I expect.

Apparently lightyear parsers are backtrack-by-default, unless one calls commitTo.

So my question is why library implementation of requireFailure does not do backtracking in case it's argument fails and is it an expected behavior?


Solution

  • If you look at the implementation of requireFailure you can see that it calls the "success" continuation us with the state s it gets after running its argument rather than the one ST i pos tw it got before.

    requireFailure : ParserT str m tok -> ParserT str m ()
    requireFailure (PT f) = PT $ \r, us, cs, ue, ce, (ST i pos tw) =>
                                   f r
                                     (\t, s => ue [Err pos "argument parser to fail"] s)
                                     (\t, s => ce [Err pos "argument parser to fail"] s)
                                     (\errs, s => us () s)
                                     (\errs, s => cs () s)
                                     (ST i pos tw)
    

    The documentation claims that requireFailure is called notFollowedBy in parsec and that doesn't consume any input so you could argue it's a bug on LightYear's side.

    You could open a bug report suggesting to replace the current code with something like (I don't know whether Idris supports @ patterns):

    requireFailure : ParserT str m tok -> ParserT str m ()
    requireFailure (PT f) = PT $ \r, us, cs, ue, ce, s@(ST i pos tw) =>
                                   f r
                                     (\t, s => ue [Err pos "argument parser to fail"] s)
                                     (\t, s => ce [Err pos "argument parser to fail"] s)
                                     (\errs, _ => us () s)
                                     (\errs, _ => cs () s)
                                     s