I want to parse a series of any 4 chars. However, these chars shouldn't form a specific string ( "bb"
in an example below). So "aaaa"
and "abcd"
are okay, but neither "bbcd"
nor "abbc"
should not match.
I composed a following parser:
ntimes 4 (requireFailure (string "bb") *> anyChar)
However, I noticed, that it "eats" single b
chars. E.g.
parse (ntimes 4 (requireFailure (string "bb") *> anyToken)) "abcde"
results in ['a', 'c', 'd', 'e']
(it fails, however, on "bbcd"
and "abbc"
as expected).
As a workaround I used my own implementation of requireFailure
:
requireFailure' : Parser a -> Parser ()
requireFailure' p = do
isP <- p *> pure True <|> pure False
if isP then fail "argument parser to fail"
else pure ()
So
parse (ntimes 4 (requireFailure' (string "bb") *> anyToken)) "abcde"
gives ['a', 'b', 'c', 'd']
as I expect.
Apparently lightyear parsers are backtrack-by-default, unless one calls commitTo
.
So my question is why library implementation of requireFailure
does not do backtracking in case it's argument fails and is it an expected behavior?
If you look at the implementation of requireFailure
you can see that it calls the "success" continuation us
with the state s
it gets after running its argument rather than the one ST i pos tw
it got before.
requireFailure : ParserT str m tok -> ParserT str m ()
requireFailure (PT f) = PT $ \r, us, cs, ue, ce, (ST i pos tw) =>
f r
(\t, s => ue [Err pos "argument parser to fail"] s)
(\t, s => ce [Err pos "argument parser to fail"] s)
(\errs, s => us () s)
(\errs, s => cs () s)
(ST i pos tw)
The documentation claims that requireFailure
is called notFollowedBy
in parsec and that doesn't consume any input so you could argue it's a bug on LightYear's side.
You could open a bug report suggesting to replace the current code with something like (I don't know whether Idris supports @
patterns):
requireFailure : ParserT str m tok -> ParserT str m ()
requireFailure (PT f) = PT $ \r, us, cs, ue, ce, s@(ST i pos tw) =>
f r
(\t, s => ue [Err pos "argument parser to fail"] s)
(\t, s => ce [Err pos "argument parser to fail"] s)
(\errs, _ => us () s)
(\errs, _ => cs () s)
s