Avoid parsing the last separator with `sepBy`


I'm trying to parse a string using megaparsec.

Part of the input is a sequence of items separated by a separator, and I'm using sepBy for this. Consider, for example:

sepBy (char 'a') (char 's')

This correctly parses "", "a", "asa", ... The problem appears when I need to continue with another parser that starts with my separator, as in:

(,) <$> sepBy (char 'a') (char 's') <*> string "something"

If I try to parse the string "asasomething" with this parser, I'd expect to get ("aa", "something"). Instead I get an error, because after the second s the parser expects an a and finds an o.

I also tried sepEndBy, but the parse still fails.


Solution

  • I solved it as follows.

    The implementation of sepBy used by megaparsec is

    sepBy :: MonadPlus m => m a -> m sep -> m [a]
    sepBy p sep = do
      r <- C.optional p
      case r of
        Nothing -> return []
        Just  x -> (x:) <$> many (sep >> p)
    

    I modified it to

    sepBy :: Parser a -> Parser sep -> Parser [a]
    sepBy p sep = do
      r <- optional p
      case r of
        Nothing -> return []
        Just  x -> (x:) <$> many (try $ sep >> p)
    

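    This specialises the type to my Parser and adds a try, so that when sep succeeds but p then fails, the separator is not consumed and the next parser can still match it.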
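
For anyone who wants to check this against the input from the question, here is a minimal self-contained sketch. The `Parser` alias (`Parsec Void String`) is an assumption, since the answer does not show how it is defined, and the names `sepByTry`, `original` and `fixed` are made up here for illustration.

```haskell
import Data.Void (Void)
import Text.Megaparsec (Parsec, many, optional, parseMaybe, sepBy, try)
import Text.Megaparsec.Char (char, string)

-- Assumed alias: the answer does not show how Parser is defined.
type Parser = Parsec Void String

-- The answer's modified combinator, renamed (hypothetically) to sepByTry
-- so that it does not clash with the library's sepBy.
sepByTry :: Parser a -> Parser sep -> Parser [a]
sepByTry p sep = do
  r <- optional p
  case r of
    Nothing -> return []
    -- try rewinds the input if sep succeeds but p then fails,
    -- so a separator that belongs to the next parser is left unconsumed.
    Just x  -> (x :) <$> many (try (sep >> p))

-- The parser from the question, built on the library's sepBy.
original :: Parser (String, String)
original = (,) <$> sepBy (char 'a') (char 's') <*> string "something"

-- The same parser built on sepByTry.
fixed :: Parser (String, String)
fixed = (,) <$> sepByTry (char 'a') (char 's') <*> string "something"

main :: IO ()
main = do
  print (parseMaybe original "asasomething")  -- Nothing
  print (parseMaybe fixed "asasomething")     -- Just ("aa","something")
  print (parseMaybe fixed "something")        -- Just ("","something")
```

Note that `parseMaybe` only succeeds if the whole input is consumed, which is why it is a convenient check here. The cost of the `try` is that a failure inside `sep >> p` is reported at the position before the separator, so error messages for genuinely malformed items may be less precise.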