haskell, monad-transformers, parser-combinators, attoparsec, megaparsec

Can I easily wrap attoparsec in a transformer?


I want to write code that does something like C preprocessing. I looked for libraries and found two candidates: attoparsec and megaparsec.

I need error-position reporting, which megaparsec already has. But attoparsec would be preferable for performance.

If I add error-position tracking to attoparsec's Parser monad, do I have to wrap it in a StateT transformer and lift every library function I use? That seems tedious. Is there a better method?

EDIT

I will adopt megaparsec, which suits this situation. But I still want to know how to wrap attoparsec's Parser monad. Can anyone tell me whether the method I mentioned above is the best one?

I only want to know about the monad-wrapping method; in other words, whether lifting every function of the inner monad is the only solution.
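
For concreteness, the StateT approach described above might look like the following sketch. It assumes the transformers package alongside attoparsec; the names Tracked, anyWord8T, word8T, position, and runTracked are hypothetical and only illustrate the per-function lifting the question is worried about.

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State.Strict (StateT, evalStateT, get, modify')
import Data.Attoparsec.ByteString (Parser, anyWord8, parseOnly, word8)
import qualified Data.ByteString as B
import Data.Word (Word8)

-- | attoparsec's Parser with an extra Int of state: bytes consumed so far.
-- (Hypothetical name, not part of attoparsec.)
type Tracked = StateT Int Parser

-- Each primitive has to be lifted by hand and paired with a state update.
anyWord8T :: Tracked Word8
anyWord8T = lift anyWord8 <* modify' (+ 1)

word8T :: Word8 -> Tracked Word8
word8T w = lift (word8 w) <* modify' (+ 1)

-- | Read the tracked position from the StateT layer.
position :: Tracked Int
position = get

-- | Run a tracked parser, starting the counter at 0.
runTracked :: Tracked a -> B.ByteString -> Either String a
runTracked p = parseOnly (evalStateT p 0)
```

With these definitions, runTracked (word8T 46 *> word8T 46 *> position) (B.pack [46, 46]) evaluates to Right 2. Every further attoparsec function used this way would need a similar wrapper.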


Solution

  • You can get the current parse position from attoparsec without a transformer. No function for this is exported from the public modules, though; you have to define it yourself using the internal types:

    import qualified Data.Attoparsec.Internal.Types as T

    -- Succeed without consuming input, returning the current position.
    offset :: T.Parser i T.Pos
    offset = T.Parser $ \t pos more _lose succ -> succ t pos more pos
    

    Example usage (in GHCi, with OverloadedStrings enabled and Data.Attoparsec.ByteString imported):

    λ> parseOnly (many' (skipMany (word8 46) *> offset <* anyWord8)) ".a..a...a....a"
    Right [Pos {fromPos = 1},Pos {fromPos = 4},Pos {fromPos = 8},Pos {fromPos = 13}]
    

    This works as expected for incremental input, too. It only gives you the offset into the input, not (line, column), but the offset is sufficient for many applications.

    Use fromPos to get the Int from a Pos:

    λ> T.fromPos <$> parseOnly offset ""
    Right 0
    

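    Now we can use offset to create a parser that reports the current offset when it fails. Because attoparsec backtracks on failure, the reported offset is the position where the wrapped parser started, not the point inside it where it gave up.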

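    import Control.Applicative ((<|>))

    -- If p fails, fail again with a message that includes the offset.
    reportOffsetOnError :: T.Parser i a -> T.Parser i a
    reportOffsetOnError p =
      p <|> (offset >>= \pos ->
        fail ("failed at offset: " ++ show (T.fromPos pos)))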
    

    Example usage:

    λ> parseOnly (word8 46 *> word8 46 *> reportOffsetOnError (word8 97)) "..a"
    Right 97
    λ> parseOnly (word8 46 *> word8 46 *> reportOffsetOnError (word8 97)) "..b"
    Left "Failed reading: failed at offset: 2"
    

    A final note: Data.Attoparsec.Zepto does provide the ZeptoT transformer if you really need a transformer and want to stay with the attoparsec package, but this is a different parser type from the main parser in attoparsec.
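
If a (line, column) pair is needed rather than a bare offset, it can be computed after the fact from the offset and the input. The following is a minimal sketch, assuming the complete input is available as a strict ByteString, that lines end in '\n', and that columns are counted in bytes; lineCol is a hypothetical helper, not part of attoparsec.

```haskell
import qualified Data.ByteString as B

-- | Convert a 0-based byte offset into a 1-based (line, column) pair.
-- Assumes '\n' (byte 10) line endings; columns are counted in bytes.
lineCol :: B.ByteString -> Int -> (Int, Int)
lineCol input off =
  let consumed = B.take off input          -- everything before the offset
      line     = 1 + B.count 10 consumed   -- newlines seen so far, plus one
      col      = case B.elemIndexEnd 10 consumed of
                   Nothing -> B.length consumed + 1  -- still on the first line
                   Just i  -> B.length consumed - i  -- bytes since last newline
  in (line, col)
```

For the input "ab\ncd", offset 4 (the byte 'd') maps to (2, 2). Feeding it T.fromPos of a reported position gives a human-readable location without changing the parser itself.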