
Parsec: grabbing raw source after parsing


I have a strange whim. Suppose I have something like this:

data Statement = StatementType Stuff Source

Now I want to parse such a statement, parse all the stuff, and then put all the characters that were consumed for this particular statement into the resulting data structure. For some reason.

Is this possible, and if so, how can I accomplish it?


Solution

  • In general this is not possible. Parsec does not expect a lot from its stream type; in particular, there is no way to efficiently split a stream.

    But for a concrete stream type (e.g. String, or [a], or ByteString) a hack like this would work (shown here for list streams):

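    import Text.Parsec

    -- Run p and also return the part of the input that p consumed.
    parseWithSource :: Parsec [c] u a -> Parsec [c] u ([c], a)
    parseWithSource p = do
        input <- getInput      -- remaining input before running p
        a <- p
        input' <- getInput     -- remaining input after running p
        -- the consumed prefix is as long as the difference in lengths
        return (take (length input - length input') input, a)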
    

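    This solution relies on the getInput function, which returns the current (remaining) input. We call it twice, before and after running the parser. The difference in length is the exact number of consumed elements, and we take that many elements from the original input. Note that length walks the whole remaining list, so each call costs time linear in the unparsed input.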
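
For a strict ByteString stream, the same idea might look like the sketch below. The name parseWithSourceBS is made up for illustration, and the sketch assumes parsec's Text.Parsec.ByteString module together with the bytestring package. On a strict ByteString, B.length and B.take run in constant time, so the slicing itself is cheap.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Parsec
import Text.Parsec.ByteString (Parser)

-- Hypothetical ByteString counterpart of the list-based helper:
-- run p and also return the slice of input it consumed.
parseWithSourceBS :: Parser a -> Parser (B.ByteString, a)
parseWithSourceBS p = do
    input  <- getInput                 -- remaining input before p
    a      <- p
    input' <- getInput                 -- remaining input after p
    let consumed = B.length input - B.length input'
    return (B.take consumed input, a)

main :: IO ()
main = print (parse (char 'x' *> parseWithSourceBS (many1 digit) <* char 'x')
                    "" (B.pack "x1234x"))
-- prints: Right ("1234","1234")
```

The session below uses the list version, parseWithSource.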

    Here you can see parseWithSource in action:

    *Main Text.Parsec> parseTest (between (char 'x') (char 'x') (parseWithSource ((read :: String -> Int) `fmap` many1 digit))) "x1234x"
    ("1234",1234)
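
To connect this back to the Statement type from the question, here is a minimal sketch. The question does not say what Stuff and Source are, so the definitions below (a list of Ints and a String) and the toy grammar (comma-separated integers ending in a semicolon) are assumptions made purely for illustration.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical stand-ins for the question's Stuff and Source types.
type Stuff  = [Int]
type Source = String

data Statement = StatementType Stuff Source
    deriving (Show, Eq)

-- The helper from the answer, repeated so this block is self-contained.
parseWithSource :: Parsec [c] u a -> Parsec [c] u ([c], a)
parseWithSource p = do
    input  <- getInput
    a      <- p
    input' <- getInput
    return (take (length input - length input') input, a)

-- Assumed toy grammar: integers separated by ',' and closed by ';'.
stuff :: Parser Stuff
stuff = ((read <$> many1 digit) `sepBy1` char ',') <* char ';'

-- Parse the stuff and store the raw text it was parsed from.
statement :: Parser Statement
statement = do
    (src, s) <- parseWithSource stuff
    return (StatementType s src)

main :: IO ()
main = print (parse (many1 (statement <* spaces)) "" "1,2; 30;")
-- prints: Right [StatementType [1,2] "1,2;",StatementType [30] "30;"]
```

Because spaces runs outside parseWithSource, the whitespace between statements does not end up in the stored source.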
    

    But you should also look into attoparsec, as it properly supports this functionality with the match function.
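
As a rough sketch of the attoparsec route, assuming the attoparsec and text packages and the Data.Attoparsec.Text flavour of the library: match runs a parser and returns the consumed input together with the parser's result. The name numberWithSource is made up for this example.

```haskell
import Data.Attoparsec.Text (Parser, char, decimal, match, parseOnly)
import qualified Data.Text as T

-- Hypothetical example parser: a decimal number paired with the
-- exact text it was parsed from.
numberWithSource :: Parser (T.Text, Int)
numberWithSource = match decimal

main :: IO ()
main = print (parseOnly (char 'x' *> numberWithSource <* char 'x')
                        (T.pack "x1234x"))
-- prints: Right ("1234",1234)
```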