Search code examples
haskellparsec

Haskell parsing using an accumulator with Parsec


Haskell beginner here.

Say that I have a parser that I supply with some information and it returns the results of parsing a section and the info needed for the next section.

readSection :: Info -> Parser (Section, Info)

I was hoping to be able to parse multiple sections using the many combinator, but I can't get the types to work. I need some way of having a parser take in the Info of the previous computation. This structure reminds me of a fold (since the previous results can just be stored in the accumulator), but I'm not sure how to apply that here.

readSections :: Info -> Parser [Section]
readSections startInfo = do
  let parser = readSection startInfo
  -- ???
  return $ many readSection

Maybe there's a better way of doing something like this, so any guidance would be greatly appreciated. Thanks!


Solution

  • If you're looking for an existing combinator, this looks like an application of unfoldrM from monad-loops (among other places). To avoid having to import it, it can be defined as:

    unfoldrM :: Monad m => (a -> m (Maybe (b, a))) -> a -> m [b]
    unfoldrM f a = do
      r <- f a
      case r of
        Nothing -> return []
        Just (b, a') -> (b:) <$> unfoldrM f a'
    

    Basically, it takes an initial seed of type a and uses it to monadically generate values and new seeds. Your seed is your info: you use the info to monadically generate a parsed section plus a new version of the info.

    You just have to stick in an optionMaybe to provide the Just/Nothing switch so unfoldrM knows when it's reached the end of all the sections:

    readSections = unfoldrM (optionMaybe . readSection)
    

    However, as a Haskell beginner, it may be helpful to see how you do this sort of thing from scratch. The many combinator isn't magical. It's basically equivalent to the relatively simple monadic computation:

    many p = (do x <- p
                 xs <- many p
                 return (x:xs))
             <|> return []
    

    So, you can write readSections from scratch as a monadic computation in a similar manner:

    readSections' :: Info -> Parser [Section]
    readSections' i = (do -- try to read a section, getting updated info
                          (s, i') <- readSection i
                          -- read zero or more remaining sections with updated info
                          ss <- readSections' i'
                          -- return the sections
                          return (s:ss))
                      -- if no more sections to read, return an empty list
                      <|> return []