haskell ghc ghci parsec

How to use the latest version of the Parsec.Indent library?


This may look like a duplicate of an earlier question, but either Parsec or the Indent library has changed since 2012, and none of the old examples I have found for the indent library compile with the latest versions.

I want to write a parser for a programming language where indentation is part of the syntax (it indicates scopes). To do this I want to use the Text.Parsec.Indent library, but I am at a loss as to how to use it. It is clear to me that some modification or custom parser type is needed, but my limited knowledge of the State monad and my surface-level understanding of Parsec do not seem to be enough.

Let's say you wanted to write a parser for a simple indented list like the one below. How would one achieve this?

mylist
    fstitem
    snditem

My attempt at a simple parser (here for integers), based on some of the old examples floating around on the internet, looked like this, but it produces type errors:

import Control.Monad.State

import Text.Parsec hiding (State)
import Text.Parsec.Indent
import Text.Parsec.Pos

type IParser a = ParsecT String () (State SourcePos) a

parseInt :: IParser Integer
parseInt = read <$> many1 digit

parseIndentedInt :: IParser Integer
parseIndentedInt = indented *> parseInt

Specifically, these:

Frontend/Parser.hs:14:20: error:
    • Couldn't match type ‘Control.Monad.Trans.Reader.ReaderT
                             Text.Parsec.Indent.Internal.Indentation m0’
                     with ‘StateT SourcePos Data.Functor.Identity.Identity’
      Expected type: IParser Integer
        Actual type: ParsecT String () (IndentT m0) Integer
    • In the expression: indented *> parseInt
      In an equation for ‘parseIndentedInt’:
          parseIndentedInt = indented *> parseInt
   |
14 | parseIndentedInt = indented *> parseInt
   |                    ^^^^^^^^^^^^^^^^^^^^

Frontend/Parser.hs:14:32: error:
    • Couldn't match type ‘StateT
                             SourcePos Data.Functor.Identity.Identity’
                     with ‘Control.Monad.Trans.Reader.ReaderT
                             Text.Parsec.Indent.Internal.Indentation m0’
      Expected type: ParsecT String () (IndentT m0) Integer
        Actual type: IParser Integer
    • In the second argument of ‘(*>)’, namely ‘parseInt’
      In the expression: indented *> parseInt
      In an equation for ‘parseIndentedInt’:
          parseIndentedInt = indented *> parseInt
   |
14 | parseIndentedInt = indented *> parseInt
   |                                ^^^^^^^^
Failed, no modules loaded.

Solution

  • After some digging through the source code and the tests in the indents GitHub repository, I managed to create a working example.

    The following code can parse a simple indented list:

    import Text.Parsec        as Parsec
    import Text.Parsec.Indent as Indent
    
    data ExampleList = ExampleList String [ExampleList] 
                     deriving (Eq, Show)
    
    plistItem :: Indent.IndentParser String () String
    plistItem = Parsec.many1 Parsec.lower <* Parsec.spaces
    
    pList :: Indent.IndentParser String () ExampleList
    pList = Indent.withPos (ExampleList <$> plistItem <*> Parsec.many (Indent.indented *> pList))
    
    useParser :: Indent.IndentParser String () a -> String -> a
    useParser p src = helper res
                    where res = Indent.runIndent $ Parsec.runParserT (p <* Parsec.eof) () "<test>" src
                          helper (Left err) = error ("Parse error: " ++ show err)
                          helper (Right ok) = ok
    

    Example usage:

    *Main> useParser pList "mylist\n\tfstitem\n\tsnditem"
    ExampleList "mylist" [ExampleList "fstitem" [],ExampleList "snditem" []]
    

    Note that useParser also unwraps the Either result (calling error on a parse failure) and appends an end-of-file parser to the supplied parser. Depending on your application, you might want to change this.
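
    If you would rather handle parse failures yourself, one option is to return the Either instead of unwrapping it. The following is only a sketch: parseAll is a hypothetical name that is not part of indents or Parsec, and "<input>" is an arbitrary source name. It uses the same runIndent and runParserT calls as useParser, and repeats the definitions from above so that it compiles on its own.

    ```haskell
    import Text.Parsec        as Parsec
    import Text.Parsec.Indent as Indent

    -- Same data type and parsers as in the working example above.
    data ExampleList = ExampleList String [ExampleList]
                     deriving (Eq, Show)

    plistItem :: Indent.IndentParser String () String
    plistItem = Parsec.many1 Parsec.lower <* Parsec.spaces

    pList :: Indent.IndentParser String () ExampleList
    pList = Indent.withPos (ExampleList <$> plistItem <*> Parsec.many (Indent.indented *> pList))

    -- Hypothetical variant of useParser: it still requires the whole input
    -- to be consumed (eof), but hands the ParseError back to the caller
    -- instead of calling error.
    parseAll :: Indent.IndentParser String () a -> String -> Either Parsec.ParseError a
    parseAll p src =
      Indent.runIndent (Parsec.runParserT (p <* Parsec.eof) () "<input>" src)
    ```

    With this, parseAll pList "mylist\n\tfstitem\n\tsnditem" should give the same tree as above wrapped in Right, and a failure comes back as a Left that you can print with show or inspect further.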

    Additionally, the type signatures could be shortened with a type alias like this:

    type IParser a = Indent.IndentParser String () a
    
    plistItem :: IParser String
    pList :: IParser ExampleList
    useParser :: IParser a -> String -> a
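
    The same alias can also be applied to the integer parser from the question. Judging by the error messages there, indented runs in IndentT (a ReaderT over Indentation) rather than State SourcePos, so defining IParser via IndentParser is what makes the types line up. This is a minimal sketch under that assumption: intList and runIParser are hypothetical names of my own, and the header-plus-integers input format is just an illustration.

    ```haskell
    import Text.Parsec        as Parsec
    import Text.Parsec.Indent as Indent

    -- The alias from above, replacing the State SourcePos version in the question.
    type IParser a = Indent.IndentParser String () a

    -- One integer, followed by any trailing whitespace (including newlines).
    parseInt :: IParser Integer
    parseInt = read <$> Parsec.many1 Parsec.digit <* Parsec.spaces

    -- Hypothetical example parser: a lower-case header, then zero or more
    -- integers that are indented past the header's column.
    intList :: IParser (String, [Integer])
    intList = Indent.withPos $
      (,) <$> (Parsec.many1 Parsec.lower <* Parsec.spaces)
          <*> Parsec.many (Indent.indented *> parseInt)

    -- Hypothetical runner: consumes the whole input and returns the Either.
    runIParser :: IParser a -> String -> Either Parsec.ParseError a
    runIParser p src =
      Indent.runIndent (Parsec.runParserT (p <* Parsec.eof) () "<input>" src)
    ```

    For example, runIParser intList "mylist\n  1\n  2" should yield Right ("mylist",[1,2]).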