
Can my parsec parser write deprecation messages?


I have a DSL and a parser for it, written in Haskell with the Parsec package. I now want to deprecate a specific feature of the DSL. In the next release, the parser should accept both the old and the new syntax, but emit a deprecation message when it encounters the old one. I could not find out how to do this. Is it possible, and if so, how?


Solution

  • Instead of emitting messages during parsing, it would be better to return extra information at the end of parsing: whether or not deprecated syntax was encountered.

    The ParsecT type admits a type parameter for state set by the user during parsing:

    ParsecT s u m a is a parser with stream type s, user state type u, underlying monad m and return type a. Parsec is strict in the user state.

    The user state can be set with putState and modifyState. It can be obtained using getState.

    Most parsec combinators are polymorphic in the user state, and most combinators for your own DSL should be as well. Only the parsers for deprecated parts of the syntax need to fix the state type, because they set a "flag" in it.

    Something like this:

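    import Text.Parsec
    import Data.Functor.Identity (Identity)

    -- The user state is a Bool: True once deprecated syntax has been parsed.
    type Parser = ParsecT [Char] Bool Identity

    myParser :: Parser Char
    myParser =
        -- Deprecated form "ab": set the flag before finishing the branch.
        try (do _ <- char 'a'
                putState True
                char 'b')
        <|>
        -- Current form "ac": leaves the flag untouched.
        try (do _ <- char 'a'
                char 'c')

    main :: IO ()
    main = do
        -- Start with the flag set to False and return it alongside the result.
        print $ runParser ((,) <$> myParser <*> getState) False "" "ab"
        print $ runParser ((,) <$> myParser <*> getState) False "" "ac"
    -- results:
    -- Right ('b',True)
    -- Right ('c',False)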
    

    Of course, instead of a simple boolean flag, it would be better to put more information into the state, such as which deprecated construct was used and where in the input it occurred.

    Notice that state set by a sub-parser is "forgotten" if the sub-parser backtracks. That is the correct behavior for our purposes: otherwise, we would get "false positives" triggered by branches that are ultimately discarded.
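
    As a rough sketch of a richer state, the user state below is a list of warnings, each carrying the source position where the deprecated construct started. The names Deprecation, deprecated, item and parseWithWarnings are made up for this illustration, and the toy grammar ("ab" is the old form, "ac" the new one) is the same assumption as in the example above.

    ```haskell
    import Text.Parsec
    import Data.Functor.Identity (Identity)

    -- Hypothetical record describing one use of deprecated syntax.
    data Deprecation = Deprecation
        { depPos     :: SourcePos
        , depMessage :: String
        } deriving Show

    -- The user state is the list of warnings collected so far (newest first).
    type Parser = ParsecT String [Deprecation] Identity

    -- Record a warning at the current input position.
    deprecated :: String -> Parser ()
    deprecated msg = do
        pos <- getPosition
        modifyState (Deprecation pos msg :)

    -- Toy grammar: "ab" is the deprecated form, "ac" the current one.
    -- If the first branch fails, the warning it recorded is discarded
    -- together with the rest of that branch's state.
    item :: Parser Char
    item =
        try (deprecated "'ab' is deprecated, use 'ac'" *> char 'a' *> char 'b')
        <|> (char 'a' *> char 'c')

    -- Parse a sequence of items and return the warnings in input order.
    parseWithWarnings :: String -> Either ParseError (String, [Deprecation])
    parseWithWarnings = runParser p [] "<input>"
      where
        p = do
            xs <- many item <* eof
            ws <- getState
            pure (xs, reverse ws)
    ```

    With this sketch, parseWithWarnings "acabac" succeeds with a single warning whose position is line 1, column 3, because the branches that backtracked contribute nothing. The caller can then decide how to report the warnings, for example by printing them to stderr after a successful parse.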


    A common alternative to parsec is megaparsec. The latter doesn't allow for user-defined state in the parser type itself, but it can be emulated using a StateT transformer over the ParsecT type.
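
    A minimal sketch of that emulation, assuming the megaparsec and mtl packages are available and reusing the toy "ab"/"ac" grammar from above (item and parseItem are hypothetical names):

    ```haskell
    import Control.Applicative ((<|>))
    import Control.Monad.State.Strict (StateT, runStateT, put)
    import Data.Void (Void)
    import Text.Megaparsec (Parsec, ParseErrorBundle, runParser, try)
    import Text.Megaparsec.Char (char)

    -- StateT sits on top of the parser, so the Bool flag is threaded
    -- through each branch and is reset when a branch fails.
    type Parser = StateT Bool (Parsec Void String)

    -- Toy grammar: "ab" is the deprecated form, "ac" the current one.
    item :: Parser Char
    item =
        try (char 'a' *> put True *> char 'b')
        <|> (char 'a' *> char 'c')

    -- Run with the flag initially False; the final flag is returned
    -- next to the parse result.
    parseItem :: String -> Either (ParseErrorBundle String Void) (Char, Bool)
    parseItem = runParser (runStateT item False) "<input>"
    ```

    The order of the layers matters here. With StateT outside the parser, a failed alternative restarts from the state it was given, which matches the backtracking behavior described above. Putting the parser over a State monad instead would keep state changes made in discarded branches.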