haskell, parsec, parser-combinators

parsec: string choice parser with useful error messages


Consider the following parser:

parser :: GenParser Char st String
parser = choice (fmap (try . string) ["head", "tail", "tales"])
             <?> "expected one of ['head', 'tail', 'tales']"

When we parse the malformed input "ta", it returns the defined error, but because of backtracking it also reports unexpected "t" at the first position instead of unexpected " " at position 3.

Is there an easy (or built-in) way of matching one of multiple expected strings that produces good error messages? By that I mean showing the correct position and, in this case, something like expected "tail" or "tales" instead of our hard-coded error message.


Solution

  • It's not hard to cook up a function that does this correctly. We'll just rip one character off at a time, using Data.Map to group the remaining suffixes by their first character:

    {-# LANGUAGE FlexibleContexts #-}
    import Control.Applicative
    import Data.Map hiding (empty)
    import Text.Parsec hiding ((<|>))
    import Text.Parsec.Char
    
    -- accept the empty string if that's a choice
    possiblyEmpty :: Stream s m Char => [String] -> ParsecT s u m String
    possiblyEmpty ss | "" `elem` ss = pure ""
                     | otherwise    = empty
    
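    -- group the candidates by first character; for each group, match that
    -- character and then recurse on the remaining suffixes
    chooseFrom :: Stream s m Char => [String] -> ParsecT s u m String
    chooseFrom ss
         =  foldrWithKey (\h ts parser -> liftA2 (:) (char h) (chooseFrom ts) <|> parser)
                         empty
                         (fromListWith (++) [(h, [t]) | h:t <- ss])
        <|> possiblyEmpty ss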
    

    We can verify in ghci that it successfully matches "tail" and "tales", and that it asks for i or l after a failed parse starting with ta:

    *Main> parse (chooseFrom ["head", "tail", "tales"]) "" "tail"
    Right "tail"
    *Main> parse (chooseFrom ["head", "tail", "tales"]) "" "tales"
    Right "tales"
    *Main> parse (chooseFrom ["head", "tail", "tales"]) "" "tafoo"
    Left (line 1, column 3):
    unexpected "f"
    expecting "i" or "l"
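
The question also asked for a message like expected "tail" or "tales" rather than single characters. One way this might be approached is to label each character branch with the full words it can still lead to. The following is only a sketch under that assumption: `chooseFromLabeled`, `go`, and `describe` are hypothetical names introduced here and are not part of the answer above or of Parsec. It relies on `<?>` replacing the expectation of a parser that fails without consuming input.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Control.Applicative (empty)
import Data.List (intercalate)
import qualified Data.Map as M
import Text.Parsec

-- Hypothetical variant of chooseFrom: each branch is labelled with the
-- complete candidate words that can still be reached through it.
chooseFromLabeled :: Stream s m Char => [String] -> ParsecT s u m String
chooseFromLabeled = go ""
  where
    -- 'prefix' is the text matched so far; 'ss' are the remaining suffixes.
    go prefix ss =
      M.foldrWithKey branch end (M.fromListWith (++) [(h, [t]) | h:t <- ss])
      where
        -- match one character (labelled with the full words), then recurse
        branch h ts rest =
          let prefix' = prefix ++ [h]
          in  ((:) <$> (char h <?> describe prefix' ts) <*> go prefix' ts) <|> rest
        -- accept the empty string if that's a choice, otherwise fail
        end | "" `elem` ss = pure ""
            | otherwise    = empty

    -- Render the words behind a branch, e.g. "\"tail\"" for prefix "tai".
    describe pre ts = intercalate " or " [show (pre ++ t) | t <- ts]
```

With this sketch, `parse (chooseFromLabeled ["head", "tail", "tales"]) "" "tafoo"` should still fail at column 3, but the expectation should now name "tail" or "tales" instead of "i" or "l". The order in which words appear inside one label follows how `fromListWith (++)` accumulates the suffixes, so it may not match the order of the input list.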