Tags: parsing, haskell, error-handling, parsec, parse-error

How to return multiple parse failures within Parsec's monadic context?


I am parsing a grammar that consists of exactly two required and unique logical parts, Alpha and Beta. These parts can be defined in either order, Alpha before Beta or vice versa. I would like to provide robust error messages for less tech-savvy users.

In the example below there are cases where multiple parse failures exist. I concatenate the failure message Strings with the unlines function and pass the resulting concatenation into the fail combinator. This creates a ParseError value with a single Message value when parse is called on grammarDefinition.

Example Scenario:

import Data.Either                   (partitionEithers)
import Data.Set                      (Set)
import Text.Parsec                   (Parsec)
import Text.Parsec.Char
import Text.ParserCombinators.Parsec

data Result = Result Alpha Beta
type Alpha  = Set (Int,Float)
type Beta   = Set String

grammarDefinition :: Parsec String u Result
grammarDefinition = do
    segments <- partitionEithers <$> many segment
    _        <- eof
    case segments of
      (     [],      []) -> fail $ unlines [missingAlpha, missingBeta]
      (      _,      []) -> fail $ missingBeta
      (     [],       _) -> fail $ missingAlpha
      ((_:_:_), (_:_:_)) -> fail $ unlines [multipleAlpha, multipleBeta]
      (      _, (_:_:_)) -> fail $ multipleBeta
      ((_:_:_),       _) -> fail $ multipleAlpha
      (    [x],     [y]) -> pure $ Result x y
    where
      missingAlpha     = message "No" "alpha"
      missingBeta      = message "No" "beta"
      multipleAlpha    = message "Multiple" "alpha"
      multipleBeta     = message "Multiple" "beta"
      message x y      = concat [x," ",y," defined in input, ","exactly one ",y," definition required"]

-- Type signature is important!
segment :: Parsec String u (Either Alpha Beta)
segment = undefined -- implementation irrelevant

I would like the ParseError to contain multiple Message values when there are multiple failures. This should be possible, given that the addErrorMessage function exists. I am not sure how to supply multiple failures within the Parsec monadic context, before the result is materialized by calling parse.

Example Function:

fails :: [String] -> ParsecT s u m a
fails = undefined -- Not sure how to define this!

How do I supply multiple Message values to the ParseError result within Parsec's monadic context?


Solution

  • I would recommend transitioning from Parsec to the newer and more extensible Megaparsec library.

    Megaparsec has supported this exact use case since version 4.2.0.0.

    Multiple parse error Messages can be created with the following function:

    -- Written against the Megaparsec 4.x API, where the class is MonadParsec s m t.
    fails :: MonadParsec s m t => [String] -> m a
    fails = failure . fmap Message
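If staying on Parsec is preferred, the following is a minimal sketch of the fails function from the question, not an official Parsec idiom. It relies on the fact that when two alternatives both fail without consuming input, (<|>) merges their errors, so chaining one parserFail per message should leave one Message value per failure in the resulting ParseError. The helper failureMessages is a hypothetical name introduced here only to read the messages back out.

```haskell
import Control.Monad     (mzero)
import Text.Parsec       (ParseError, ParsecT, (<|>))
import Text.Parsec.Error (Message (..), errorMessages)
import Text.Parsec.Prim  (parserFail)

-- Fail with one Message per string. Each parserFail fails without
-- consuming input, so (<|>) tries the next alternative and merges the
-- errors, which all sit at the same source position.
-- An empty list degenerates to mzero (an "unknown" parse error).
fails :: [String] -> ParsecT s u m a
fails msgs = foldr (<|>) mzero (map parserFail msgs)

-- Hypothetical helper: collect the text of every Message-constructor
-- entry in a ParseError, ignoring Expect/UnExpect entries.
failureMessages :: ParseError -> [String]
failureMessages err = [s | Message s <- errorMessages err]
```

In grammarDefinition, a branch such as fail $ unlines [missingAlpha, missingBeta] would then become fails [missingAlpha, missingBeta]. Two caveats apply: the ParseError may also carry Expect or UnExpect entries merged in from earlier parsers at the same position, and the default rendering of a ParseError may not print one message per line, so a custom renderer over errorMessages is likely still needed for user-facing output.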