Search code examples
haskellparsec

Parsec.Expr repeated Prefix/Postfix operator not supported


The documentation for Parsec.Expr.buildExpressionParser says:

Prefix and postfix operators of the same precedence can only occur once (i.e. --2 is not allowed if - is prefix negate).

and indeed, this is biting me, since the language I am trying to parse allows arbitrary repetition of its prefix and postfix operators (think of a C expression like **a[1][2]).

So, why does Parsec make this restriction, and how can I work around it?

I think I can move my prefix/postfix parsers down into the term parser since they have the highest precedence.

i.e.

**a + 1

is parsed as

(*(*(a)))+(1)

but what could I have done if I wanted it to parse as

*(*((a)+(1)))

if buildExpressionParser did what I want, I could simply have rearranged the order of the operators in the table.

Note See here for a better solution


Solution

  • I solved it myself by using chainl1:

    prefix  p = Prefix  . chainl1 p $ return       (.)
    postfix p = Postfix . chainl1 p $ return (flip (.))
    

    These combinators use chainl1 with an op parser that always succeeds, and simply composes the functions returned by the term parser in left-to-right or right-to-left order. These can be used in the buildExprParser table; where you would have done this:

    exprTable = [ [ Postfix subscr
                  , Postfix dot
                  ]
                , [ Prefix pos
                  , Prefix neg
                  ]
                ]
    

    you now do this:

    exprTable = [ [ postfix $ choice [ subscr
                                     , dot
                                     ]
                  ]
                , [ prefix $ choice [ pos
                                    , neg
                                    ]
                  ]
                ]
    

    in this way, buildExprParser can still be used to set operator precedence, but now only sees a single Prefix or Postfix operator at each precedence. However, that operator has the ability to slurp up as many copies of itself as it can, and return a function which makes it look as if there were only a single operator.