I'm having trouble using Megaparsec 6's makeExprParser
helper. I can't seem to figure out how to bind both binary ^
and unary -
at the precedence levels I'd expect.
Using this makeExprParser
expression parser:
expressionParser :: Parser Expression
expressionParser =
makeExprParser termParser
[
[InfixR $ BinOp BinaryExp <$ symbol "^"],
[
Prefix $ MonOp MonoMinus <$ symbol "-",
Prefix $ MonOp MonoPlus <$ symbol "+"
],
[
InfixL $ BinOp BinaryMult <$ symbol "*",
InfixL $ BinOp BinaryDiv <$ symbol "/"
],
[
InfixL $ BinOp BinaryPlus <$ symbol "+",
InfixL $ BinOp BinaryMinus <$ symbol "-"
]
]
I would expect these tests to pass:
testEqual expressionParser "1^2" "(1)^(2)"
testEqual expressionParser "-1^2" "-(1^2)"
testEqual expressionParser "1^-2" "1^(-2)"
testEqual expressionParser "-1^-2" "-(1^(-2))"
That is, -1^-2
should parse as the same thing as -(1^(-2))
. This is how e.g. Python parses it:
>>> 2**-2
0.25
>>> -2**-2
-0.25
>>> -2**2
-4
and Ruby:
irb(main):004:0> 2**-2
=> (1/4)
irb(main):005:0> -2**-2
=> (-1/4)
irb(main):006:0> -2**2
=> -4
But this Megaparsec parser instead fails to parse 1^-2
at all, instead giving me the helpful error:
(TrivialError (SourcePos {sourceName = \"test.txt\", sourceLine = Pos 1, sourceColumn = Pos 3} :| []) (Just (Tokens ('-' :| \"\"))) (fromList [Tokens ('(' :| \"\"),Label ('i' :| \"nteger\")]))")
which I read to say "I could have taken any of these characters here, but that -
has me flummoxed".
If I adjust some of the precedence of the operator table like this (moving the exponent after the unary -):
expressionParser =
makeExprParser termParser
[
[
Prefix $ MonOp MonoMinus <$ symbol "-",
Prefix $ MonOp MonoPlus <$ symbol "+"
],
[InfixR $ BinOp BinaryExp <$ symbol "^"],
[
InfixL $ BinOp BinaryMult <$ symbol "*",
InfixL $ BinOp BinaryDiv <$ symbol "/"
],
[
InfixL $ BinOp BinaryPlus <$ symbol "+",
InfixL $ BinOp BinaryMinus <$ symbol "-"
]
]
then I no longer get a parse failure, but -1^2
incorrectly parses as (-1)^2
(instead of the correct -(1^2)
).
Here is a complete self-contained parser to show the problem (it requires HUnit and of course megaparsec):
module Hascas.Minimal where
import Data.Void (Void)
import Test.HUnit hiding (test)
import Text.Megaparsec hiding (ParseError)
import Text.Megaparsec.Char
import Text.Megaparsec.Expr
import qualified Text.Megaparsec as MP
import qualified Text.Megaparsec.Char.Lexer as L
data Expression
= Literal Integer
| MonOp MonoOperator Expression
| BinOp BinaryOperator Expression Expression
deriving (Read, Show, Eq, Ord)
data BinaryOperator
= BinaryPlus
| BinaryMinus
| BinaryDiv
| BinaryMult
| BinaryExp
deriving (Read, Show, Eq, Ord)
data MonoOperator
= MonoPlus
| MonoMinus
deriving (Read, Show, Eq, Ord)
type Parser a = Parsec Void String a
type ParseError = MP.ParseError (Token String) Void
spaceConsumer :: Parser ()
spaceConsumer = L.space space1 lineComment blockComment
where
lineComment = L.skipLineComment "//"
blockComment = L.skipBlockComment "/*" "*/"
lexeme :: Parser a -> Parser a
lexeme = L.lexeme spaceConsumer
symbol :: String -> Parser String
symbol = L.symbol spaceConsumer
expressionParser :: Parser Expression
expressionParser =
makeExprParser termParser
[
[InfixR $ BinOp BinaryExp <$ symbol "^"],
[
Prefix $ MonOp MonoMinus <$ symbol "-",
Prefix $ MonOp MonoPlus <$ symbol "+"
],
[
InfixL $ BinOp BinaryMult <$ symbol "*",
InfixL $ BinOp BinaryDiv <$ symbol "/"
],
[
InfixL $ BinOp BinaryPlus <$ symbol "+",
InfixL $ BinOp BinaryMinus <$ symbol "-"
]
]
termParser :: Parser Expression
termParser = (
(try $ Literal <$> L.decimal)
<|> (try $ parens expressionParser))
parens :: Parser a -> Parser a
parens x = between (symbol "(") (symbol ")") x
main :: IO ()
main = do
-- just to show that it does work in the + case:
test expressionParser "1+(-2)" $
BinOp BinaryPlus (Literal 1) (MonOp MonoMinus $ Literal 2)
test expressionParser "1+-2" $
BinOp BinaryPlus (Literal 1 ) (MonOp MonoMinus $ Literal 2)
-- but not in the ^ case
test expressionParser "1^-2" $
BinOp BinaryExp (Literal 1) (MonOp MonoMinus $ Literal 2)
test expressionParser "-1^2" $
MonOp MonoMinus $ BinOp BinaryExp (Literal 1) (Literal 2)
test expressionParser "-1^-2" $
MonOp MonoMinus $ BinOp BinaryExp (Literal 1) (MonOp MonoMinus $ Literal 2)
-- exponent precedence is weird
testEqual expressionParser "1^2" "(1)^(2)"
testEqual expressionParser "-1^2" "-(1^2)"
testEqual expressionParser "1^-2" "1^(-2)"
testEqual expressionParser "-1^-2" "-(1^(-2))"
testEqual expressionParser "1^2^3^4" "1^(2^(3^(4))))"
where
test :: (Eq a, Show a) => Parser a -> String -> a -> IO ()
test parser input expected = do
assertEqual input (Right expected) $ parse (spaceConsumer >> parser <* eof) "test.txt" input
testEqual :: (Eq a, Show a) => Parser a -> String -> String -> IO ()
testEqual parser input expected = do
assertEqual input (p expected) (p input)
where
p i = parse (spaceConsumer >> parser <* eof) "test.txt" i
Is it possible to get Megaparsec to parse these operators at the precedence levels that other languages do?
makeExprParser termParser [precN, ..., prec1]
will produce an expression parser that works in such a way that each level of precedence invokes the next-higher level of precedence. So if you'd manually define it, you'd have a rule for infix +
and -
, which uses the mult-and-div rule as the operands. That in turn would use the prefix rule as the operands and that would use the ^
rule as the operand. Finally the ^
rule uses termParser
for the operands.
The important thing to note here is that the ^
rule (or more generally: any rule with a higher precedence than the prefix operators) invokes a parser that won't accept prefix operators at the beginning. So prefix operators can't appear on the right of such operators (except inside parentheses).
What this basically means is that your use case is not supported by makeExprParser
.
To work around this, you can use makeExprParser
to only handle the infix operators with a lower precedence than the prefix operators and then handle the prefix operators and ^
manually, so that the right operand of ^
would "loop back" to the prefix operators. Something like this:
expressionParser =
makeExprParser prefixParser
[
[
InfixL $ BinOp BinaryMult <$ symbol "*",
InfixL $ BinOp BinaryDiv <$ symbol "/"
],
[
InfixL $ BinOp BinaryPlus <$ symbol "+",
InfixL $ BinOp BinaryMinus <$ symbol "-"
]
]
prefixParser =
do
prefixOps <- many prefixOp
exp <- exponentiationParser
return $ foldr ($) exp prefixOps
where
prefixOp = MonOp MonoMinus <$ symbol "-" <|> MonOp MonoPlus <$ symbol "+"
exponentiationParser =
do
lhs <- termParser
-- Loop back up to prefix instead of going down to term
rhs <- optional (symbol "^" >> prefixParser)
return $ maybe lhs (BinOp BinaryExp lhs) rhs
Note that, unlike makeExprParser
, this also allows multiple consecutive prefix operators (like --x
for double negation). If you don't want that, replace many
with optional
in the definition of prefixParser
.