This is an evolution of this question.
I need to parse with megaparsec a data structure like
data Foo =
Simple String
Dotted Foo String
Paren String Foo
and I would like to parse to it strings like
foo ::= alphanum
| foo "." alphanum
| alphanum "(" foo ")"
For example a the string "a(b.c).d"
should be parsed to Dotted (Paren "a" (Dotted (Simple "b") "c")) "d"
.
The problem I have is that this is at the same time left and right recursive.
I have no problems writing the parsers for the first and the third case:
parser :: Parser Foo
parser
= try (do
prefix <- alphanum
constant "("
content <- parser
constant ")"
pure $ Paren prefix content
)
<|> Simple alphanum
but I'm not able to put together also the parser for the second case. I tried to approach it with sepBy1
or with makeExprParser
but I couldn't get it right
To factor out the left recursion in this:
foo ::= alphanum
| foo "." alphanum
| alphanum "(" foo ")"
You can start by rewriting it to this:
foo ::= alphanum ("(" foo ")")?
| foo "." alphanum
Then you can factor out the left recursion using the standard trick of replacing:
x ::= x y | z
With:
x ::= z x'
x' ::= y x' | ∅
In other words:
x ::= z y*
With x
= foo
, y
= "." alphanum
, and z
= alphanum ("(" foo ")")?
, that becomes:
foo ::= alphanum ("(" foo ")")? ("." alphanum)*
Then I believe your parser can just be something like this, since ?
~ zero or one ~ Maybe
~ optional
and *
~ zero or more ~ []
~ many
:
parser = do
prefix <- Simple <$> alphanum
maybeParens <- optional (constant "(" *> parser <* constant ")")
suffixes <- many (constant "." *> alphanum)
let
prefix' = case maybeParens of
Just content -> Paren prefix content
Nothing -> prefix
pure $ foldl' Dotted prefix' suffixes