Search code examples
f#fparsec

Trying to write an FParsec parser for a grammar with parens and function calls


I'm trying to write a parser in FParsec that can parse a grammar that supports brackets (parens) for both term grouping and function calls (much like most programming languages do today) but I've got stuck.

Here's an example of what I'd like to parse:

(A & B) | F(C)

and the expected output is

Or (And (Identifier "A", Identifier "B"), Call ("F", [Identifier "C"]))

I've written the code below that makes use of an OperatorPrecedenceParser:

type Expression =
    | Identifier of string
    | Call of Expression * Expression list
    | And of Expression * Expression
    | Or of Expression * Expression

let identifier = many1Satisfy isLetter

let expr, exprRef = createParserForwardedToRef()

let callExpr = 
    pipe2 
        (identifier .>> spaces .>> pchar '(' .>> spaces) 
        (sepBy expr (pchar ',' .>> spaces)) 
        (fun name args -> Call(name, args))

let atomExpr = 
    choice [
        callExpr;
        identifier |>> Identifier;
        between (pchar '(' .>> spaces) (pchar ')' .>> spaces) expr
    ]

let opp = new OperatorPrecedenceParser<Expression, unit, unit>()

opp.AddOperator(InfixOperator("|", spaces, 10, Associativity.Left, fun l r -> Or(l, r)))
opp.AddOperator(InfixOperator("&", spaces, 20, Associativity.Left, fun l r -> And(l, r)))

exprRef.Value <- attempt atomExpr <|> opp.ExpressionParser

run expr "(A & B) | F(C)"
|> printfn "%A"

but when I run it I get a NullReferenceException! Dotnet fiddle here: https://dotnetfiddle.net/ARzBCT

I suspect this has something to do with the forwarded ref (necessary because of the recursive grammar), but I'm not really sure where I've gone wrong; I'm reasonably experienced in F# but very new to the world of parsing.

PS: A more complicated example might be:

(A | B)(C & D(E, F))

Call (
    Or (Identifier "A", Identifier "B"), 
    [
        And (
            Identifier "C", 
            Call (Identifier "D", [Identifier "E", Identifier "F"])
        )
    ]
)

Solution

  • In order to use OperatorPrecedenceParser, you must distinguish between "terms" and "expressions". Terms are the things that are separated by operators. You then must set the OPP's TermParser member, which is null by default (hence the error message you're seeing). This article contains more information.

    For example, you have a parser called atomExpr that seems to correspond to a term, so you could do this:

    let opp = new OperatorPrecedenceParser<Expression, unit, unit>(TermParser = atomExpr)
    

    When I try that, your parser no longer crashes, although it still doesn't successfully parse the input. I haven't tried to debug your code beyond that.

    Side note: I would also consider using FParsec's chainl and chainr parsers, rather than OperatorPrecedenceParser.