Recursive grammars in FParsec


I've decided to check out FParsec, and tried to write a parser for λ expressions. As it turns out, eagerness makes recursive parsing difficult. How can I solve this?

Code:

open FParsec

type λExpr =
    | Variable of char
    | Application of λExpr * λExpr
    | Lambda of char * λExpr

let rec FV = function
    | Variable v -> Set.singleton v
    | Application (f, x) -> FV f + FV x
    | Lambda (x, m) -> FV m - Set.singleton x

let Λ0 = FV >> (=) Set.empty

let apply f p =
    parse
        { let! v = p
          return f v }

let λ e =

    let expr, exprR = createParserForwardedToRef()

    let var = lower |> apply Variable

    let app = tuple2 expr expr
                 |> apply Application

    let lam = pipe2 (pchar 'λ' >>. many lower)
                        (pchar '.' >>. expr) (fun vs e ->
                                                List.foldBack (fun c e -> Lambda (c, e)) vs e)

    exprR := choice [
                    lam
                    app
                    var
                    (pchar '(' >>. expr .>> pchar ')')
                    ]

    run expr e

Thanks!


Solution

  • As you pointed out, the problem is that your parser for application calls expr recursively before it has consumed any input, so the recursion never terminates. The parser needs to be written so that it always consumes some input first and only then decides what to do.

    In the case of lambda calculus, the tricky part is telling an application apart from a variable: given input like x..., the first character could start either one.

    You can merge the rules for application and variable like this:

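    let rec varApp = parse {
      // Always consume a variable first, so the parser makes progress
      let! first = lower |> apply Variable
      let! res =
        // If an expression follows, it is an application; otherwise just the variable
        choice [ expr |> apply (fun e -> Application(first, e))
                 parse { return first } ]
      return res }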
    

    This first parses a variable and then either parses another expression (in which case it is an application) or it just returns the variable (if there is no expression following the variable). The rest of the rules are similar:

    and lam = 
      pipe2 (pchar 'λ' >>. many lower)
            (pchar '.' >>. expr) (fun vs e ->
        List.foldBack (fun c e -> Lambda (c, e)) vs e)
    and brac = pchar '(' >>. expr .>> pchar ')'
    and expr = parse.Delay(fun () ->
      choice 
        [ lam; varApp; brac ])
    

    I avoided the explicit mutation of createParserForwardedToRef by using parse.Delay(), which makes it possible to create recursive value references. In principle, expr could also be written as:

    and expr = parse {
      return! choice [ lam; varApp; brac ] }
    

    ...but for some reason, FParsec doesn't implement the ReturnFrom member that is needed if you want to use return! in computation expressions.
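
For comparison, here is a minimal self-contained sketch of the same idea (every alternative consumes input before recursing), written with createParserForwardedToRef as in the question. It is one possible arrangement, not the only one. Several things here are assumptions made for illustration: the type is renamed to LambdaExpr, asciiLower is swapped in for lower so that 'λ' itself cannot be read as a variable name, many1 is used for the binder list, and parseLambda is a hypothetical helper. Unlike the merged varApp rule above, which nests applications to the right, this version collects a run of atoms and folds them to the left.

```fsharp
open FParsec

type LambdaExpr =
    | Variable of char
    | Application of LambdaExpr * LambdaExpr
    | Lambda of char * LambdaExpr

// Forward reference: lets the rules below mention expr before it is defined.
let expr, exprRef = createParserForwardedToRef<LambdaExpr, unit> ()

// asciiLower (an assumption, swapped in for lower) keeps 'λ' out of variable names.
let var : Parser<LambdaExpr, unit> = asciiLower |>> Variable

// A parenthesised expression consumes '(' before recursing into expr.
let parens = between (pchar '(') (pchar ')') expr

// λxy.body is sugar for λx.λy.body; 'λ' is consumed before recursing.
let lam =
    pipe2 (pchar 'λ' >>. many1 asciiLower) (pchar '.' >>. expr)
          (fun vs body -> List.foldBack (fun v e -> Lambda (v, e)) vs body)

// Every atom consumes at least one character, so there is no left recursion.
let atom = choice [ lam; var; parens ]

// An expression is one or more atoms, folded into left-nested applications.
exprRef.Value <- many1 atom |>> List.reduce (fun f x -> Application (f, x))

// Hypothetical helper: parse a whole string and return a Result.
let parseLambda (s: string) =
    match run (expr .>> eof) s with
    | Success (result, _, _) -> Result.Ok result
    | Failure (message, _, _) -> Result.Error message
```

With these definitions, parseLambda "xyz" should yield Application (Application (Variable 'x', Variable 'y'), Variable 'z'), and an input such as "(λx.x)y" is accepted as an application of the parenthesised lambda to y.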