Search code examples
f#fparsec

How to parse recusrive grammar in FParsec


Previous questions which I could not use to get this to work

Recursive grammars in FParsec

  • Seems to be an old question which was asked before createParserForwardedToRef was added to FParsec
  • AST doesn't seem to be as horribly recursive as mine.

Parsing in to a recursive data structure

  • Grammar relies on a special character '[' to indicate another nesting level. I don't have this luxury

I want to build a sort of Lexer and project system for a language I have found myself writing lately. The language is called q. It is a fairly simple language and has no operator precedence. For example 1*2+3 is the same as (1*(2+3)). It works a bit like a reverse polish notation calculator, evaluation is right to left.

I am having trouble expressing this in FParsec. I have put together the following simplified demo

open FParsec

type BinaryOperator = BinaryOperator of string
type Number = Number of string

type Element =
  |Number of Number
and Expression = 
  |Element of Element
  |BinaryExpression of Element * BinaryOperator * Expression

let number = regex "\d+\.?\d*" |>> Number.Number

let element = [ number ] |> choice |>> Element.Number

let binaryOperator = ["+"; "-"; "*"; "%"] |> Seq.map pstring |> choice |>> BinaryOperator

let binaryExpression expression = pipe3 element binaryOperator expression (fun l o r -> (l,o,r))
let expression =

  let exprDummy, expRef = createParserForwardedToRef()
  let elemExpr = element |>> Element
  let binExpr = binaryExpression exprDummy |>> BinaryExpression
  expRef.Value <- [binExpr; elemExpr; ] |> choice
  expRef

let statement = expression.Value .>> eof

let parseString s =
  printfn "Parsing input: '%s'" s

  match run statement s with
  | Success(result, _, _)   -> printfn "Ok: %A" result
  | Failure(errorMsg, _, _) -> printfn "Error: %A" errorMsg

//tests
parseString "1.23"
parseString "1+1"
parseString "1*2+3" // equivalent to (1*(2+3))

So far, I haven't been able to come up with a way to satisfy all 3 tests cases. In the above, it tries to parse binExpr first, realises it can't, but then must be consuming the input because it doesn't try to evaluate elemExpr next. Not sure what to do. How do I satisfy the 3 tests?


Solution

  • Meditating on Tomas' answer, I have come up with the following that works

    let expr, expRef = createParserForwardedToRef()
    let binRightExpr = binaryOperator .>>. expr
    expRef.Value <- parse{
      let! first = element
      return! choice [
          binRightExpr |>> (fun (o, r) -> (first, o, r) |> BinaryExpression)
          preturn (first |> Element)
        ]
    }
    
    let statement = expRef.Value .>> eof
    

    The reason the first parser failed is given in the FParsec docs

    The behaviour of the <|> combinator has two important characteristics:

    <|> only tries the parser on the right side if the parser on the left side fails. It does not implement a longest match rule. However, it only tries the right parser if the left parser fails without consuming input.

    Probably need to clean up a few things like the structure of the AST but I think I am good to go.