I'm having an issue trying to parse a homogeneous json-like array in FParsec. I've decomposed the problem to a short example that reproduces it.

#r @"..\packages\FParsec.1.0.2\lib\net40-client\FParsecCS.dll"
#r @"..\packages\FParsec.1.0.2\lib\net40-client\FParsec.dll"
open System
open FParsec

let test p str =
    match run p str with
    | Success(result, _, _) -> printfn "Success: %A" result
    | Failure(errormsg, _, _) -> printfn "Failure: %s" errormsg

type CValue =
    | CInt of int64
    | CBool of bool
    | CList of CValue list

let P_WHITESPACE = spaces
let P_COMMA = pstring ","
let P_L_SBRACE = pstring "[" .>> P_WHITESPACE
let P_R_SBRACE = P_WHITESPACE >>. pstring "]"
let P_INT_VALUE = pint64 |>> CInt
let P_TRUE = stringReturn "true" (CBool true)
let P_FALSE = stringReturn "false" (CBool false)
let P_BOOL_VALUE = P_TRUE <|> P_FALSE

let P_LIST_VALUE =
    let commaDelimitedList ptype = sepBy (ptype .>> P_WHITESPACE) (P_COMMA .>> P_WHITESPACE)
    let delimitedList = (commaDelimitedList P_INT_VALUE) <|> (commaDelimitedList P_BOOL_VALUE)
    let enclosedList = between P_L_SBRACE P_R_SBRACE delimitedList
    enclosedList |>> CList

When I use the test function to try it out, I get the following results:

test P_LIST_VALUE "[1,2,3]"
Success: CList [CInt 1L; CInt 2L; CInt 3L]

test P_LIST_VALUE "[true,false]"
Failure: Error in Ln: 1 Col: 2
[true,false]
 ^
Expecting: integer number (64-bit, signed) or ']'

If I swap the order of P_INT_VALUE and P_BOOL_VALUE when using the <|> operator, then [true,false] parses successfully but [1,2,3] fails with a similar error. So basically, whichever parser I put first is the only one it tries.
I understand the <|> operator won't attempt the RHS parser if the LHS changes the parser state (for example, by consuming input), but I can't see how that could be happening. P_BOOL_VALUE and P_INT_VALUE don't have any starting characters in common, so both should fail immediately when given the wrong data type: ints never start with 'false' or 'true', and bools never start with a digit.
What am I doing wrong?
Ah, I've figured it out. The hint in the error message is the "or ']'" part. The problem is that sepBy succeeds on empty input, so when it hits the 't', it returns successfully with an empty list. Because the first alternative succeeded, <|> never tries the second one, and control passes back to between, which tries and fails to find a terminating ']'.
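
The sepBy behaviour can be checked in isolation. This is a minimal sketch, assuming FParsec is already referenced as in the #r lines of the question; the names ints and ints1 are made up for illustration, and the type annotations are only there to pin the user state to unit.

```fsharp
// Assumes FParsec is already referenced (see the #r lines in the question).
open FParsec

// Same shape as the test helper from the question.
let test (p: Parser<'a, unit>) str =
    match run p str with
    | Success(result, _, _) -> printfn "Success: %A" result
    | Failure(errorMsg, _, _) -> printfn "Failure: %s" errorMsg

// sepBy accepts zero or more elements, so it succeeds with an empty list
// (consuming nothing) when the very first element fails to parse.
let ints : Parser<int64 list, unit> = sepBy pint64 (pstring ",")
test ints "true"    // Success: []

// sepBy1 requires at least one element, so the same input is a failure,
// which gives <|> the chance to try the next alternative.
let ints1 : Parser<int64 list, unit> = sepBy1 pint64 (pstring ",")
test ints1 "true"   // prints a Failure message
```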
The solution is to move the empty list case out of the int/bool-specific parsers, like this:

let P_LIST_VALUE =
    let commaDelimitedList ptype = sepBy1 (ptype .>> P_WHITESPACE) (P_COMMA .>> P_WHITESPACE)
    let delimitedList = (commaDelimitedList P_INT_VALUE) <|> (commaDelimitedList P_BOOL_VALUE) <|> preturn []
    let enclosedList = between P_L_SBRACE P_R_SBRACE delimitedList
    enclosedList |>> CList

Note the use of sepBy1 instead of sepBy, and the addition of <|> preturn [] to handle the empty case only once in delimitedList.
As a side note, I don't know your exact application, but it is generally not a good idea to enforce typing in the parser. A more common approach is to parse commaDelimitedList (P_INT_VALUE <|> P_BOOL_VALUE) (with your original commaDelimitedList) and then check the typing in a subsequent analysis phase.
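
A rough sketch of that two-phase approach, assuming the same CValue type as in the question. The names pValue, pList, isHomogeneous and check are hypothetical, and the homogeneity rule here (all elements share the same union case) is only one possible reading of "check the typing".

```fsharp
// Assumes FParsec is already referenced (see the #r lines in the question).
open FParsec

type CValue =
    | CInt of int64
    | CBool of bool
    | CList of CValue list

let ws : Parser<unit, unit> = spaces

// Parse phase: any element may be an int or a bool.
let pValue : Parser<CValue, unit> =
    (pint64 |>> CInt)
    <|> stringReturn "true" (CBool true)
    <|> stringReturn "false" (CBool false)

let pList : Parser<CValue list, unit> =
    let items = sepBy (pValue .>> ws) (pstring "," .>> ws)
    between (pstring "[" .>> ws) (ws >>. pstring "]") items

// Analysis phase (hypothetical helper): true when every element
// uses the same union case as the first one.
let isHomogeneous (values: CValue list) =
    let tag = function
        | CInt _ -> 0
        | CBool _ -> 1
        | CList _ -> 2
    match values with
    | [] -> true
    | first :: rest -> rest |> List.forall (fun v -> tag v = tag first)

let check str =
    match run pList str with
    | Success(values, _, _) when isHomogeneous values -> printfn "OK: %A" (CList values)
    | Success(_, _, _) -> printfn "Type error: mixed element types"
    | Failure(msg, _, _) -> printfn "Parse error: %s" msg

check "[true,false]"   // OK: CList [CBool true; CBool false]
check "[1,true]"       // Type error: mixed element types
```

Keeping the type check out of the grammar means a mixed list is reported as a type error rather than as a confusing parse error at the first unexpected element.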