I've just started learning about parsing, and I wrote this simple parser in Haskell (using parsec) to read JSON and construct a simple tree for it. I am using the grammar in RFC 4627.
However, when I try parsing the string {"x":1 }, I'm getting the output:
parse error at (line 1, column 8): unexpected "}" expecting whitespace character or ","
This only seems to happen when I have spaces before a closing bracket (]) or brace (}).
What have I done wrong? If I avoid whitespace before a closing symbol, it works perfectly.
Parsec doesn't rewind or backtrack automatically. When you write sepBy member valueSeparator, the valueSeparator consumes white space, so the parser will parse your value like so:
{"x":1 }
[-------   object
%          beginObject
 [-]       name
    %      nameSeparator
     %     jvalue
      [-   valueSeparator
       X   In valueSeparator: unexpected "}"
Legend:
[--] full match
% full char match
[-- incomplete match
X incomplete char match
When the valueSeparator fails, Parsec won't go back and try a different combination of parses, because valueSeparator has already consumed one character (the space) by the time it fails.
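The failure can be reproduced in a few lines. This is only a sketch: the question's actual parser isn't shown here, so ws, tok and digits below are hypothetical stand-ins, with tok assumed to skip white space on both sides of the character and a single digit standing in for a JSON value.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical white space parser (the four characters RFC 4627 allows).
ws :: Parser ()
ws = skipMany (oneOf " \t\r\n")

-- Assumed shape of the problematic token parser:
-- it skips white space BEFORE and after the character.
tok :: Char -> Parser Char
tok c = ws *> char c <* ws

valueSeparator :: Parser Char
valueSeparator = tok ','

-- A brace-delimited, comma-separated list of digits (stand-in for an object).
digits :: Parser String
digits = tok '{' *> (digit `sepBy` valueSeparator) <* tok '}'

-- valueSeparator eats the space, then fails on '}' with input consumed,
-- so sepBy reports an error instead of ending the list.
failing :: Either ParseError String
failing = parse digits "" "{1 }"

-- Without the space, valueSeparator fails without consuming anything,
-- so sepBy simply stops and the closing brace is parsed.
working :: Either ParseError String
working = parse digits "" "{1}"
```

Loading this in GHCi and evaluating failing and working should show a Left with an unexpected "}" error for the first and a Right for the second.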
You have two options to solve your problem:
1. tok should only consume white space after the char, so its definition is tok c = char c *> ws ((*>) from Control.Applicative); apply the same rule to all the other parsers. Since you'll never consume white space after having entered the "wrong parser" that way, you won't end up having to backtrack.
2. Put try in front of parsers that might consume more than one character, and that should rewind their input if they fail.
EDIT: updated ASCII graphic to make more sense.
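Both options can be sketched on a cut-down grammar. The names below (ws, tok, tokBoth, digitsTrailing, digitsTry) are hypothetical and a single digit stands in for a JSON value; this is not the parser from the question, just an illustration of the two approaches under those assumptions.

```haskell
import Text.Parsec
import Text.Parsec.String (Parser)

-- Hypothetical white space parser.
ws :: Parser ()
ws = skipMany (oneOf " \t\r\n")

-- Option 1: every parser consumes only TRAILING white space.
-- With (*>) the matched character is discarded and ws's () is returned.
tok :: Char -> Parser ()
tok c = char c *> ws

-- The value parser follows the same rule; leading white space is
-- skipped once, at the very start of the input.
digitsTrailing :: Parser String
digitsTrailing =
  ws *> tok '{' *> ((digit <* ws) `sepBy` tok ',') <* tok '}' <* eof

-- Option 2: keep white space on both sides, but wrap the separator in try
-- so that it rewinds the input when it fails after eating white space.
tokBoth :: Char -> Parser Char
tokBoth c = ws *> char c <* ws

digitsTry :: Parser String
digitsTry = tokBoth '{' *> (digit `sepBy` try (tokBoth ',')) <* tokBoth '}'

-- Sample inputs with a space before the closing brace.
inputs :: [String]
inputs = ["{1 }", "{1 , 2 }"]

resultsTrailing, resultsTry :: [Either ParseError String]
resultsTrailing = map (parse digitsTrailing "") inputs
resultsTry      = map (parse digitsTry "") inputs
```

Option 1 tends to be the cheaper choice, since no input ever has to be re-read; try is convenient when restructuring the grammar is not practical, at the cost of re-scanning the rewound input.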