I am new to Haskell, and I have been trying to write a JSON parser using Parsec as an exercise. This has mostly been going well: I am able to parse lists and objects with relatively little code, and that code is readable (great!). However, for JSON I also need to parse primitives like
I was hoping to find ready-to-use parsers for things like these as part of Parsec. The closest I get is the Parsec.Token module (which defines integer and friends), but those parsers require a "language definition" that seems way beyond what I should have to write to parse something as simple as JSON; it appears to be designed for programming languages.
So my questions are:
Are the functions in Parsec.Token the right way to go here? If so, how do I make a suitable language definition?
Are "primitive" parsers for integers etc. defined somewhere else? Maybe in another package?
Am I supposed to write these kinds of low-level parsers myself? I can see myself reusing them frequently... (obscure scientific data formats etc.)
I have noticed that a question on this site says Megaparsec has these primitives included [1], but I suppose these cannot be used with parsec.
Are the functions in Parsec.Token the right way to go here?
Yes, they are. If you don't care about the minutiae specified by a language definition (i.e. you don't plan to use the parsers that depend on them, such as identifier or reserved), just use emptyDef as a default:
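import Text.Parsec
import qualified Text.Parsec.Token as P
import Text.Parsec.Language (emptyDef)
-- Token parser built from the empty language definition
lexer :: P.TokenParser ()
lexer = P.makeTokenParser emptyDef
-- Parses an optionally signed integer and skips trailing whitespace
integer :: Parsec String () Integer
integer = P.integer lexer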
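To make this a bit more concrete, here is a rough sketch of how the same lexer could cover a few JSON-like primitives. The Prim type and the names number, prim and parsePrim are made up for this illustration; only makeTokenParser, emptyDef, naturalOrFloat, stringLiteral and whiteSpace come from Parsec. Note that these token parsers follow Haskell's lexical rules (hex and octal literals are accepted, and string escapes are Haskell-style), so treat this as an approximation rather than a strict JSON grammar.

```haskell
import Text.Parsec
import Text.Parsec.Language (emptyDef)
import qualified Text.Parsec.Token as P

-- Token parser from the empty language definition, as in the answer.
lexer :: P.TokenParser ()
lexer = P.makeTokenParser emptyDef

-- Hypothetical type for this sketch only; not a full JSON model.
data Prim = PInt Integer | PDouble Double | PString String
  deriving (Show, Eq)

-- Optional leading minus, then a natural number or a float.
-- naturalOrFloat itself does not accept a sign, so it is handled here.
number :: Parsec String () Prim
number = do
  neg <- option False (True <$ char '-')
  n   <- P.naturalOrFloat lexer
  pure $ case n of
    Left i  -> PInt (if neg then negate i else i)
    Right d -> PDouble (if neg then negate d else d)

-- A number or a double-quoted string literal (Haskell-style escapes).
prim :: Parsec String () Prim
prim = number <|> (PString <$> P.stringLiteral lexer)

-- Skip leading whitespace and require that the whole input is consumed.
parsePrim :: String -> Either ParseError Prim
parsePrim = parse (P.whiteSpace lexer *> prim <* eof) "<input>"
```

With these definitions, parsePrim "42" should give Right (PInt 42) and parsePrim "\"hi\"" should give Right (PString "hi"). The token parsers skip trailing whitespace on their own, which is why only the leading whitespace needs explicit handling.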
As you noted, this feels unnecessarily clunky for your use case. It is worth mentioning that megaparsec (cf. Alec's suggestion) provides a corresponding integer parser without the ceremony. (The flip side is that megaparsec doesn't try to bake in support for e.g. reserved words, but that isn't difficult to implement in the cases you actually need it.)
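For comparison, a minimal megaparsec sketch might look like the following. It assumes a reasonably recent megaparsec release, where (as far as I know) the integer-style parsers live in Text.Megaparsec.Char.Lexer under the names decimal and signed; older releases used different module and function names. The Parser alias and signedInt are names chosen just for this example.

```haskell
import Data.Void (Void)
import Text.Megaparsec (Parsec, parseMaybe)
import qualified Text.Megaparsec.Char.Lexer as L

-- Assumed setup: no custom error component, plain String input.
type Parser = Parsec Void String

-- Optionally signed decimal integer. Passing `pure ()` as the space
-- consumer means no whitespace is allowed between the sign and the digits.
signedInt :: Parser Integer
signedInt = L.signed (pure ()) L.decimal
```

Here parseMaybe signedInt "-42" should return Just (-42), with no token parser or language definition needed.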