Is there a way to do proper case folding with Parsec (say I want a parser that behaves like stringCI from Data.Attoparsec.Text)?
The code that does case-insensitive parsing in Text.Parsec.Token just uses char (toLower c) <|> char (toUpper c), with no proper case folding, so I'm puzzled whether this is possible at all.
Parsec doesn't have any built-in functionality for this, but you could implement it with, e.g., foldCase from the case-insensitive package and satisfy in a loop. I'm not a Unicode expert, so I'm not sure what extra precautions you'd have to take to ensure correctness.
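Something along these lines might work as a starting point. It is only a rough sketch under a few assumptions: the name stringCI is borrowed from attoparsec and is not part of Parsec, the input is a plain String, and I use Data.Text.toCaseFold from the text package directly (as far as I know that is what foldCase does for Text) with anyChar plus an explicit check standing in for satisfy. The idea is to fold the expected string once, then fold each input character and strip the result off the front of what is still expected.

```haskell
import Data.List (stripPrefix)
import qualified Data.Text as T
import Text.Parsec
import Text.Parsec.String (Parser)

-- Full Unicode case folding for a String, via Data.Text.toCaseFold.
foldStr :: String -> String
foldStr = T.unpack . T.toCaseFold . T.pack

-- Hypothetical helper (not part of Parsec): match `expected` up to case
-- folding and return the input characters that were actually consumed.
stringCI :: String -> Parser String
stringCI expected = try (go (foldStr expected) []) <?> show expected
  where
    -- Nothing left to match: return the consumed input in original order.
    go [] acc = return (reverse acc)
    go remaining acc = do
      c <- anyChar
      -- One input character may fold to several characters.
      case stripPrefix (foldStr [c]) remaining of
        Just rest -> go rest (c : acc)
        Nothing   -> unexpected (show c)
```

With this, parse (stringCI "Straße") "" "STRASSE" should succeed, because both sides fold to the same string. The whole thing is wrapped in try so that a failed match consumes no input. It does no normalization and is not locale-aware, so treat it as a sketch rather than a correct-for-all-of-Unicode solution.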
The documentation of foldCase recommends the text-icu package if you need locale-sensitive conversions; it seems to be pretty comprehensive.