my Haskell application reads input as a list of ByteString and I'm using Text.Regex.Posix.ByteString.regexec to find matches. Some input has a character code 253 (it's a 1/2 symbol in one IBM PC character set) and it seems that the pattern '.' (i.e., dot, "match any character") doesn't match it. Any way to make it match ?
This works for me on a Windows Haskell install:
> length $ ((pack ['\1'..'\253']) =~ "." :: [[ByteString]])
252
I.e. dot matches all characters in range including code 253.
Note that the library calls out to the underlying posix regex matcher, typically, I assume, from glibc
.
So I would imagine any issue you have would be with that precise underlying c implementation.
Something like Text.Regex.TDFA.ByteString
might give you clearer behavior in this case, since it is all in Haskell?