haskell unicode surrogate-pairs

Write surrogate pairs to file using Haskell


This is the code I have:

import qualified System.IO as IO

writeSurrogate :: IO ()
writeSurrogate = do
  IO.writeFile "/home/sibi/surrogate.txt" ['\xD800']

Executing the above code gives this error:

text-tests: /home/sibi/surrogate.txt: commitBuffer: invalid argument (invalid character)

The reason is that GHC itself rejects the character, because it is a surrogate code point: https://github.com/ghc/ghc/blob/21f0f56164f50844c2150c62f950983b2376f8b6/libraries/base/GHC/IO/Encoding/Failure.hs#L114
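To make the rejected range concrete, here is a minimal sketch of a surrogate check. It is an illustration only, not GHC's actual code, and the name isSurrogateCodePoint is hypothetical. It assumes the usual Unicode definition of surrogates as the code points U+D800 through U+DFFF.

```haskell
import Data.Char (ord)

-- Hypothetical helper: True for code points in the surrogate range
-- U+D800..U+DFFF (high surrogates D800-DBFF, low surrogates DC00-DFFF).
isSurrogateCodePoint :: Char -> Bool
isSurrogateCodePoint c = n >= 0xD800 && n <= 0xDFFF
  where
    n = ord c

main :: IO ()
main = do
  print (isSurrogateCodePoint '\xD800') -- the character from the question
  print (isSurrogateCodePoint 'a')      -- an ordinary character
```

Any Char for which this check holds has no valid encoding in UTF-8, UTF-16 or UTF-32 on its own, which is why a text-mode write of it fails.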

I want to write some test files that need to contain this data. Right now I'm using Python to achieve this, but I would love to know if there is a way (a workaround in Haskell) to do it.


Solution

  • Sure, just write the bytes you want:

    import qualified Data.ByteString as BS
    -- Writes U+D800 as one big-endian UTF-16 code unit (bytes D8 00).
    main :: IO ()
    main = BS.writeFile "surrogate.txt" (BS.pack [0xd8, 0x00])
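Which bytes "you want" depends on the encoding the test file is supposed to be in. As a rough sketch, the helpers below compute the raw bytes for a lone surrogate in three layouts: UTF-16 big-endian, UTF-16 little-endian, and the three-byte UTF-8 bit pattern (which strict UTF-8 decoders reject for surrogates). The function names and output file names are hypothetical, and the helpers assume a code point no larger than U+FFFF.

```haskell
import qualified Data.ByteString as BS
import Data.Bits (shiftR, (.&.), (.|.))
import Data.Char (ord)
import Data.Word (Word8)

-- One UTF-16 code unit, high byte first. Assumes ord c <= 0xFFFF.
utf16BE :: Char -> [Word8]
utf16BE c = [fromIntegral (n `shiftR` 8), fromIntegral (n .&. 0xFF)]
  where
    n = ord c

-- The same code unit, low byte first.
utf16LE :: Char -> [Word8]
utf16LE = reverse . utf16BE

-- Three-byte UTF-8 bit pattern (1110xxxx 10xxxxxx 10xxxxxx) for code
-- points in 0x0800..0xFFFF. For a surrogate the result is deliberately
-- ill-formed UTF-8, which is the point of such a test file.
utf8ThreeByte :: Char -> [Word8]
utf8ThreeByte c =
  [ fromIntegral (0xE0 .|. (n `shiftR` 12))
  , fromIntegral (0x80 .|. ((n `shiftR` 6) .&. 0x3F))
  , fromIntegral (0x80 .|. (n .&. 0x3F))
  ]
  where
    n = ord c

main :: IO ()
main = do
  -- File names are placeholders; adjust to your test layout.
  BS.writeFile "surrogate-utf16be.txt" (BS.pack (utf16BE '\xD800'))
  BS.writeFile "surrogate-utf16le.txt" (BS.pack (utf16LE '\xD800'))
  BS.writeFile "surrogate-utf8.txt" (BS.pack (utf8ThreeByte '\xD800'))
```

Because Data.ByteString.writeFile writes bytes without going through a text encoder, none of these writes hits the "invalid character" check from the question.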