What is the best way to convert String to ByteString


What is the best way to convert a String to a ByteString in Haskell?

My gut reaction to the problem is

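import qualified Data.ByteString as B
import Data.Char (ord)

-- Converts each Char to one byte; code points above 255 wrap around modulo 256.
packStr :: String -> B.ByteString
packStr = B.pack . map (fromIntegral . ord)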

But this doesn't seem satisfactory.
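
To make the concern concrete, here is a minimal sketch comparing the byte-per-character approach with a UTF-8 encoding step. The names `packTruncating` and `packUtf8` are illustrative only and do not come from any library; the sketch assumes the `bytestring` and `text` packages are available.

```haskell
import qualified Data.ByteString    as B
import qualified Data.Text          as T
import qualified Data.Text.Encoding as TE
import Data.Char (ord)

-- Same approach as packStr: fromIntegral wraps each code point modulo 256,
-- so characters above U+00FF lose information.
packTruncating :: String -> B.ByteString
packTruncating = B.pack . map (fromIntegral . ord)

-- Hypothetical alternative: go through Text and encode as UTF-8.
packUtf8 :: String -> B.ByteString
packUtf8 = TE.encodeUtf8 . T.pack
```

In GHCi, `B.unpack (packTruncating "\955")` (the letter λ, U+03BB) gives `[187]`, while `B.unpack (packUtf8 "\955")` gives `[206,187]`. For pure ASCII input the two functions agree.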


Solution

  • Here is my cheat sheet for converting between Haskell String, Text, and ByteString (strict and lazy), assuming the desired encoding is UTF-8. The Data.Text.Encoding module also provides other encodings, such as UTF-16 and UTF-32.

    Please make sure to not write (using OverloadedStrings):

    lazyByteString :: BL.ByteString
    lazyByteString = "lazyByteString ä ß" -- BAD!
    

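    The IsString instance for ByteString truncates each character to its low 8 bits instead of encoding it as UTF-8, so ä and ß become single Latin-1 bytes and any character above U+00FF is silently corrupted. Try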

    lazyByteString = BLU.fromString "lazyByteString ä ß" -- good
    

    instead.

    String literals of type 'Text' work fine with regard to encoding.

    Cheat sheet:

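    -- Qualified imports avoid name clashes (pack, unpack, etc. exist in several modules).
    import qualified Data.ByteString.Lazy as BL
    import qualified Data.ByteString as BS
    import qualified Data.Text as TS
    import qualified Data.Text.Lazy as TL
    import qualified Data.ByteString.Lazy.UTF8 as BLU -- from utf8-string
    import qualified Data.ByteString.UTF8 as BSU      -- from utf8-string
    import qualified Data.Text.Encoding as TSE
    import qualified Data.Text.Lazy.Encoding as TLE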
    
    -- String <-> ByteString
    
    BLU.toString   :: BL.ByteString -> String
    BLU.fromString :: String -> BL.ByteString
    BSU.toString   :: BS.ByteString -> String
    BSU.fromString :: String -> BS.ByteString
    
    -- String <-> Text
    
    TL.unpack :: TL.Text -> String
    TL.pack   :: String -> TL.Text
    TS.unpack :: TS.Text -> String
    TS.pack   :: String -> TS.Text
    
    -- ByteString <-> Text
    
    TLE.encodeUtf8 :: TL.Text -> BL.ByteString
    TLE.decodeUtf8 :: BL.ByteString -> TL.Text
    TSE.encodeUtf8 :: TS.Text -> BS.ByteString
    TSE.decodeUtf8 :: BS.ByteString -> TS.Text
    
    -- Lazy <-> Strict
    
    BL.fromStrict :: BS.ByteString -> BL.ByteString
    BL.toStrict   :: BL.ByteString -> BS.ByteString
    TL.fromStrict :: TS.Text -> TL.Text
    TL.toStrict   :: TL.Text -> TS.Text
    

    Please +1 Peaker's answer, because he correctly deals with encoding.
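
As a rough illustration of how the cheat-sheet entries compose, the sketch below builds String to ByteString conversions from the Text functions alone, so it only assumes the `text` and `bytestring` packages and not `utf8-string`. The helper names `stringToStrictBS`, `stringToLazyBS`, and `strictBSToString` are hypothetical. Note that `TSE.decodeUtf8` throws an exception on invalid input, so the decoding helper uses `TSE.decodeUtf8'`, which returns an `Either` instead.

```haskell
import qualified Data.ByteString         as BS
import qualified Data.ByteString.Lazy    as BL
import qualified Data.Text               as TS
import qualified Data.Text.Encoding      as TSE
import qualified Data.Text.Lazy          as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- String -> strict ByteString (UTF-8), via strict Text.
stringToStrictBS :: String -> BS.ByteString
stringToStrictBS = TSE.encodeUtf8 . TS.pack

-- String -> lazy ByteString (UTF-8), via lazy Text.
stringToLazyBS :: String -> BL.ByteString
stringToLazyBS = TLE.encodeUtf8 . TL.pack

-- Strict ByteString -> String; yields Nothing on invalid UTF-8
-- rather than throwing, as TSE.decodeUtf8 would.
strictBSToString :: BS.ByteString -> Maybe String
strictBSToString = either (const Nothing) (Just . TS.unpack) . TSE.decodeUtf8'
```

For any ordinary string `s`, `strictBSToString (stringToStrictBS s)` should give back `Just s`, and `BL.toStrict (stringToLazyBS s)` should equal `stringToStrictBS s`.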