Search code examples
haskellbytestring

Efficient Conversion of Bytestring to [Word16]


I am attempting to do a plain conversion from a bytestring to a list of Word16s. The implementation below using Data.Binary.Get works, though it is a performance bottleneck in the code. This is understandable as IO is always going to be slow, but I was wondering if there isn't a more efficient way of doing this.

getImageData' = do
    e <- isEmpty
    if e then return []
      else do
      w <- getWord16be
      ws <- getImageData'
      return $ w : ws

Solution

  • I suspect the big problem you're encountering with Data.Binary.Get is that the decoders are inherently much too strict for your purpose. This also appears to be the case for serialise, and probably also the other serialization libraries. I think the fundamental trouble is that while you know that the operation will succeed as long as the ByteString has an even number of bytes, the library does not know that. So it has to read in the whole ByteString before it can conclude "Ah yes, all is well" and construct the list you've requested. Indeed, the way you're building the result, it's going to build a slew of closures (proportional in number to the length) before actually doing anything useful.

    How can you fix this? Just use the bytestring library directly. The easiest thing is to use unpack, but I think you'll probably get slightly better performance like this:

    {-# language BangPatterns #-}
    module Wordy where
    import qualified Data.ByteString as BS
    import Data.ByteString (ByteString)
    import Data.Word (Word16)
    import Data.Maybe (fromMaybe)
    import Data.Bits (unsafeShiftL)
    
    toDBs :: ByteString -> Maybe [Word16]
    toDBs bs0
      | odd (BS.length bs0) = Nothing
      | otherwise = Just (go bs0)
      where
        go bs = fromMaybe [] $ do
          (b1, bs') <- BS.uncons bs
          (b2, bs'') <- BS.uncons bs'
          let !res = (fromIntegral b1 `unsafeShiftL` 8) + fromIntegral b2
          Just (res : go bs'')