I am a Haskell beginner and still learning about monad transformers.
I am trying to use the streaming-bytestring library to read a binary file, process chunks of bytes, and print the result as each chunk is processed. I believe this is the popular streaming library that provides an alternative to lazy bytestrings. It appears the authors copy-pasted the lazy bytestring documentation and added some arbitrary examples.
The examples mention runResourceT without any discussion of what it is or how to use it. It appears that I should use runResourceT on any streaming-bytestring function that performs an action. That's fine, but what if I'm reading an infinite stream, processing chunks and printing them as I go? Should I call runResourceT every time I want to process a chunk?
My code is something like this:
import qualified Data.ByteString.Streaming as BSS
import System.TimeIt
main = timeIt $ processByteChunks $ BSS.drop 100 $ BSS.readFile "filename"
and I'm unsure how to organize processByteChunks as a recursive function that iterates through the binary file.
If I call runResourceT only once, it would read the infinite file BEFORE printing, right? That seems bad.
main = timeIt $ runResourceT $ processByteChunks $ BSS.drop 100 $ BSS.readFile "filename"
The ResourceT monad just cleans up resources in a timely fashion when you're finished with them. In this case, it will ensure the file handle opened by BSS.readFile is closed when the stream is consumed. (Unless the stream truly is infinite, in which case I guess it won't.)
In your application, you only want to call it once, since you don't want the file closed until you've read all the chunks. Don't worry -- it has nothing to do with the timing of output or anything like that.
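To make the ordering concrete, here is a minimal sketch that uses allocate from the resourcet package directly, with no file involved. The "acquire" and "release" strings are stand-ins I made up for opening and closing a file handle, and the numbered "chunk" lines stand in for your per-chunk output. It only illustrates the idea that work inside a single runResourceT block produces output as it goes, and that cleanup runs once when the block exits:

```haskell
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Resource (allocate, runResourceT)

main :: IO ()
main = runResourceT $ do
  -- Register an acquire action and its matching release action.
  -- These strings are placeholders for opening/closing a file handle.
  (_releaseKey, _) <- allocate (putStrLn "acquire") (\_ -> putStrLn "release")
  -- Output from inside the block appears immediately, one step at a time.
  liftIO $ mapM_ (\i -> putStrLn ("chunk " ++ show i)) [1 :: Int, 2, 3]
  -- The release action runs only when runResourceT exits.
```

Run as written, this prints "acquire", then the three "chunk" lines, then "release". The single runResourceT delimits the lifetime of the resource; it does not buffer or delay anything done inside it.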
Here's an example with a recursive processByteChunks that should work. It will read lazily and generate output as chunks are lazily read:
import Control.Monad.IO.Class
import Control.Monad.Trans.Resource
import qualified Data.ByteString.Streaming as BSS
import qualified Data.ByteString as BS
import System.TimeIt

main :: IO ()
main = timeIt $ runResourceT $
  processByteChunks $ BSS.drop 100 $ BSS.readFile "filename"

processByteChunks :: MonadIO m => BSS.ByteString m () -> m ()
processByteChunks = go 0 0
  where
    -- len and nulls are running totals carried through the recursion.
    go len nulls stream = do
      -- Pull the next strict chunk, if any, off the front of the stream.
      m <- BSS.unconsChunk stream
      case m of
        Just (bs, stream') -> do
          let len' = len + BS.length bs
              nulls' = nulls + BS.length (BS.filter (==0) bs)
          -- Report progress right away, before the next chunk is read.
          liftIO $ print $ "cumulative length=" ++ show len'
            ++ ", nulls=" ++ show nulls'
          go len' nulls' stream'
        -- End of stream: nothing left to process.
        Nothing -> return ()