I'd like to use Conduit in a setting where I read a binary file, check that it has the correct header, and then work on the remaining data in the file. When I try to write a conduit that checks the header and then streams the rest of the data on to the following conduits, I run into trouble. I have them live in an Either String monad for some exception handling. Here's a simplified version of the code (I'm aware there's a Conduit.Attoparsec module, but for now I'd like to write it myself):
import Conduit (ConduitM, mapC, mapM_C, takeWhileC, (.|))
import Data.ByteString (ByteString)
import Data.ByteString.Conversion (toByteString')

separator :: ByteString
separator = toByteString' '#'

check :: ByteString -> Either String ()
-- (definition omitted)

confirmHeader :: ConduitM ByteString ByteString (Either String) ()
confirmHeader = do
  takeWhileC (/= separator) .| mapM_C check
  mapC id
separator is a predefined ByteString that signals the end of the header. The line mapC id is supposed to pass on the rest of the stream if the header checks out. I left out the unimportant details of check.
The part checking the header works. The last line, however, apart from looking inelegant and non-idiomatic, doesn't work. Running something like
runConduit $ yield (toByteString' "header#rest") .| confirmHeader .| sinkList
gives Right [] rather than Right ["rest"], as I had hoped. Any ideas?
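Your takeWhileC (/= separator) is taking the whole ByteString: it compares each chunk of the stream to separator as a single value and never looks at the bytes inside the chunk. Since the chunk "header#rest" is not equal to "#", all of it is consumed as header. You can use Data.Conduit.Binary (from the conduit-extra package) to work on the individual bytes of the stream. The code below works "as expected", I believe.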
module Main (main) where
import Conduit
import Data.ByteString (ByteString)
import Data.ByteString.Conversion (toByteString')
import Data.Char (ord)
import qualified Data.Conduit.Binary as B
import GHC.Word (Word8)
separator :: Word8
separator = toEnum $ ord '#'
check :: ByteString -> Either String ()
check _ = Right ()
confirmHeader :: ConduitM ByteString ByteString (Either String) ()
confirmHeader = do
  B.takeWhile (/= separator) .| mapM_C check
  B.drop 1 -- drop separator which stayed in stream
  mapC id

main :: IO ()
main = print . runConduit $
  yield (toByteString' "header#rest") .| confirmHeader .| sinkList
And the output:
[nix-shell:/tmp]$ ghc C.hs -fforce-recomp -Wall -Werror -o Main && ./Main
[1 of 1] Compiling Main ( C.hs, C.o )
Linking Main ...
Right ["rest"]
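The same idea can probably be written with the chunk-aware combinators that the Conduit module itself exports (takeWhileCE and dropCE), which avoids the extra import. The sketch below is only an illustration under a few assumptions: conduit 1.3 or later (where the type is called ConduitT), a made-up check that accepts exactly "header", and a helper runExample that is not part of the original code. It also collects the header with foldC before checking it, so check sees the complete header even when the header is split across several chunks.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Conduit
import Data.ByteString (ByteString)
import Data.Word (Word8)

-- ASCII code of '#', the byte that ends the header.
separator :: Word8
separator = 35

-- Hypothetical check, for illustration only: accept exactly "header".
check :: ByteString -> Either String ()
check h
  | h == "header" = Right ()
  | otherwise     = Left ("unexpected header: " ++ show h)

confirmHeader :: ConduitT ByteString ByteString (Either String) ()
confirmHeader = do
  -- Collect every byte before the separator, even across chunk boundaries.
  header <- takeWhileCE (/= separator) .| foldC
  -- Abort the whole pipeline with Left if the header is wrong.
  lift (check header)
  -- Discard the separator byte, which is still in the stream.
  dropCE 1
  -- Pass the remaining chunks through unchanged.
  awaitForever yield

-- Hypothetical helper: feed a list of chunks through confirmHeader.
runExample :: [ByteString] -> Either String [ByteString]
runExample chunks = runConduit $ yieldMany chunks .| confirmHeader .| sinkList

main :: IO ()
main = do
  print (runExample ["header#rest"])
  print (runExample ["hea", "der#re", "st"])
  print (runExample ["bogus#rest"])
```

With this sketch the first call should give Right ["rest"], the second Right ["re","st"] (the header is reassembled from two chunks), and the third a Left carrying the message from check.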