parsing haskell readfile bytestring

How to parse a 7 GB file with Data.ByteString?


I have to parse a file, which means I have to read it first. Here is my program:

import qualified Data.ByteString.Char8 as B
import System.Environment (getArgs)

main :: IO ()
main = do
  args    <- getArgs
  let path = args !! 0
  content <- B.readFile path      -- reads the whole file into one strict ByteString
  let ls = B.lines content
  foobar ls

-- Placeholder for the real parser.
foobar :: [B.ByteString] -> IO ()
foobar _ = return ()

After compiling it with

> ghc --make -O2 tmp.hs 

running it on a 7 GB file fails with the following error:

> ./tmp big_big_file.dat
tmp: {handle: big_big_file.dat}: hGet: illegal ByteString size (-1501792951): illegal operation

Thanks for any reply!


Solution

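  • B.readFile loads the whole file into a single strict ByteString, and here that buffer cannot exceed 2 GiB. The negative size in the error message is consistent with a file of 7,088,141,641 bytes wrapping around a signed 32-bit integer. Use lazy ByteStrings (Data.ByteString.Lazy.Char8) instead: they read the file in chunks on demand, so the whole file never has to fit in one buffer.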
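
As a minimal sketch of the lazy approach, the program from the question could be adapted as follows. The `countLines` helper is hypothetical: it stands in for the real parser (`foobar` in the question) so that the example produces visible output. This assumes the processing consumes the lines once, front to back; holding on to `content` elsewhere would keep the whole file in memory.

```haskell
import qualified Data.ByteString.Lazy.Char8 as BL
import System.Environment (getArgs)

-- Hypothetical stand-in for the real parser: counts the lines of the input.
-- The list of lines is consumed incrementally, so chunks that have already
-- been processed can be garbage collected.
countLines :: BL.ByteString -> Int
countLines = length . BL.lines

main :: IO ()
main = do
  args <- getArgs
  case args of
    [path] -> do
      -- Lazy readFile: the file is read in chunks as they are demanded.
      content <- BL.readFile path
      print (countLines content)
    _ -> putStrLn "usage: tmp FILE"
```

It is compiled and run the same way as the original (`ghc --make -O2 tmp.hs`, then `./tmp big_big_file.dat`). Only the import and the qualifier change; `BL.lines` has the same shape as `B.lines`, except that it yields lazy ByteStrings.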