c++ zlib boost-iostreams

Run-time error reading a .gz file using boost::iostreams and zlib


I am trying to read a .gz file and print the text content on screen by using boost::iostreams. This is just a simple experiment to learn about this library, and I am using the "directors.list.gz" file from IMDb (ftp://ftp.fu-berlin.de/pub/misc/movies/database/) as my input file.

My code compiles with MSVC-10, but the process aborts when executed. The error message gives little information beyond the error code R6010.

Can someone please point me in the right direction as to what may have caused this, and how I can make it work?

This library looks pretty neat and I do hope to use it correctly. Thanks a lot for helping.

#include <fstream>                 // std::ifstream
#include <iostream>                // std::cout
#include <string>                  // std::string
#include <boost/iostreams/categories.hpp> // input_filter_tag
#include <boost/iostreams/operations.hpp> // get
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/copy.hpp>
#include <boost/iostreams/device/file_descriptor.hpp>
#include <boost/iostreams/device/file.hpp>
#include <boost/iostreams/device/array.hpp>
#include <boost/iostreams/filter/zlib.hpp>


using namespace std;
namespace io = boost::iostreams;


int main() 
{

    if(true)
    {

        string infile_path = "c:\\Temp\\directors.list.gz";
        ifstream infile(infile_path, ios_base::in | ios_base::binary);
        io::filtering_streambuf<io::input> in; //filter
        in.push(io::zlib_decompressor());       
        in.push(infile); 

        //output to cout
        io::copy(in, cout);
    }

    return 0;
}

Solution

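  • A .gz file is in the gzip format, which wraps the compressed (deflate) data in its own header and trailer. zlib_decompressor expects the zlib format, whose header is different, so it cannot read a gzip file.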

    So you want to use boost's gzip_decompressor instead of zlib_decompressor.

    in.push(io::gzip_decompressor());
    

    Note you'll need to include boost/iostreams/filter/gzip.hpp instead of boost/iostreams/filter/zlib.hpp.

    Here's a working example of streaming a GZIP file:

    #include <fstream>
    #include <iostream>
    #include <boost/iostreams/filtering_streambuf.hpp>
    #include <boost/iostreams/filter/gzip.hpp>
    #include <boost/iostreams/copy.hpp>
    
    using namespace boost::iostreams;
    
    int main()
    {
        std::ifstream file("hello.gz", std::ios_base::in | std::ios_base::binary);
        filtering_streambuf < input > in;
        in.push(gzip_decompressor());
        in.push(file);
        boost::iostreams::copy(in, std::cout);
    }
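
To see why the two decompressors are not interchangeable, it can help to look at the first two bytes of the input. The following is only a rough sketch and is not part of Boost: the names `Container` and `sniff_container` are made up for illustration. It assumes the gzip magic bytes 0x1f 0x8b (RFC 1952) and the zlib header check from RFC 1950 (compression method 8, first two bytes a multiple of 31).

```cpp
#include <cstddef>

// Hypothetical helper type: which wrapper surrounds the deflate data.
enum Container { kGzip, kZlib, kUnknown };

// Guess the container format from the first two bytes of the data.
// This only inspects the header; it does not validate the whole stream.
Container sniff_container(const unsigned char* data, std::size_t size)
{
    if (size < 2)
        return kUnknown;

    // gzip (RFC 1952): every member starts with the magic bytes 0x1f 0x8b.
    if (data[0] == 0x1f && data[1] == 0x8b)
        return kGzip;

    // zlib (RFC 1950): the low nibble of the first byte is 8 (deflate),
    // and the first two bytes, read big-endian, are a multiple of 31.
    const unsigned header = data[0] * 256u + data[1];
    if ((data[0] & 0x0f) == 8 && header % 31 == 0)
        return kZlib;

    return kUnknown;
}
```

One possible use is to read the first two bytes of the file, call `sniff_container`, and push `gzip_decompressor` or `zlib_decompressor` accordingly. For a file named `.gz`, the result would normally be `kGzip`.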
    

    You'll find more information on specific boost::iostreams filters lurking here in boost's documentation: http://www.boost.org/doc/libs/1_46_1/libs/iostreams/doc/quick_reference.html#filters

    I should also point out that your code didn't compile for me with gcc: before C++11, the std::ifstream constructor takes a const char *, not a std::string, so you would need to pass infile_path.c_str(). (C++11 added a std::string overload, which is presumably why MSVC 10 accepts your code.)
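
The portability point about the file name can be sketched as a small helper. This is only an illustration: `open_binary` is a hypothetical name, not something from Boost or the standard library. It passes `path.c_str()` so the call should compile both with pre-C++11 compilers and with newer ones.

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: open `path` for binary reading.
// Using c_str() avoids relying on the std::string overload of open(),
// which only exists since C++11.
bool open_binary(std::ifstream& in, const std::string& path)
{
    in.open(path.c_str(), std::ios_base::in | std::ios_base::binary);
    return in.is_open();
}
```

The opened stream can then be pushed onto the filtering_streambuf after the decompressor, as in the example above. Checking the return value first makes a missing file easier to tell apart from a decompression failure.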