Tags: c++, boost, gzip, boost-iostreams

Decompress multiple files into one single file using Boost


I have a set of compressed files. I need to decompress all of them and concatenate the results into one big file. The code below works, but I don't want to use std::stringstream: the files are big, and I don't want to create intermediate copies of their content.

If I use boost::iostreams::copy(inbuf, tempfile); directly, it closes the output file (tempfile) automatically. Is there a better way to copy the content? Or, at least, can I prevent the file from being closed automatically?

std::ofstream tempfile("/tmp/extmpfile", std::ios::binary);
for (std::set<std::string>::iterator it = files.begin(); it != files.end(); ++it)
{
    std::string filename(*it);
    std::ifstream gzfile(filename.c_str(), std::ios::binary);

    // Decompress gzfile on the fly as it is read.
    boost::iostreams::filtering_streambuf<boost::iostreams::input> inbuf;
    inbuf.push(boost::iostreams::gzip_decompressor());
    inbuf.push(gzfile);

    // Closes tempfile automatically!
    //boost::iostreams::copy(inbuf, tempfile);

    // Current workaround: buffers the whole decompressed file in memory.
    std::stringstream out;
    boost::iostreams::copy(inbuf, out);
    tempfile << out.str();
}
tempfile.close();

Solution

  • I know there are ways to tell Boost IOStreams not to close a stream. I suspect they require using boost::iostreams::stream<> instead of std::ostream, though.

    My simple workaround, which appears to work, is to create a temporary std::ostream for each input file, each attached to a single std::filebuf object that stays open for the whole loop:

    #include <boost/iostreams/stream.hpp>
    #include <boost/iostreams/copy.hpp>
    #include <boost/iostreams/filtering_streambuf.hpp>
    #include <boost/iostreams/filter/gzip.hpp>
    #include <set>
    #include <string>
    #include <iostream>
    #include <fstream>
    
    int main() {
        std::filebuf tempfilebuf;
        tempfilebuf.open("/tmp/extmpfile", std::ios::binary|std::ios::out);
    
        std::set<std::string> files { "a.gz", "b.gz" };
        for (std::set<std::string>::iterator it = files.begin(); it != files.end(); ++it)
        {
            std::string filename(*it);
            std::ifstream gzfile(filename.c_str(), std::ios::binary);
    
            boost::iostreams::filtering_streambuf<boost::iostreams::input> inbuf;
            inbuf.push(boost::iostreams::gzip_decompressor());
            inbuf.push(gzfile);
    
            std::ostream tempfile(&tempfilebuf);
            boost::iostreams::copy(inbuf, tempfile); 
        }
        tempfilebuf.close();
    }
    

    With sample data created like this:

    echo a > a
    echo b > b
    gzip a b
    

    the program generates /tmp/extmpfile containing:

    a
    b