I want to write an array to a file, compressing it as I go.
Later, I want to read the array from that file, decompressing it as I go.
Boost's Iostreams seems like a good way to go, so I built the following code. Unfortunately, the output and input data do not compare equal at the end. But they very nearly do:
Output Input
0.8401877284 0.8401880264
0.3943829238 0.3943830132
0.7830992341 0.7830989957
0.7984400392 0.7984399796
0.9116473794 0.9116470218
0.1975513697 0.1975509971
0.3352227509 0.3352229893
This suggests that the least significant byte of each float is getting changed, or something. The compression should be lossless, though, so this is not expected or desired. What gives?
//Compile with: g++ test.cpp --std=c++11 -lz -lboost_iostreams
#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/zlib.hpp>
#include <cstdlib>
#include <vector>
#include <iomanip>

int main()
{
  using namespace std;
  using namespace boost::iostreams;

  const int NUM = 10000;
  std::vector<float> data_out;
  std::vector<float> data_in(NUM);

  // Fill the output array with random values in [0,1]
  for(int i=0;i<NUM;i++)
    data_out.push_back(rand()/(float)RAND_MAX);

  { // Compress while writing; leaving this scope flushes and closes the streams
    ofstream file("hello.z", ios_base::out | ios_base::binary);
    filtering_ostream out;
    out.push(zlib_compressor());
    out.push(file);
    for(const auto d: data_out)
      out<<d;
  }

  { // Decompress while reading
    ifstream file_in("hello.z", ios_base::in | ios_base::binary);
    filtering_istream in;
    in.push(zlib_decompressor());
    in.push(file_in);
    for(int i=0;i<NUM;i++)
      in>>data_in[i];
  }

  // Compare the two arrays element by element
  bool all_good=true;
  for(int i=0;i<NUM;i++){
    cout<<std::setprecision(10)<<data_out[i]<<" "<<data_in[i]<<endl;
    all_good &= (data_out[i]==data_in[i]);
  }
  cout<<"Good? "<<(int)all_good<<endl;
}
And, yes, I very much prefer to use the stream operators in the way I do, rather than pushing or pulling an entire vector block at once.
As Dan Mašek pointed out in their answer, the << stream operator I was using was converting my floating-point data into a textual representation prior to compression. For some reason, I hadn't expected this.
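To see the effect in isolation, here is a minimal sketch that needs only the standard library. The helper text_round_trip and the constant kFloatDigits are hypothetical names, not part of Boost or of the code in the question. The helper formats one float with a given precision and parses it back. With the default precision of 6 significant digits the value changes, while std::numeric_limits<float>::max_digits10 digits (9 for an IEEE-754 float) should be enough to recover it, assuming the standard library converts with correct rounding in both directions.

```cpp
#include <iomanip>
#include <limits>
#include <sstream>

// Hypothetical helper: format `value` as text with `precision` significant
// digits (as operator<< does), then parse the text back with operator>>.
float text_round_trip(float value, int precision)
{
  std::stringstream ss;
  ss << std::setprecision(precision) << value;
  float parsed = 0.0f;
  ss >> parsed;
  return parsed;
}

// Number of significant decimal digits sufficient to distinguish any two
// different float values.
const int kFloatDigits = std::numeric_limits<float>::max_digits10;
```

Calling text_round_trip(x, 6) generally returns a value different from x, whereas text_round_trip(x, kFloatDigits) returns x. To keep the stream operators on the compressed stream, the values would also need a separator, for example out << std::setprecision(kFloatDigits) << d << ' '; the text is larger than the raw bytes before compression.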
Using the serialization library is one way to avoid this conversion, but it would introduce an additional dependency and possibly some overhead.
Therefore, I have used a reinterpret_cast on the floating-point data and the ostream::write() method to write the data without conversion, one character at a time. Reading uses a similar method. Efficiency could be improved by writing more characters per call.
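#include <fstream>
#include <iostream>
#include <boost/iostreams/filtering_stream.hpp>
#include <boost/iostreams/filter/zlib.hpp>
#include <cstdlib>
#include <vector>
#include <iomanip>

int main()
{
  using namespace std;
  using namespace boost::iostreams;

  const int NUM = 10000;
  std::vector<float> data_out;
  std::vector<float> data_in(NUM);

  // Fill the output array with random values in [0,1]
  for(int i=0;i<NUM;i++)
    data_out.push_back(rand()/(float)RAND_MAX);

  { // Compress while writing; leaving this scope flushes and closes the streams
    ofstream file("hello.z", ios_base::out | ios_base::binary);
    filtering_ostream out;
    out.push(zlib_compressor());
    out.push(file);
    // View the floats as raw bytes and write them unformatted, one at a time
    char *dptr = reinterpret_cast<char*>(data_out.data());
    for(size_t i=0;i<sizeof(float)*NUM;i++)
      out.write(&dptr[i],1);
  }

  { // Decompress while reading
    ifstream file_in("hello.z", ios_base::in | ios_base::binary);
    filtering_istream in;
    in.push(zlib_decompressor());
    in.push(file_in);
    // Read the raw bytes straight back into the float array
    char *dptr = reinterpret_cast<char*>(data_in.data());
    for(size_t i=0;i<sizeof(float)*NUM;i++)
      in.read(&dptr[i],1);
  }

  // Compare the two arrays element by element
  bool all_good=true;
  for(int i=0;i<NUM;i++){
    cout<<std::setprecision(10)<<data_out[i]<<" "<<data_in[i]<<endl;
    all_good &= (data_out[i]==data_in[i]);
  }
  cout<<"Good? "<<(int)all_good<<endl;
}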