Tags: c++, file, compression, fstream, lz4

C++ compressing using lz4, compressed information not as expected


I'm using lz4 on a Mac and experimenting with compressing a string (named str) in my program.

#include <fstream>
#include <iostream>
#include "lz4.h"
using namespace std;
int main(){
    char str[] = "10100100010000100000100000010000000100000000100000000010000000000";
    size_t len = sizeof(str);
    char* target = new char[len];
    int nCompressedSize = LZ4_compress_default((const char *)(&str), target, len, len);

    ofstream os("lz4.dat",ofstream::binary);
    os.write(target, nCompressedSize);
    os.close();
    delete[] target;
    target = 0;

    ifstream is( "lz4.dat", ifstream::binary );
    is.seekg (0,is.end);
    size_t nCompressedInputSize = is.tellg();
    is.clear();
    is.seekg(0,ios::beg);

    //Read file into buffer
    char* in = new char[nCompressedInputSize];
    int32_t n=is.read(in,nCompressedSize);
    cout<<"Byte number:"<<nCompressedSize<<",file size:"<<n<<",bytes read:"<<in<<endl;
    is.close();
    return 0;
}

After running this program, I checked the "lz4.dat" file:

$ls -lrt lz4.dat
-rw-r--r--  1 x  staff  34  7 15 14:50 lz4.dat

It's 34 bytes, OK, but the program output is:

Byte number:34,file size:1,bytes read:@1010

Very strange: it seems the file size received is 1 byte, and the output is some random-looking @1010. Why didn't my "is.tellg()" get the correct file length?

Thanks.


Solution

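  • ifstream::read() doesn't return the number of bytes read. It returns a reference to *this, which has an operator bool(), and I think that conversion is what is used here. So n only tells you whether the operation succeeded, which is why it prints 1. (Since C++11 that conversion is explicit, so the assignment to int32_t should not compile there.) Note that the "file size" field in your output prints n, not nCompressedInputSize, so is.tellg() is not the problem. To get the number of bytes actually read, call is.gcount() after the read.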

    The output itself looks fine: it is the beginning of the compressed data. I think only a few bytes are printed because operator<< treats in as a C string and stops at the first zero byte in the data. It resembles your input because lz4 copies literals into the stream verbatim (lz4 has no entropy coding stage).
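
To illustrate both points, here is a minimal sketch that does not depend on lz4. The helper names read_bytes and to_hex are made up for this example and are not part of any library. read_bytes sizes its result from gcount() instead of the return value of read(), and to_hex shows one way to inspect binary data without stopping at the first zero byte.

```cpp
#include <cstddef>
#include <fstream>
#include <ios>
#include <string>
#include <vector>

// Hypothetical helper: read a whole binary file into a vector.
// Returns an empty vector if the file cannot be opened or is empty.
std::vector<char> read_bytes(const std::string& path) {
    std::ifstream is(path, std::ifstream::binary);
    if (!is) return {};

    // Find the file size by seeking to the end, then rewind.
    is.seekg(0, std::ios::end);
    const std::streamoff size = is.tellg();
    if (size <= 0) return {};
    is.seekg(0, std::ios::beg);

    std::vector<char> buffer(static_cast<std::size_t>(size));
    is.read(buffer.data(), static_cast<std::streamsize>(size));

    // read() returns the stream itself; gcount() reports the bytes actually read.
    buffer.resize(static_cast<std::size_t>(is.gcount()));
    return buffer;
}

// Hypothetical helper: format every byte as two hex digits, so embedded
// zero bytes do not cut the output short the way operator<< on a char* does.
std::string to_hex(const std::vector<char>& bytes) {
    static const char digits[] = "0123456789abcdef";
    std::string out;
    for (char c : bytes) {
        const unsigned char u = static_cast<unsigned char>(c);
        out += digits[u >> 4];
        out += digits[u & 0x0f];
    }
    return out;
}
```

In the program from the question, this would mean printing read_bytes("lz4.dat").size() as the byte count and to_hex(...) of the result instead of streaming the raw char pointer.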