
zlib performance difference between Mac OS X system version and locally re-installed


I noticed a significant performance difference between the zlib library shipped with the system and the one I rebuilt from source, although both report version 1.2.11. I am running macOS 10.13.6.

Here is my benchmark code:

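#include <stdio.h>
#include <stdlib.h>

#ifdef LOCAL_ZLIB
#include "./zlib-1.2.11/zlib.h"
#else
#include <zlib.h>
#endif

int main(int argc, char *argv[])
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s file.gz\n", argv[0]);
        return 1;
    }

    printf("zlib version  %s\n", zlibVersion());

    gzFile testFile = gzopen(argv[1], "r");
    if (testFile == NULL) {
        fprintf(stderr, "cannot open %s\n", argv[1]);
        return 1;
    }

    /* 1 MB output buffer passed to gzread on every call */
    int buffsize = 1024 * 1024;
    char *buffer = (char *) calloc(buffsize, sizeof(char));
    if (buffer == NULL) {
        gzclose(testFile);
        return 1;
    }

    /* Decompress the whole file and discard the output */
    while (gzread(testFile, buffer, buffsize) > 0)
    {
        ;
    }

    free(buffer);
    gzclose(testFile);
    return 0;
}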

It just decompresses the file into the buffer using gzread and discards the output.

Here is my test on a 300 MB gzipped file:

Results using system version

wget ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR374/006/SRR3744956/SRR3744956_1.fastq.gz

gcc bench_zlib.c -O3 -o bench_zlib -lz
time ./bench_zlib SRR3744956_1.fastq.gz

Which gives:

zlib version  1.2.11

real    0m3.711s
user    0m3.599s
sys 0m0.105s

Results using local install of zlib

zlib recompiled from source, same version, linked statically:

wget https://www.zlib.net/zlib-1.2.11.tar.gz
tar -xzvf zlib-1.2.11.tar.gz 
cd zlib-1.2.11
./configure
make
cd ..
gcc bench_zlib.c  ./zlib-1.2.11/libz.a -O3 -o bench_zlib -DLOCAL_ZLIB
time ./bench_zlib SRR3744956_1.fastq.gz

Which gives:

zlib version  1.2.11

real    0m5.236s
user    0m5.113s
sys 0m0.112s

The version I recompiled locally from source is about 40% slower (5.24 s vs. 3.71 s). Is there an explanation?

Things I have already checked:

  • I rebuilt zlib both as a static and as a dynamic library. The result is the same: always about 40% slower than the system-provided version.
  • I checked the macOS sources for zlib at https://opensource.apple.com/source/zlib/zlib-70/ . They appear to be the same as the zlib provided on the zlib website, with no special re-optimization of the code (although the published sources only go up to macOS 10.13.3).

Is it possible that the system version is compiled with special options that make it faster? (40% seems like a lot, though, and my zlib build already uses -O3.)


Solution

  • As pointed out by Mark Adler in his comment, the code used in the macOS library must be different. The confusion comes from the fact that Apple did not change the library version string.

    I guess they use something similar to this version, https://github.com/jtkukunas/zlib (1.2.11.1-motley), where the CRC computation is vectorized. Profiling showed that the crc function is 9x faster in the Apple zlib than in vanilla zlib 1.2.11. This performance is similar to that of zlib "1.2.11.1-motley".

    On a 4 GB gzipped file, I measured the following decompression times:

    apple zlib 1.2.11  (dynamic zlib included in Mac OS 10.13.6) :   47.9 s
    vanilla zlib 1.2.11 (from zlib.net)                          :   70.8 s
    zlib 1.2.11.1-motley (from github.com/jtkukunas/zlib)        :   48.4 s
    

    Moreover, when calling gzbuffer(testFile, 1 << 20); which increases the internal zlib buffer to 1 MB, the Apple zlib becomes somewhat faster than zlib 1.2.11.1-motley:

    apple zlib 1.2.11    :   43.9 s
    vanilla zlib 1.2.11  :   67.1 s
    zlib 1.2.11.1-motley :   48.3 s
    

    So I guess that, on top of the vectorized CRC, the Apple build also includes some other optimizations.
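
To give a sense of the per-byte work behind the CRC-32 that gzip streams carry, here is a minimal, unoptimized sketch in C. It is not the code of any of the three libraries compared above: vanilla zlib uses lookup tables, and the jtkukunas fork is described above as vectorizing the CRC. The function name crc32_bitwise is hypothetical and only serves as an illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of zlib: bit-at-a-time CRC-32 with the
   reflected polynomial 0xEDB88320 (the CRC used by the gzip format).
   Pass 0 as the initial crc; pass the previous return value to continue
   over a further chunk of the same stream. */
uint32_t crc32_bitwise(uint32_t crc, const unsigned char *buf, size_t len)
{
    crc = ~crc;                         /* pre-conditioning */
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, zero otherwise */
            uint32_t mask = -(crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;                        /* post-conditioning */
}
```

Calling crc32_bitwise(0, (const unsigned char *)"123456789", 9) returns 0xCBF43926, the standard check value for this CRC. Every decompressed byte goes through a checksum of this kind, which is why a faster CRC routine can show up so clearly in gzread timings.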