How to make a significant compression of a Float32 raster file with gdal


The title is pretty vague, but this is what I mean. I have a 110 GB raster file, a combination of 20 other files, which forms a .tif of the whole of Europe. I want to use its binary data to make a 3D terrain, such as in this blog, but the problem is its size. I have tried a few techniques with gdal, such as:

gdal_translate \
  -co COMPRESS=JPEG \
  -co TILED=YES \
  input.tif output.tif

but apparently JPEG compression does not work with Float32 files. I would prefer to keep the Float32 format, including its negative values, but a 16-bit integer is also fine (which rules out all the Byte compressions). The best compression for Float32 files I have found so far uses the PREDICTOR=3 creation option, but it is far from enough.

Is it even possible to achieve the kind of compression I am describing with gdal? What about other techniques?


Solution

  • As you've discovered, JPEG compression doesn't work for floats (in fact, it's intended for 8-bit integer imagery only, since it lossily removes what is visually less perceptible).

    PREDICTOR=3 with COMPRESS=ZSTD and a high ZSTD_LEVEL is probably the best you can do if the result must be truly lossless.

    However, if you can tolerate a certain specific imprecision, and your GDAL version is at least 2.4, you can use COMPRESS=LERC or LERC_ZSTD with MAX_Z_ERROR set to the maximum imprecision (error through compression) that you can tolerate.

    The implementation of LERC is slightly buggy in GDAL and can choke on NaNs, and maybe on NODATA values or infs. https://github.com/OSGeo/gdal/issues/3055 claims this was fixed in Oct 2020, but it still affected me today. As a workaround, you could convert to Int16 by scaling up by a fixed multiplier (the integer that will represent 1.0), chosen so that ±1 in the integer raster is within your target precision, and then use COMPRESS=ZSTD and PREDICTOR=2. Or even use LERC again on the integer raster, with an integer maximum Z error, if you wish.

    Example I just did:

    gdal_calc -A "D2014 NDVI.tif" --outfile=out.tif --type=Int16 --calc="numpy.round(1000.0*A)"
    gdal_translate -co "COMPRESS=LERC_ZSTD" -co "ZSTD_LEVEL=15" -co "MAX_Z_ERROR=5" -co "PREDICTOR=2" -co "TILED=YES" out.tif "D2014 NDVI lerc.tif"
    

    You could doubtless combine these into one line. For me this reduced a 600 MB Float32 TIF to a 120 MB Int16 one, with integer-scaled NDVI values from -1000 to 1000 and a max error of 5 (i.e. 0.005 before scaling). This is in line with what I've achieved by applying LERC_ZSTD to Float32 where it didn't choke.
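
The arithmetic behind the Int16 scaling workaround can be checked without GDAL. Below is a minimal pure-Python sketch, assuming the multiplier of 1000 from the NDVI example; the helper names `to_int16` and `from_int16` and the sample values are made up for illustration and are not part of GDAL. It suggests that rounding to integers adds up to half an integer step on top of whatever integer MAX_Z_ERROR is allowed.

```python
# Illustrative sketch only: mimics the scale-and-round step in plain Python.
# SCALE, the helper names and the sample values are assumptions, not GDAL API.

INT16_MIN, INT16_MAX = -32768, 32767
SCALE = 1000  # one integer step corresponds to 0.001 in the original units


def to_int16(value, scale=SCALE):
    """Scale a float, round to the nearest integer, and reject Int16 overflow."""
    scaled = round(value * scale)
    if not INT16_MIN <= scaled <= INT16_MAX:
        raise OverflowError(f"{value} * {scale} does not fit in Int16")
    return scaled


def from_int16(stored, scale=SCALE):
    """Recover an approximate float from the stored integer."""
    return stored / scale


samples = [-1.0, -0.12345, 0.0, 0.33333, 0.98765, 1.0]
stored = [to_int16(v) for v in samples]
restored = [from_int16(s) for s in stored]

# Rounding alone contributes at most half an integer step.
rounding_error = max(abs(a - b) for a, b in zip(samples, restored))

# A lossy codec with an integer error bound (such as MAX_Z_ERROR=5) can add
# up to that many steps, so the worst case in original units is:
max_z_error = 5
worst_case = (0.5 + max_z_error) / SCALE

print("stored integers:", stored)
print("largest rounding error:", rounding_error)
print("worst-case total error:", worst_case)
```

Note that with a multiplier of 1000 only values within roughly ±32.7 fit in Int16. For an elevation raster in metres the multiplier would have to be much smaller: in this sketch `to_int16(4808.0, scale=5)` fits, while the default scale raises `OverflowError`.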