Tags: python, image, matrix, lossless-compression

Compressing multi-spectral image


I need to compress multispectral images losslessly. The classical compression algorithms (Zip, Rar, Gzip, Deflate, etc.) don't give very good results, with compression ratios of approximately 1.2. Does anyone know of an algorithm designed for this type of image that achieves a compression ratio of at least 1.5?

Ideally I would like to use the algorithm from Python; the matrix containing the spectral values is stored in a NumPy file.

I have tried many algorithms, but none gave the desired result, and unfortunately I cannot implement one myself. The matrix consists of 16-bit unsigned integers.

Here is a small part of the array, as an example of the type of data it contains.

The data in .npy and .txt format

Min value: 0, max value: 65535

So far, the most efficient algorithm I have found is JPEG 2000.

import numpy as np
from imagecodecs import jpeg2k_encode, jpeg2k_decode

array = np.load("data.npy")

# Encode each channel separately. level=0 is meant to request lossless
# encoding; the array_equal check below verifies the round trip.
compressed_channels = [jpeg2k_encode(channel, level=0) for channel in array]

# Save the byte strings as an object array. A plain list would be converted
# to a fixed-width bytes array, which pads entries and drops trailing NULs.
np.save("data_compressed.npy", np.array(compressed_channels, dtype=object))

compressed_channels = np.load("data_compressed.npy", allow_pickle=True)

# Decode every channel and restack into the original (C, Y, X) layout.
decoded = np.array([jpeg2k_decode(c) for c in compressed_channels])

compressed_size = sum(len(c) for c in compressed_channels)
compression_ratio = array.nbytes / compressed_size

print(np.array_equal(decoded, array))

print("Original array size (bytes):", array.nbytes)
print("Compressed array size (bytes):", compressed_size)
print("Compression ratio:", compression_ratio)

np.save("data_decompressed.npy", decoded)

Thanks in advance


Solution

  • The only compression method I could find that yields a compression ratio greater than 1.5 for the provided 16-bit, 31-channel image is JPEG XL, and only if the channel axis is treated as a spatial dimension, i.e. the (C, Y, X) array is stored as Y frames, each an X-by-C image:

    import os
    import numpy
    import imagecodecs
    data = numpy.load('data.npy')
    data = numpy.moveaxis(data, 0, -1)
    imagecodecs.imwrite('data.jxl', data)
    numpy.testing.assert_array_equal(data, imagecodecs.imread('data.jxl'))
    print(data.nbytes / os.path.getsize('data.jxl'))
    
    1.536
    

    JPEG XL is a relatively new format, and few programs and libraries can read it yet. For comparison, the runners-up, JPEG 2000 and (surprisingly) LZMA, reach around 1.3, and APNG 1.22.
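
For readers who cannot use JPEG XL, the LZMA baseline mentioned above needs only the standard library. The following is a minimal sketch, not the exact procedure behind the 1.3 figure: the small synthetic `cube` stands in for `data.npy`, and the helper name `lzma_ratio` and the `preset=9` setting are assumptions made for illustration. It compresses the raw buffer in both the (C, Y, X) and the (Y, X, C) layout and restores the original axis order after decoding.

```python
import lzma
import numpy as np

# Hypothetical stand-in for data.npy: a smooth synthetic (C, Y, X) uint16 cube.
c, y, x = np.meshgrid(np.arange(8), np.arange(16), np.arange(16), indexing="ij")
cube = (1000 + 40 * c + 3 * y + 2 * x).astype(np.uint16)


def lzma_ratio(arr):
    """Return (compression ratio, compressed bytes) for LZMA on the raw buffer."""
    raw = np.ascontiguousarray(arr).tobytes()
    packed = lzma.compress(raw, preset=9)
    return len(raw) / len(packed), packed


# Same data, two memory layouts: channels first and channels last.
ratio_cyx, packed_cyx = lzma_ratio(cube)
ratio_yxc, packed_yxc = lzma_ratio(np.moveaxis(cube, 0, -1))

# Decode the channels-last stream and move the channel axis back to the front.
decoded = np.frombuffer(lzma.decompress(packed_yxc), dtype=np.uint16)
restored = np.moveaxis(decoded.reshape(16, 16, 8), -1, 0)

print("lossless:", np.array_equal(restored, cube))
print("ratio (C, Y, X):", round(ratio_cyx, 2))
print("ratio (Y, X, C):", round(ratio_yxc, 2))
```

The ratios printed for this synthetic cube say nothing about the real image; to compare layouts on the actual data, replace `cube` with `numpy.load('data.npy')` and adjust the reshape to the real dimensions.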