I have a program that processes huge RGB images in the range of 30000x30000 px.
To load the image I use Pillow, which works well. I then process it with NumPy and need to save it losslessly as TIFF.
However, whether I use Pillow or OpenCV, saving takes very long compared to the runtime of everything else. I think this is because of the image compression: without compression, saving is fast, but the files are larger than 2 GB. I also found the module tifffile, but it takes just as long as OpenCV, unless I missed a parameter.
Is there a module that can compress faster? The ones I tried only use one CPU core.
It also seems to be faster on an Intel machine (i7-9700K, 16 GB RAM) than on my PC (AMD Ryzen 5600X, 32 GB RAM).
Here is the code I used to test:
from PIL import Image
import cv2
import tifffile
import numpy as np
import time
# Random 30000x30000 RGB test image (about 2.7 GB of uint8 data)
arr = np.random.default_rng().integers(0, 255, size=(30000,30000,3), endpoint=True, dtype=np.uint8)
# Pillow with Adobe Deflate compression
st = time.time()
Image.fromarray(arr).save("test_pil.tiff", compression="tiff_adobe_deflate")
print(f"Pil took {time.time()-st} s")
# OpenCV; 32946 is libtiff's COMPRESSION_DEFLATE
st = time.time()
cv2.imwrite("test_cv2.tiff", arr, params=(cv2.IMWRITE_TIFF_COMPRESSION, 32946))
print(f"Opencv took {time.time()-st} s")
# tifffile with zlib level 5, predictor and 64x64 tiles
st = time.time()
tifffile.imwrite("test_tifff.tiff", arr, compression="zlib", compressionargs={'level':5}, predictor=True, tile=(64,64))
print(f"Tifffile took {time.time()-st} s")
I know these also use different compression algorithms, but I haven't found matching parameters. This feature is generally very poorly documented.
Results (Intel):
Pil took 32.01173210144043 s
Opencv took 60.46461296081543 s
Tifffile took 59.410102128982544 s
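To illustrate the single-core point: because tiles are compressed independently, the Deflate step can in principle be spread over several cores. The following is only a minimal sketch of that idea using the standard-library zlib module and a thread pool (zlib releases the GIL while compressing). It is not a TIFF writer, and the helper names split_into_tiles, compress_tiles and decompress_tiles are made up for this example.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def split_into_tiles(arr, tile):
    """Yield (row, col, block) for tile-sized blocks covering an H x W x C array."""
    height, width = arr.shape[:2]
    for r in range(0, height, tile):
        for c in range(0, width, tile):
            # Edge tiles may be smaller than tile x tile.
            yield r, c, np.ascontiguousarray(arr[r:r + tile, c:c + tile])


def compress_tiles(arr, tile=256, level=5, workers=4):
    """Deflate-compress every tile independently on a thread pool."""
    def compress(item):
        r, c, block = item
        return r, c, block.shape, zlib.compress(block.tobytes(), level)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(compress, split_into_tiles(arr, tile)))


def decompress_tiles(compressed, shape, dtype=np.uint8):
    """Rebuild the full array from the compressed tiles (round-trip check)."""
    out = np.empty(shape, dtype=dtype)
    for r, c, block_shape, data in compressed:
        block = np.frombuffer(zlib.decompress(data), dtype=dtype)
        out[r:r + block_shape[0], c:c + block_shape[1]] = block.reshape(block_shape)
    return out


# Small stand-in for the 30000x30000 image so the sketch runs quickly.
rng = np.random.default_rng(0)
small = rng.integers(0, 255, size=(1000, 1200, 3), endpoint=True, dtype=np.uint8)
packed = compress_tiles(small, tile=256, level=5, workers=4)
restored = decompress_tiles(packed, small.shape)
print(len(packed), "tiles, lossless:", np.array_equal(small, restored))
```

Whether a given TIFF library actually compresses tiles in parallel depends on that library and its options, so this only shows why tiling makes it possible.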
Following @Mark Setchell's and @cgohlke's advice, I set the tile size to 256x256 for tifffile, which improved the processing speed a lot.
tifffile.imwrite(target_path, sat_map, compression="zlib", compressionargs={'level':5}, predictor=True, tile=(256,256))
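A rough way to see why the tile size matters is to count how many tiles have to be compressed and written for each setting. This is just arithmetic on the image size from my test, and tile_count is a hypothetical helper. I assume that the per-tile overhead is at least part of the difference, which I have not measured separately.

```python
import math


def tile_count(height, width, tile):
    """Number of square tiles needed to cover a height x width image."""
    rows = math.ceil(height / tile)
    cols = math.ceil(width / tile)
    return rows * cols


for tile in (64, 256):
    print(f"{tile}x{tile}: {tile_count(30000, 30000, tile)} tiles")
# 64x64: 219961 tiles
# 256x256: 13924 tiles
```

So the 64x64 setting produces roughly 16 times as many separately compressed tiles as 256x256.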