parallel-processing png tiff

Convert 5 million TIFF files to PNG


I am looking for the best and fastest way to convert 5 million TIFF files (in folders, subfolders and sub-subfolders) into 5 million PNG files (in the same directories). Is there any way to parallelise this job?

How could I then check whether all files have been converted?

ls *.tif | wc -l # compared to 
ls *.png | wc -l

but for every folder. Thanks. Marco


Solution

  • Your question is very vague on details, but you can use GNU Parallel and ImageMagick like this:

    find STARTDIRECTORY -iname "*.tif" -print0 | parallel -0 --dry-run magick {} {.}.png
    

    If that looks correct, I would make a copy of a few files in a temporary location and try it for real by removing the --dry-run. If it works ok, you can add --bar for a progress bar too.

    In general, GNU Parallel will keep N jobs running, where N is the number of CPU cores you have. You can change this with the -j parameter.

    You can set up GNU Parallel to halt on failure, on success, after a given number of failures, after the currently running jobs complete, and so on. In general, you will get an error message if any file fails to convert, but your jobs will continue until completion. Run man parallel and search for the --halt option.
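
    As a rough sketch of how -j and --halt can be combined, the function below runs a fixed number of jobs, stops starting new jobs after the first failure, and writes a job log with --joblog. The function name convert_tree, the default of 8 jobs and the log name convert.log are illustrative assumptions, not part of the original answer.

```shell
#!/bin/sh
# Sketch only: convert_tree, the default of 8 jobs and convert.log are
# hypothetical names/values chosen for illustration.
#   $1 = start directory
#   $2 = number of simultaneous jobs (default 8)
#   $3 = job log path (default convert.log)
convert_tree() {
  find "$1" -type f -iname '*.tif' -print0 |
    parallel -0 -j "${2:-8}" --halt soon,fail=1 \
      --joblog "${3:-convert.log}" magick {} {.}.png
}

# Example call (STARTDIRECTORY is a placeholder):
#   convert_tree STARTDIRECTORY 8 convert.log
# The job log is tab-separated with a header line; column 7 is the exit
# value, so failed jobs can be listed with:
#   awk -F'\t' 'NR > 1 && $7 != 0' convert.log
```

    With --halt soon,fail=1, jobs that are already running are allowed to finish; --halt now,fail=1 would kill them immediately instead.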


    Note that the above starts a new ImageMagick process for each image, which is not the most efficient approach, although it will be pretty fast on a decent machine with a good CPU, disk subsystem and RAM. You could consider using different tools such as vips if you feel like experimenting - there are a few ideas and benchmarks here.
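
    If you want to experiment with vips, one possible starting point is the same dry-run pattern with vips copy, which picks the output format from the file extension. This is an untested sketch rather than a benchmarked recommendation, and the function name preview_vips is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: preview_vips is a hypothetical helper name.
# Prints the "vips copy" commands that would be run, without running them.
#   $1 = start directory
preview_vips() {
  find "$1" -type f -iname '*.tif' -print0 |
    parallel -0 --dry-run vips copy {} {.}.png
}

# Example call (STARTDIRECTORY is a placeholder):
#   preview_vips STARTDIRECTORY
```

    As before, removing --dry-run would run the conversions for real, so it is worth trying on a copy of a few files first.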

    Depending on how your files are actually laid out, you might do better using ImageMagick's mogrify command, and getting GNU Parallel to pass as many files to each invocation as your maximum command line length permits. So, for example, if you had a whole directory of TIFFs that you wanted to make into PNGs, you can do that with a single mogrify like this:

    magick mogrify -format PNG *.tif
    

    You could pair that command with a find looking for directories something like this:

    find STARTDIRECTORY -type d -print0 | parallel -0 'cd {} && magick mogrify -format PNG *.tif'
    

    Or you could find TIFF files and pass as many as possible to each mogrify something like this:

    find STARTDIRECTORY -iname "*.tif" -print0 | parallel -0 -X magick mogrify -format PNG {}
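
    For the second part of the question (checking that everything was converted), comparing counts per folder is one option, but it may be simpler to list every TIFF that has no PNG next to it. The sketch below assumes each output keeps the base name of its input and gets a lowercase .png extension, as in the first command above; the function name list_unconverted is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: list_unconverted is a hypothetical helper name.
# Prints every TIFF under $1 that has no sibling file with the same
# base name and a .png extension.
list_unconverted() {
  find "$1" -type f -iname '*.tif' -exec sh -c '
    for tif do
      # ${tif%.*} strips the final extension, e.g. dir/a.tif -> dir/a
      [ -e "${tif%.*}.png" ] || printf "%s\n" "$tif"
    done
  ' sh {} +
}

# Example calls (STARTDIRECTORY is a placeholder):
#   list_unconverted STARTDIRECTORY            # show the missing ones
#   list_unconverted STARTDIRECTORY | wc -l    # count them
```

    An empty result means every TIFF has a matching PNG; it does not check that the PNG is valid, only that it exists.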