Tags: python, image, command-line, imagemagick

Combine multiple images using command line tool


I'm trying to recombine multiple images into a single image using a command line tool like ImageMagick.

I have 30,000 folders, and each folder contains nearly 50 images. The images are tiles of a larger image that has been split up. Each image is prefixed with its xy position,

e.g. folder1/01-imagename, folder1/02-imagename, folder1/03-imagename, folder1/10-imagename, and so on

example here.

00-zzi.....=x0-y0-z2.jpg
01-zzi.....=x0-y1-z2.jpg
02-zzi.....=x0-y2-z2.jpg
03-zzi.....=x0-y3-z2.jpg

Each tile image is 512x512 and is typically smaller than 50 kB.

I'm trying to figure out whether ImageMagick's composite capability is the right tool, or whether there are better suggestions.

Thanks!

Debian GNU/Linux 11

identify image output: 
Image:
  Filename: 00-zzi3ZROdz3Lq8hTj7hy2ghoChBAv2D2-9bU_jPT-D4b_jXTraQfK81DEuQ=x0-y0-z2.jpg
  Format: JPEG (Joint Photographic Experts Group JFIF format)
  Mime type: image/jpeg
  Class: DirectClass
  Geometry: 512x512+0+0
  Units: Undefined
  Colorspace: sRGB
  Type: Grayscale
  Base type: Undefined
  Endianness: Undefined
  Depth: 8-bit
  Channel depth:
    red: 8-bit
    green: 8-bit
    blue: 8-bit
  Channel statistics:
    Pixels: 262144
    Red:
      min: 0  (0)
      max: 19 (0.0745098)
      mean: 3.06709 (0.0120278)
      standard deviation: 0.61571 (0.00241455)
      kurtosis: 28.6716
      skewness: 2.14642
      entropy: 0.251511
    Green:
      min: 0  (0)
      max: 19 (0.0745098)
      mean: 3.06709 (0.0120278)
      standard deviation: 0.61571 (0.00241455)
      kurtosis: 28.6716
      skewness: 2.14642
      entropy: 0.251511
    Blue:
      min: 0  (0)
      max: 19 (0.0745098)
      mean: 3.06709 (0.0120278)
      standard deviation: 0.61571 (0.00241455)
      kurtosis: 28.6716
      skewness: 2.14642
      entropy: 0.251511
  Image statistics:
    Overall:
      min: 0  (0)
      max: 19 (0.0745098)
      mean: 3.06709 (0.0120278)
      standard deviation: 0.61571 (0.00241455)
      kurtosis: 28.6717
      skewness: 2.14643
      entropy: 0.251511
  Colors: 18
  Histogram:
    560: (0,0,0) #000000 black
    2814: (1,1,1) #010101 srgb(1,1,1)
    15055: (2,2,2) #020202 srgb(2,2,2)
    212826: (3,3,3) #030303 grey1
    24467: (4,4,4) #040404 srgb(4,4,4)
    5004: (5,5,5) #050505 grey2
    896: (6,6,6) #060606 srgb(6,6,6)
    237: (7,7,7) #070707 srgb(7,7,7)
    113: (8,8,8) #080808 grey3
    72: (9,9,9) #090909 srgb(9,9,9)
    43: (10,10,10) #0A0A0A grey4
    24: (11,11,11) #0B0B0B srgb(11,11,11)
    14: (12,12,12) #0C0C0C srgb(12,12,12)
    7: (13,13,13) #0D0D0D grey5
    5: (14,14,14) #0E0E0E srgb(14,14,14)
    3: (15,15,15) #0F0F0F grey6
    1: (16,16,16) #101010 srgb(16,16,16)
    3: (19,19,19) #131313 srgb(19,19,19)
  Rendering intent: Perceptual
  Gamma: 0.454545
  Chromaticity:
    red primary: (0.64,0.33)
    green primary: (0.3,0.6)
    blue primary: (0.15,0.06)
    white point: (0.3127,0.329)
  Background color: white
  Border color: srgb(223,223,223)
  Matte color: grey74
  Transparent color: black
  Interlace: None
  Intensity: Undefined
  Compose: Over
  Page geometry: 512x512+0+0
  Dispose: Undefined
  Iterations: 0
  Compression: JPEG
  Quality: 90
  Orientation: Undefined
  Properties:
    date:create: 2022-09-07T00:51:18+00:00
    date:modify: 2022-09-07T00:51:18+00:00
    jpeg:colorspace: 2
    jpeg:sampling-factor: 2x2,1x1,1x1
    signature: ebb4af08227671b45fa62c44887f9b94a8a17d3a7d6c418c26be0e032b766359
  Artifacts:
    filename: 00-zzi3ZROdz3Lq8hTj7hy2ghoChBAv2D2-9bU_jPT-D4b_jXTraQfK81DEuQ=x0-y0-z2.jpg
    verbose: true
  Tainted: False
  Filesize: 3764B
  Number pixels: 262144
  Pixels per second: 67.2793MB
  User time: 0.000u
  Elapsed time: 0:01.003
  Version: ImageMagick 6.9.11-60 Q16 x86_64 2021-01-25 https://imagemagick.org

Hi Mark, thank you for your help so far. I have been doing some testing and I am very nearly there! I had to change the code that gets the list of images to use egrep, because it was not finding the files. I have changed it to:

row=( $(ls | egrep *-y${y}-z 2> /dev/null) ) 

The final hurdle is that when I attempt to process a smaller directory of 10 folders as a test of parallel processing with

find "tiled_images" -type d -print ./processOne {} 

it does not seem to print the folder names after the command, and instead shows:

find: paths must precede expression. 

Solution

  • As I see it, there are two aspects to this:

    • processing 30,000 directories
    • processing one directory

    IMHO, the best way to process 30,000 directories is in parallel, otherwise you'll be there all day. So I would suggest writing the processing as a script that handles one directory, passed as a single parameter, and then using a GNU Parallel job to process all 30,000 directories, keeping all your CPU cores busy until every directory is done.

    So, if your directories are under a top-level directory called "tiled_images", and you save the script in the next part of my answer as processOne.sh, you could do this:

    find "tiled_images" -type d -print | parallel ./processOne.sh {}
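
The pipe matters here: without the `|`, find receives `./processOne` and `{}` as its own arguments, which is what produces "paths must precede expression". As a hedged variation, the sketch below assumes GNU find and GNU Parallel, and the helper names list_tile_dirs and run_all are hypothetical. It adds -mindepth 1 so the top-level directory itself is not handed to the script, passes NUL-delimited names so unusual directory names survive, and uses --dry-run so the commands are only printed.

```shell
# List every directory below the top-level directory given as $1,
# NUL-delimited, skipping the top-level directory itself.
list_tile_dirs() {
    find "$1" -mindepth 1 -type d -print0
}

# Show the command that would run for each directory.
# Remove --dry-run to actually execute processOne.sh.
run_all() {
    list_tile_dirs "$1" | parallel -0 --dry-run ./processOne.sh {}
}
```

Calling `run_all tiled_images` should then print one `./processOne.sh DIRECTORY` line per tile directory, which is a cheap way to check the list before committing to the full run.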
    

    GNU Parallel has many options; here are a few of the most useful:

    • parallel --eta ... will show you the "Estimated Time of Arrival" of job completion

    • parallel --bar ... will show you a progress bar, and works with zenity

    • parallel -j 4 ... will run just 4 jobs at a time

    • parallel -j 50% ... will keep half your CPU cores busy
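
These options can be combined. The sketch below is one possible wrapper, assuming GNU Parallel is installed; the function name run_tiles and the log file name parallel.log are hypothetical. It also adds --joblog, which is not mentioned above, to keep a record of each job's exit status, since that may be handy across 30,000 directories.

```shell
# Run processOne.sh on every directory under $1.
# $2 is the job limit passed to -j (a number or a percentage of CPU
# cores); it defaults to 50% when omitted.
run_tiles() {
    find "$1" -type d -print |
        parallel --eta --joblog parallel.log -j "${2:-50%}" ./processOne.sh {}
}
```

For example, `run_tiles tiled_images 4` would cap the run at 4 simultaneous jobs, while `run_tiles tiled_images` would use half the cores.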


    Now to the processing of a single directory, whose name is passed as parameter:

    #!/bin/bash
    
    # Expect one parameter - the directory name
    [ $# -eq 1 ] || { >&2 echo "Usage: $0 DIRECTORY"; exit 1; }
    
    d=$1
    
    cd "$d" || { >&2 echo "ERROR: Directory $d does not exist"; exit 1; }
    
    # Assume no more than 100 rows of tiles since fewer than 50 images altogether, and presumably more than 1 image per row
    for ((y=0;y<100;y++)) ; do
    
       # Get list of images in this row
       row=( $(ls *-y${y}-z2.jpg 2> /dev/null) )
    
       # Break out of loop if no images
       [ -z "$row" ] && break
    
       # Formulate output filename for this row, being sure that it is zero-padded so the rows collate in correct order
       # Also, write to MPC, or Magick Pixel Cache format, which should be fastest to write and read later
       printf -v out "row-%02d.mpc" $y
    
       echo "Processing row: ${y}"
       echo "  concatenating: ${row[@]}"
       echo "  into: ${out}"
       magick "${row[@]}" +append "$out"
    done
    
    # Concatenate rows into result
    magick row-*mpc -append result.jpg
    
    # You should clean up here when it is tested
    # rm *.mpc *.cache
    

    You would then save this as processOne.sh and make it executable with:

    chmod +x processOne.sh
    

    Then test it on a single directory with:

    ./processOne.sh SOME_DIRECTORY_CONTAINING_TILES
    

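    Note that +append concatenates images side-by-side, whereas -append (different sign) concatenates images above-and-below each other. Note also that magick is the ImageMagick 7 command; on ImageMagick 6, such as the 6.9.11 shown in the question, use convert instead.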
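
To see the two operators side by side without touching real tiles, here is a small sketch that builds a 3x2 grid from solid-colour canvases. The function names im and demo_append are hypothetical, and the colours are arbitrary. The im wrapper is an assumption about your setup: it calls magick when that command exists and falls back to convert otherwise.

```shell
# Call ImageMagick through whichever entry point is installed:
# `magick` (version 7) or `convert` (version 6).
im() {
    if command -v magick >/dev/null 2>&1; then
        magick "$@"
    else
        convert "$@"
    fi
}

# Build two rows of three 512x512 canvases. Each parenthesised group is
# joined left-to-right with +append, then the two rows are stacked
# top-to-bottom with -append and written to the file named in $1.
demo_append() {
    im \( -size 512x512 xc:red  xc:green   xc:blue   +append \) \
       \( -size 512x512 xc:cyan xc:magenta xc:yellow +append \) \
       -append "$1"
}
```

Running `demo_append grid.png` should give a 1536x1024 image, three tiles wide and two tiles high, which makes it easy to confirm which sign does what.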


    Note that you could speed this up by avoiding creation of intermediate files, but that might complicate things and make debugging harder. Just for reference, that would look something very much like this:

    ...
    ...
    
    # Assume no more than 100 rows of tiles since fewer than 50 images altogether, and presumably more than 1 image per row
    for ((y=0;y<100;y++)) ; do
    
       # Get list of images in this row
       row=( $(ls *-y${y}-z2.jpg 2> /dev/null) )
    
       # Break out of loop if no images
       [ -z "$row" ] && break
    
       # Append images to make a row and pass to outer `magick` command
       magick "${row[@]}" +append miff:-
    
    done | magick miff:- -append result.jpg
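
The "get list of images in this row" step relies on the shell's alphabetical glob order and on a fixed z2 suffix. As a hedged alternative, the sketch below parses the coordinates out of each name and orders a row by numeric x, so that x10 sorts after x9. It assumes every tile name ends in =xN-yN-zN.jpg as in the question and contains no whitespace; the helper names tile_xyz and row_tiles are hypothetical.

```shell
# Print "x y z" parsed from a tile name such as 01-zzi...=x0-y1-z2.jpg.
# Prints nothing and returns non-zero if the name does not match.
tile_xyz() {
    local xyz
    xyz=$(printf '%s\n' "$1" |
        sed -n 's/.*=x\([0-9][0-9]*\)-y\([0-9][0-9]*\)-z\([0-9][0-9]*\)\.jpg$/\1 \2 \3/p')
    [ -n "$xyz" ] && printf '%s\n' "$xyz"
}

# Print the tiles of row $1 found in the current directory,
# one per line, ordered by numeric x.
row_tiles() {
    for f in *"-y$1-z"*.jpg; do
        [ -e "$f" ] || continue              # the glob matched nothing
        xyz=$(tile_xyz "$f") || continue     # skip names that do not parse
        printf '%s\t%s\n' "${xyz%% *}" "$f"  # emit "x<TAB>filename"
    done | sort -n | cut -f2-
}
```

Inside the loop, `row=( $(row_tiles "$y") )` could then stand in for the ls-based line, with the same caveat about whitespace in file names.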