How can I automatically determine whether an image file depicts a photo or a 'graphic'?


For example, using ImageMagick?


Solution

  • I am somewhat at the limits of my knowledge here, but I read a paper and have worked out a way to calculate image entropy with ImageMagick - some clever person might like to check it!

    #!/bin/bash
    image=$1
    # Get number of pixels in image
    px=$(convert -format "%w*%h\n" "$image" info:|bc)
    # Calculate entropy (natural log, so the result is in nats)
    # See this paper www1.idc.ac.il/toky/imageProc-10/Lectures/04_histogram_10.ppt
    convert "$image" -colorspace gray -depth 8 -format "%c" histogram:info:- | \
       awk -F: -v px="$px" '$1+0>0 {p=$1/px; e+=-p*log(p)} END {print e}'
    

    So, you would save the script above as entropy, then do the following once to make it executable:

    chmod +x entropy
    

    Then you can use it like this:

    ./entropy image.jpg
    

    It does seem to produce bigger numbers for true photos and lower numbers for computer graphics.
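
    The entropy script above uses the natural logarithm, so its result is in nats. Below is a minimal sketch of the same calculation in bits (log base 2), which for an 8-bit greyscale image lies between 0 and 8. The helper name entropy_bits is hypothetical (it is not an ImageMagick command), and the sketch assumes each histogram line starts with the pixel count followed by a colon. It sums the counts itself, so the separate pixel-count step is not needed.

```shell
#!/bin/bash
# entropy_bits (hypothetical helper): read ImageMagick "histogram:info:-"
# lines on stdin and print the Shannon entropy of the histogram in bits.
# Assumes each line looks like "   537: (255,255,255) #FFFFFF gray(255)".
entropy_bits() {
    awk -F: '
        $1 + 0 > 0 { count[++n] = $1 + 0; total += $1 }   # keep positive counts only
        END {
            if (total == 0) { print 0; exit }             # no histogram lines seen
            for (i = 1; i <= n; i++) {
                p = count[i] / total                      # probability of this grey level
                e -= p * log(p) / log(2)                  # log base 2 gives bits
            }
            printf "%.4f\n", e
        }'
}
```

    It would be fed from the same pipeline as before, for example: convert image.jpg -colorspace gray -depth 8 -format "%c" histogram:info:- | entropy_bits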

    Another idea would be to look at the inter-channel correlation. In digital photos the red, green and blue channels are normally quite strongly correlated: where the red component increases, the green and blue components tend to increase too, and where red decreases, green and blue tend to decrease with it. Computer graphics, by contrast, tend to use big, bold primary colours, so a big red bar-graph or pie-chart will show little correlation between the channels.

    I took a digital photo of a landscape and resized it to be 1 pixel wide and 64 pixels high, and I am showing it using ImageMagick below - you will see that where red goes down, so do green and blue...

    convert DSC01447.JPG -resize 1x64! -depth 8 txt:
    
    0,0: (168,199,235)  #A8C7EB  srgb(168,199,235)
    0,1: (171,201,236)  #ABC9EC  srgb(171,201,236)
    0,2: (174,202,236)  #AECAEC  srgb(174,202,236)
    0,3: (176,204,236)  #B0CCEC  srgb(176,204,236)
    0,4: (179,205,237)  #B3CDED  srgb(179,205,237)
    0,5: (181,207,236)  #B5CFEC  srgb(181,207,236)
    0,6: (183,208,236)  #B7D0EC  srgb(183,208,236)
    0,7: (186,210,236)  #BAD2EC  srgb(186,210,236)
    0,8: (188,211,235)  #BCD3EB  srgb(188,211,235)
    0,9: (190,212,236)  #BED4EC  srgb(190,212,236)
    0,10: (192,213,234)  #C0D5EA  srgb(192,213,234)
    0,11: (192,211,227)  #C0D3E3  srgb(192,211,227)
    0,12: (191,208,221)  #BFD0DD  srgb(191,208,221)
    0,13: (190,206,216)  #BECED8  srgb(190,206,216)
    0,14: (193,207,217)  #C1CFD9  srgb(193,207,217)
    0,15: (181,194,199)  #B5C2C7  srgb(181,194,199)
    0,16: (158,167,167)  #9EA7A7  srgb(158,167,167)
    0,17: (141,149,143)  #8D958F  srgb(141,149,143)
    0,18: (108,111,98)  #6C6F62  srgb(108,111,98)
    0,19: (89,89,74)  #59594A  srgb(89,89,74)
    0,20: (77,76,61)  #4D4C3D  srgb(77,76,61)
    0,21: (67,64,49)  #434031  srgb(67,64,49)
    0,22: (57,56,43)  #39382B  srgb(57,56,43)
    0,23: (40,40,34)  #282822  srgb(40,40,34)
    0,24: (39,38,35)  #272623  srgb(39,38,35)
    0,25: (38,37,37)  #262525  srgb(38,37,37)
    0,26: (40,39,38)  #282726  srgb(40,39,38)
    0,27: (78,78,57)  #4E4E39  srgb(78,78,57)
    0,28: (123,117,90)  #7B755A  srgb(123,117,90)
    0,29: (170,156,125)  #AA9C7D  srgb(170,156,125)
    0,30: (168,154,116)  #A89A74  srgb(168,154,116)
    0,31: (153,146,96)  #999260  srgb(153,146,96)
    0,32: (156,148,101)  #9C9465  srgb(156,148,101)
    0,33: (152,141,98)  #988D62  srgb(152,141,98)
    0,34: (151,139,99)  #978B63  srgb(151,139,99)
    0,35: (150,139,101)  #968B65  srgb(150,139,101)
    0,36: (146,135,98)  #928762  srgb(146,135,98)
    0,37: (145,136,97)  #918861  srgb(145,136,97)
    0,38: (143,133,94)  #8F855E  srgb(143,133,94)
    0,39: (140,133,92)  #8C855C  srgb(140,133,92)
    0,40: (137,133,92)  #89855C  srgb(137,133,92)
    0,41: (136,133,91)  #88855B  srgb(136,133,91)
    0,42: (131,124,81)  #837C51  srgb(131,124,81)
    0,43: (130,121,78)  #82794E  srgb(130,121,78)
    0,44: (134,123,78)  #867B4E  srgb(134,123,78)
    0,45: (135,127,78)  #877F4E  srgb(135,127,78)
    0,46: (135,129,79)  #87814F  srgb(135,129,79)
    0,47: (129,125,77)  #817D4D  srgb(129,125,77)
    0,48: (106,105,65)  #6A6941  srgb(106,105,65)
    0,49: (97,99,60)  #61633C  srgb(97,99,60)
    0,50: (120,121,69)  #787945  srgb(120,121,69)
    0,51: (111,111,63)  #6F6F3F  srgb(111,111,63)
    0,52: (95,98,55)  #5F6237  srgb(95,98,55)
    0,53: (110,111,63)  #6E6F3F  srgb(110,111,63)
    0,54: (102,105,60)  #66693C  srgb(102,105,60)
    0,55: (118,120,66)  #767842  srgb(118,120,66)
    0,56: (124,124,68)  #7C7C44  srgb(124,124,68)
    0,57: (118,120,65)  #767841  srgb(118,120,65)
    0,58: (114,116,64)  #727440  srgb(114,116,64)
    0,59: (113,114,63)  #71723F  srgb(113,114,63)
    0,60: (116,117,64)  #747540  srgb(116,117,64)
    0,61: (118,118,65)  #767641  srgb(118,118,65)
    0,62: (118,117,65)  #767541  srgb(118,117,65)
    0,63: (114,114,62)  #72723E  srgb(114,114,62)
    

    Statistically, this is the covariance. I would tend to use the red and green channels of a photo to evaluate it: in a Bayer grid there are two green sites for each red site and each blue site, so the green channel is averaged across the two and is therefore the least susceptible to noise, while the blue is the most susceptible. So the code for measuring the covariance can be written like this:

    #!/bin/bash
    # Calculate Red Green covariance of image supplied as parameter
    image=$1
    convert "$image" -depth 8 txt: | awk '
        /^#/ {next}      # skip the "# ImageMagick pixel enumeration" header line
        {
         # Take the text between the first "(" and ")", e.g. "168,199,235"
         s=$0; sub(/^[^(]*\(/,"",s); sub(/\).*/,"",s); gsub(/ /,"",s)
         split(s,a,",")
         n++; R[n]=a[1]; G[n]=a[2]
         # B[n]=a[3]
        }
        END{
          # Calculate mean of R and G
          for(i=1;i<=n;i++){
             Rmean=Rmean+R[i]
             Gmean=Gmean+G[i]
             #Bmean=Bmean+B[i]
          }
          Rmean=Rmean/n
          Gmean=Gmean/n
          #Bmean=Bmean/n
          # Calculate Green-Red covariance
          for(i=1;i<=n;i++){
              GRcov+=(G[i]-Gmean)*(R[i]-Rmean)
              #GBcov+=(G[i]-Gmean)*(B[i]-Bmean)
          }
          GRcov=GRcov/n
          #GBcov=GBcov/n
          print "Green Red covariance: ",GRcov
          #print "GBcovariance: ",GBcov
        }'
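
    A raw covariance grows with the contrast of the image, which makes it awkward to compare across images. One possible refinement, not part of the script above, is to divide by the two standard deviations, which gives the Pearson correlation coefficient in the range -1 to 1. The sketch below is untested on real photos; rg_correlation is a hypothetical helper name, and it assumes txt: lines of the form shown earlier, with the pixel values inside the first pair of parentheses.

```shell
#!/bin/bash
# rg_correlation (hypothetical helper): read ImageMagick "txt:" pixel lines on
# stdin and print the Pearson correlation between the red and green channels.
rg_correlation() {
    awk '
        /^#/ { next }                                  # skip the enumeration header
        {
            s = $0
            sub(/^[^(]*\(/, "", s)                     # drop everything up to the first "("
            sub(/\).*/, "", s)                         # drop the first ")" and what follows
            gsub(/ /, "", s)                           # tolerate padded values
            split(s, a, ",")
            r = a[1] + 0; g = a[2] + 0
            n++; sr += r; sg += g; srr += r * r; sgg += g * g; srg += r * g
        }
        END {
            if (n == 0) { print "nan"; exit }          # no pixel lines seen
            cov = srg / n - (sr / n) * (sg / n)        # covariance of R and G
            vr  = srr / n - (sr / n) * (sr / n)        # variance of R
            vg  = sgg / n - (sg / n) * (sg / n)        # variance of G
            if (vr <= 0 || vg <= 0) { print "nan"; exit }   # flat channel: undefined
            printf "%.4f\n", cov / sqrt(vr * vg)
        }'
}
```

    It would be used as: convert image.jpg -depth 8 txt: | rg_correlation. The testing described next was done with the plain covariance script, not with this variant.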
    

    I did some testing and the covariance test also works quite well. However, graphics with big white or black backgrounds appear to be well correlated too, because red=green=blue on white and black (and on all grey-toned areas), so you would need to be careful with them. That leads to another thought: photos almost never contain pure white or pure black (unless really poorly exposed), whereas graphics often do have white backgrounds. So another test you could use would be to count the solid black and solid white pixels like this:

    convert photo.jpg -colorspace gray -depth 8 -format "%c" histogram:info:- | grep -E "\(0\)|\(255\)"
         2: (  0,  0,  0) #000000 gray(0)
       537: (255,255,255) #FFFFFF gray(255)
    

    This one has 2 black and 537 pure white pixels.
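
    Raw counts like these depend on the image size, so a fraction of the total may be easier to compare between images. Here is a minimal sketch under that assumption: extreme_fraction is a hypothetical helper name, and it matches on the hex column (#000000 and #FFFFFF) rather than on gray(0) and gray(255), assuming an 8-bit greyscale histogram with no alpha channel.

```shell
#!/bin/bash
# extreme_fraction (hypothetical helper): read "histogram:info:-" lines for an
# 8-bit greyscale image on stdin and print the fraction of pixels that are
# pure black (#000000) or pure white (#FFFFFF).
extreme_fraction() {
    awk -F: '
        $1 + 0 > 0 {
            total += $1                                   # every histogram bin counts
            if ($0 ~ /#(000000|FFFFFF)/) extreme += $1    # only the two end bins
        }
        END {
            if (total == 0) { print 0; exit }             # no histogram lines seen
            printf "%.4f\n", extreme / total
        }'
}
```

    It would be used as: convert photo.jpg -colorspace gray -depth 8 -format "%c" histogram:info:- | extreme_fraction. Any cut-off for calling an image a graphic would have to be chosen by experiment.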

    I should imagine you probably have enough for a decent heuristic now!

    Following on from my comment, you can use these ImageMagick commands:

    # Get EXIF information
    identify -format "%[EXIF*]" image.jpg
    
    # Get number of colours
    convert image.jpg -format "%k" info:
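
    The number of unique colours can never exceed the number of pixels, so a small image can never score high on the raw count. One way to allow for that, offered only as a sketch, is to look at colours per pixel. The helper name colour_ratio is hypothetical, and no particular threshold is implied.

```shell
#!/bin/bash
# colour_ratio (hypothetical helper): given the unique colour count and the
# pixel count as arguments, print the number of unique colours per pixel.
colour_ratio() {
    awk -v k="$1" -v px="$2" 'BEGIN {
        if (px + 0 <= 0) { print "nan"; exit }   # avoid dividing by zero
        printf "%.4f\n", k / px
    }'
}
```

    Both inputs come from commands already shown in this answer, for example: colour_ratio "$(convert image.jpg -format "%k" info:)" "$(convert image.jpg -format "%w*%h\n" info: | bc)"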
    

    Other parameters may be suggested by other responders, and you can find most of them using:

    identify -verbose image.jpg