I'm trying to compare two images to see whether they are identical. They will have the same dimensions and may have the same file size, but the content will sometimes change, and I want to be able to detect that.
I can see two ways of doing it in my case. One is to count the number of colors in each image (in my case the number of colors changes if the images are different).
The other is to compare the files directly using an image processor.
I've opted to use ruby-vips8 because it's known to be a lot faster than RMagick, and in my case performance matters.
I've experimented a little with ruby-vips8, but I can't find a way to compare two images or to get the number of colors (so that I can compare using that method).
Any help?
ruby-vips8 is a wrapper of libvips.
http://www.rubydoc.info/gems/ruby-vips8/0.1.0/Vips/
http://www.vips.ecs.soton.ac.uk/index.php?title=VIPS
UPDATE:
Thanks to the answer from user Aetherus, I realized I don't even need ruby-vips8 for this task. I'm comparing the files as Strings (as he suggested). It's working great for me and it's also really fast.
I didn't mark his answer as the best because my question asked whether this is possible using ruby-vips8. It was a library-specific scenario, so under those conditions user894763's answer is more appropriate.
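For anyone landing here, a minimal sketch of that byte-for-byte approach might look like the following. The helper name `same_bytes?` is made up for illustration, and this is not necessarily the exact code from Aetherus's answer; it only relies on Ruby's standard `fileutils` library.

```ruby
require 'fileutils'

# Hypothetical helper: true when both files hold exactly the same bytes.
def same_bytes?(path_a, path_b)
  # Files of different sizes cannot be identical, so skip reading them.
  return false unless File.size(path_a) == File.size(path_b)

  # FileUtils.compare_file compares the contents of the two files.
  FileUtils.compare_file(path_a, path_b)
end
```

Note that this only tells you whether the files are identical. Two files that decode to the same pixels but were saved with different metadata or compression settings will be reported as different.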
There must be hundreds of ways of measuring image similarity; it's a huge field. They vary (mostly) in which features of an image they try to consider.
One family of similarity measures is based on histograms, as Scott said. These techniques don't consider how your pixels are arranged spatially, so your two images could be considered the same if one has been rotated 45 degrees. They are also fast, since finding a histogram is quick.
A simple histogram matcher might be: find the histograms of the two input images, normalise (so the two hists have the same area ... this removes differences in image size), subtract, square and sum. Now a small number means a good match, larger numbers mean increasingly poor matches.
In ruby-vips this would be:
require 'vips'
a = Vips::Image.new_from_file ARGV[0], access: :sequential
b = Vips::Image.new_from_file ARGV[1], access: :sequential
# find hists, normalise, difference, square
diff_hist = (a.hist_find.hist_norm - b.hist_find.hist_norm) ** 2
# find sum of squares ... find the average, then multiply by the size of the
# histogram
similarity = diff_hist.avg * diff_hist.width * diff_hist.height
puts "similarity = #{similarity}"
On my desktop, this runs in about 0.5s for a pair of 2k x 3k JPEG images.
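To make the arithmetic concrete, here is a rough plain-Ruby sketch of the same normalise, subtract, square and sum steps on ordinary arrays of bin counts. The helper names are hypothetical, and scaling each histogram to unit area is my assumption for "same area"; libvips may normalise differently, so the numbers will not match the ruby-vips version.

```ruby
# Hypothetical helper: scale a histogram so that its bins sum to 1.
def normalise(hist)
  total = hist.sum.to_f
  raise ArgumentError, "histogram is empty" if total.zero?

  hist.map { |count| count / total }
end

# Hypothetical helper: sum of squared differences between two histograms,
# each first scaled to unit area so that image size does not matter.
def histogram_distance(hist_a, hist_b)
  unless hist_a.length == hist_b.length
    raise ArgumentError, "histograms must have the same number of bins"
  end

  normalise(hist_a).zip(normalise(hist_b)).sum { |x, y| (x - y)**2 }
end

small   = [10, 20, 30, 40]      # a small image
large   = [100, 200, 300, 400]  # same distribution, ten times the pixels
flipped = [40, 30, 20, 10]      # a different distribution

puts histogram_distance(small, large)    # same shape, so a very small number
puts histogram_distance(small, flipped)  # different shape, so a larger number
```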
Many matchers are based on spatial distribution. A simple one is to divide the image into an 8x8 grid (like a chess-board), take the average pixel value in each square, then set that square to 0 or 1 depending on whether the average of the square is above or below the average of the whole image. This gives something like a fingerprint for the image which you can store neatly in a 64-bit int. It's insensitive to things like noise, scale changes or small rotations.
To test two images for similarity, XOR their fingerprints and count the number of set bits in the result. Again, 0 would be a perfect match, larger numbers would be less good.
In ruby-vips, you could code this as:
require 'vips'
a = Vips::Image.new_from_file ARGV[0], access: :sequential
# we need a mono image
a = a.colourspace "b-w"
# reduce to 8x8 with a box filter
a = a.shrink(a.width / 8, a.height / 8)
# set pixels to 0 for less than average, 255 for greater than average
a = a > a.avg
a.write_to_file ARGV[1]
Again, this runs in about 0.5s for a 2k x 3k JPEG.
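The program above only writes out the fingerprint image. The comparison step (XOR, then count the set bits) could be sketched in plain Ruby as below. The helper names are hypothetical, and the input is assumed to be the 64 cell averages of the 8x8 grid, read row by row; how you pull those values out of the shrunk image is left open here.

```ruby
# Hypothetical helper: pack 64 cell averages (an 8x8 grid, row by row)
# into a 64-bit integer, one bit per cell. A bit is 1 when the cell is
# brighter than the average of all cells.
def fingerprint(cell_averages)
  raise ArgumentError, "expected 64 cell averages" unless cell_averages.length == 64

  overall = cell_averages.sum.to_f / cell_averages.length
  cell_averages.reduce(0) do |bits, value|
    (bits << 1) | (value > overall ? 1 : 0)
  end
end

# Number of bits that differ between two fingerprints:
# 0 is a perfect match, 64 is as different as possible.
def hamming_distance(fp_a, fp_b)
  (fp_a ^ fp_b).to_s(2).count("1")
end

# Synthetic data: left half dark, right half bright.
dark_left  = Array.new(64) { |i| i % 8 < 4 ? 10 : 200 }
# The same picture with a little noise added.
noisy      = dark_left.map.with_index { |v, i| v + (i % 3) }
# The mirror image: left half bright, right half dark.
dark_right = dark_left.map { |v| v == 10 ? 200 : 10 }

puts hamming_distance(fingerprint(dark_left), fingerprint(noisy))
puts hamming_distance(fingerprint(dark_left), fingerprint(dark_right))
```

In practice you would pick a threshold on the distance (say, a handful of bits) rather than insisting on an exact 0, since the point of the fingerprint is to tolerate small changes.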
Yet another family is based on correlation; see spcor and friends. These might be more useful for locating a small area within a larger image.
Many fancier image similarity metrics will take a variety of algorithms, run them all, and use a set of weighting factors to compute an overall similarity measure.