I have an image where I need to detect an object as fast as possible. I also know that I only need to detect the object closest to the center.
    for x in range(width):
        for y in range(height):
            value = calcSimilarity(inputImage, searchedImage, x, y)
            matched[x][y] = value
After that, I have to loop through the resulting image and find the point closest to the center, which is quite wasteful.
    coordsGen = CoordsGen()  # a class that generates specific coords for me
    while not coordsGen.stop:
        x, y = coordsGen.next()
        value = calcSimilarity(inputImage, searchedImage, x, y)
        if value > threshold:
            return x, y
Basically what I need here is the calcSimilarity function. This would allow me to optimize the process greatly.
There are many choices of similarity scoring methods for template matching in general.*
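OpenCV has 3 available template matching modes: sum of squared differences (TM_SQDIFF), cross-correlation (TM_CCORR), and mean-shifted cross-correlation, i.e. the correlation coefficient (TM_CCOEFF).
Each of those three also has a normed/scaled version in OpenCV: TM_SQDIFF_NORMED, TM_CCORR_NORMED, and TM_CCOEFF_NORMED.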
You can see the actual formulas used in the OpenCV docs under TemplateMatchModes, though they agree with the general formulas you can find everywhere for these methods.
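To make the scoring concrete, here is a minimal NumPy sketch of what a per-position `calcSimilarity` could look like for three of those measures. The function name `calc_similarity`, its `method` strings, and the choice to return 0.0 for a flat patch are my own assumptions for illustration, not OpenCV's API; it scores a single position only and makes no attempt at OpenCV's speed.

```python
import numpy as np

def calc_similarity(image, template, x, y, method="ccoeff_normed"):
    """Score a grayscale template against the image patch with top-left corner (x, y)."""
    h, w = template.shape
    if x < 0 or y < 0 or y + h > image.shape[0] or x + w > image.shape[1]:
        raise ValueError("template does not fit inside the image at (x, y)")
    patch = image[y:y + h, x:x + w].astype(np.float64)
    templ = template.astype(np.float64)

    if method == "sqdiff":
        # Sum of squared differences: 0 is a perfect match, larger is worse.
        return float(((patch - templ) ** 2).sum())

    if method == "ccoeff_normed":
        # Mean-shift both sides first, which gives the correlation coefficient.
        patch = patch - patch.mean()
        templ = templ - templ.mean()
    elif method != "ccorr_normed":
        raise ValueError(f"unknown method: {method}")

    # Normalized cross-correlation: 1 is a perfect match.
    denom = np.sqrt((patch ** 2).sum() * (templ ** 2).sum())
    if denom == 0:
        return 0.0  # assumption: treat a flat patch or template as "no match"
    return float((patch * templ).sum() / denom)
```

Note the direction of the score: for the squared-difference measure lower is better, while for the correlation measures higher is better, so a check like `value > threshold` only makes sense for the latter.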
You can code the template matching yourself instead of using OpenCV. Note, however, that OpenCV is optimized for these operations and is generally very fast at template matching. It uses a DFT to perform some of these computations, which reduces the computational load.
You can also use OpenCV's minMaxLoc() to find the minimum or maximum value instead of looping through the result yourself. Also, you didn't specify how you're accessing your values, but not all lookup methods are equally fast. See "How to scan images" for the fastest Mat access operations. Spoiler: raw pointers.
The main speedup your optimization would offer is early termination of the search. However, I don't think you'll achieve faster times in general by coding it yourself, unless the template is usually found in a significantly smaller subset of the original image.
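If you do want to try the early-termination route, a rough sketch of the idea could look like this. `center_out_coords` and `find_nearest_match` are hypothetical names, and `score_at` is a stand-in for whatever per-position similarity function you end up with (assumed here to return higher values for better matches).

```python
def center_out_coords(width, height):
    """Yield every (x, y) of a width x height grid, nearest to the grid center first."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    coords = [(x, y) for y in range(height) for x in range(width)]
    # Sorting once up front is cheap compared with evaluating the similarity.
    coords.sort(key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    yield from coords

def find_nearest_match(score_at, width, height, threshold):
    """Return the first (x, y) whose score exceeds threshold, or None if there is none."""
    for x, y in center_out_coords(width, height):
        if score_at(x, y) > threshold:
            return x, y  # early termination: remaining positions are never scored
    return None
```

Here `width` and `height` are the dimensions of the grid of candidate positions, not necessarily of the image itself. The worst case (no match anywhere) still scores every position, which is why this only pays off when matches tend to sit near the center.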
A better method to reduce search time if your images are very big would be to use a pyramid resolution approach. Basically, make copies of the template and search images at 1/2 the original size, then 1/2 of that, 1/2 of that, and so on. Then you start the template matching on a small image, 1/16 the size or whatever, and find the general location of the template. Then you do the same for the next image size up, but you only search a small subset around where your template was at the previous scale. Each time you grow the image size closer to the original, you're only looking for small differences of a few pixels to nail down the position more accurately. The general location is first found with the smallest scaled image, which takes only a fraction of the time needed at the original image size, and then you simply refine it by scaling up.
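Here is a rough NumPy-only sketch of that coarse-to-fine search, using sum of squared differences as the score. All the names (`downsample`, `best_ssd`, `pyramid_match`) and the defaults `levels=3` and `radius=2` are assumptions for illustration. The 2x2 block averaging is a crude stand-in for a proper blurred pyramid such as the one `cv2.pyrDown` produces, and the template is assumed to be large enough to survive the downsampling.

```python
import numpy as np

def downsample(img):
    """Halve each dimension by averaging 2x2 blocks."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2] + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def best_ssd(image, template, xs, ys):
    """Brute-force SSD over the given top-left positions; returns the best (x, y)."""
    h, w = template.shape
    best, best_xy = np.inf, None
    for y in ys:
        for x in xs:
            score = ((image[y:y + h, x:x + w] - template) ** 2).sum()
            if score < best:
                best, best_xy = score, (x, y)
    return best_xy

def pyramid_match(image, template, levels=3, radius=2):
    """Coarse-to-fine match; radius (>= 1) is the refinement window in pixels."""
    images = [image.astype(np.float64)]
    templates = [template.astype(np.float64)]
    for _ in range(levels - 1):
        images.append(downsample(images[-1]))
        templates.append(downsample(templates[-1]))

    # Full search only at the coarsest (smallest) level.
    img, tpl = images[-1], templates[-1]
    x, y = best_ssd(img, tpl,
                    range(img.shape[1] - tpl.shape[1] + 1),
                    range(img.shape[0] - tpl.shape[0] + 1))

    # Walk back up: double the estimate, then search a small window around it.
    for img, tpl in zip(reversed(images[:-1]), reversed(templates[:-1])):
        max_x, max_y = img.shape[1] - tpl.shape[1], img.shape[0] - tpl.shape[0]
        xs = range(max(0, 2 * x - radius), min(max_x, 2 * x + radius) + 1)
        ys = range(max(0, 2 * y - radius), min(max_y, 2 * y + radius) + 1)
        x, y = best_ssd(img, tpl, xs, ys)
    return x, y
```

One caveat with this simple version: on very noisy images the coarse levels can miss the target if the object does not line up with the 2x2 blocks, which is one reason real pyramids blur before subsampling.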
* Note that OpenCV doesn't include other template matching methods which you may see elsewhere. In particular, OpenCV has a sum of squared differences method but no sum of absolute differences method. Phase differences are also used as a similarity metric, but don't exist in OpenCV. Either way, cross-correlation and sum of squared differences are both extremely common in image processing and, unless you have a special image domain, should work fine.