c++ opencv image-processing computer-vision template-matching

Template Matching with tolerance in OpenCV

I'm using OpenCV and C++. I want to check whether an image is part of another image, and I have already found a function called matchTemplate, which works. But what if the template image is a little bit different? Is there a function or approach like matchTemplate that checks whether a template is part of a source image, but with tolerance for position, angle, size and maybe even deformation? Or do I need a completely different approach than template matching?

Here's my code so far, which finds a template image in a source image, but without (or almost without) tolerance.

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>

using namespace cv;
using namespace std;

/// Global Variables
Mat img; Mat templ; Mat result;
const char* image_window = "Source Image";
const char* result_window = "Result window";

int match_method;
int max_Trackbar = 5;

/// Function Headers
void MatchingMethod( int, void* );

/**
* @function main
*/
int main( int, char** argv )
{
  /// Load image and template
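  img = imread( "a1.jpg", 1 );
  templ = imread( "a2.jpg", 1 );

  /// Stop early if either file could not be read
  if( img.empty() || templ.empty() )
    { cerr << "Could not load a1.jpg or a2.jpg" << endl; return -1; }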

  /// Create windows
  namedWindow( image_window, WINDOW_AUTOSIZE );
  namedWindow( result_window, WINDOW_AUTOSIZE );

  /// Create Trackbar
  const char* trackbar_label = "Method: \n 0: SQDIFF \n 1: SQDIFF NORMED \n 2: TM CCORR \n 3: TM CCORR NORMED \n 4: TM COEFF \n 5: TM COEFF NORMED";
  createTrackbar( trackbar_label, image_window, &match_method, max_Trackbar, MatchingMethod );

  MatchingMethod( 0, 0 );

  waitKey(0);
  return 0;
}

/**
* @function MatchingMethod
* @brief Trackbar callback
*/
void MatchingMethod( int, void* )
{
  /// Source image to display
  Mat img_display;
  img.copyTo( img_display );

  /// Create the result matrix
  int result_cols = img.cols - templ.cols + 1;
  int result_rows = img.rows - templ.rows + 1;

  /// Mat::create takes (rows, cols, type)
  result.create( result_rows, result_cols, CV_32FC1 );

  /// Do the Matching and Normalize
  matchTemplate( img, templ, result, match_method );
  normalize( result, result, 0, 1, NORM_MINMAX, -1, Mat() );

  /// Localizing the best match with minMaxLoc
  double minVal; double maxVal; Point minLoc; Point maxLoc;
  Point matchLoc;

  minMaxLoc( result, &minVal, &maxVal, &minLoc, &maxLoc, Mat() );


  /// For SQDIFF and SQDIFF_NORMED, the best matches are lower values. For all the other methods, the higher the better
  if( match_method == TM_SQDIFF || match_method == TM_SQDIFF_NORMED )
    { matchLoc = minLoc; }
  else
    { matchLoc = maxLoc; }

  /// Show me what you got
  rectangle( img_display, matchLoc, Point( matchLoc.x + templ.cols , matchLoc.y + templ.rows ), Scalar::all(0), 2, 8, 0 );
  rectangle( result, matchLoc, Point( matchLoc.x + templ.cols , matchLoc.y + templ.rows ), Scalar::all(0), 2, 8, 0 );

  imshow( image_window, img_display );
  imshow( result_window, result );

  return;
}

The images I'm using in my code:

source image a1.jpg template image a2.jpg

Solution

You've identified the major limitation of template matching: it is very fragile to any deformation of the image. Template matching works by sliding a template-sized box across the image and checking the similarity between the template and the region inside the box. Similarity is measured with a pixel-by-pixel comparison, such as normalized cross-correlation. If you want to allow different sizes and rotations, you'll need to write a loop that scales the original template up or down, or rotates it, and runs the match again for each variant. Every extra scale or angle costs another full pass over the image, so this gets really inefficient.

If you want to allow deformation, and also search more efficiently across scales and rotations, a standard method is feature-based matching with a detector such as SURF. It's efficient, and quite accurate if your images have good resolution, which yours do. You can find tutorials and sample code online for finding objects using SURF. Basically, SURF identifies keypoints (distinctive image regions) in both the template and the image. Then you find the region in the image with the largest number of keypoints that match the template. (If you're already doing this, and it's what you meant by "feature matching," then I think you're on the right track.)