matlab opencv image-processing computer-vision mex

Issues with imgIdx in DescriptorMatcher mexopencv

My idea is simple. I am using mexopencv and trying to detect whether my current frame contains an object that matches any image stored in my database. I am using the OpenCV DescriptorMatcher to train on my images. Here is a snippet. I want to build on top of the one-to-one image matching done with mexopencv, which can also be extended to an image stream.

function hello

    detector = cv.FeatureDetector('ORB');
    extractor = cv.DescriptorExtractor('ORB');
    matcher = cv.DescriptorMatcher('BruteForce-Hamming');

    train = [];
    for i=1:3
        train(i).img = [];
        train(i).points = [];
        train(i).features = [];    
    end;

    train(1).img = imread('D:\test\1.jpg');
    train(2).img = imread('D:\test\2.png');
    train(3).img = imread('D:\test\3.jpg');


    for i=1:3

        frameImage = train(i).img;
        framePoints = detector.detect(frameImage);
        frameFeatures = extractor.compute(frameImage , framePoints);

       train(i).points = framePoints;
       train(i).features = frameFeatures;

    end;

    for i = 1:3 
        boxfeatures = train(i).features;
        matcher.add(boxfeatures);
    end;
    matcher.train();

    camera = cv.VideoCapture;
    pause(3);%Sometimes necessary 

    window = figure('KeyPressFcn',@(obj,evt)setappdata(obj,'flag',true));
    setappdata(window,'flag',false);

    while(true)

      sceneImage = camera.read; 
      sceneImage = rgb2gray(sceneImage);

      scenePoints = detector.detect(sceneImage);
      sceneFeatures = extractor.compute(sceneImage,scenePoints);

      m = matcher.match(sceneFeatures);

      %{
      %Comments in
      img_no = m.imgIdx;
      img_no = img_no(1);

      %I am planning to do this based on the fact that
      %on a perfect match imgIdx a 1xN will be filled
      %with the index of the training  
      %example 1,2 or 3 

      objPoints = train(img_no+1).points;
      boxImage = train(img_no+1).img;

      ptsScene = cat(1,scenePoints([m.queryIdx]+1).pt);
      ptsScene = num2cell(ptsScene,2);

      ptsObj = cat(1,objPoints([m.trainIdx]+1).pt);
      ptsObj = num2cell(ptsObj,2);

      %This is where the problem starts here, assuming the 
      %above is correct , Matlab yells this at me 
      %index exceeds matrix dimensions.

      [H,inliers] = cv.findHomography(ptsScene,ptsObj,'Method','Ransac');
      m = m(inliers);

      imgMatches = cv.drawMatches(sceneImage,scenePoints,boxImage,objPoints,m,...
       'NotDrawSinglePoints',true);
      imshow(imgMatches);

     %Comment out
     %}

      flag = getappdata(window,'flag');
      if isempty(flag) || flag, break; end
      pause(0.0001);

end

The issue is that imgIdx is a 1xN matrix containing the indices of different training images, which is expected. Only on a perfect match is imgIdx filled entirely with the index of the matched image. So how do I use this matrix to pick the right image index? Also, the following two lines raise an "index exceeds matrix dimensions" error.

ptsObj = cat(1,objPoints([m.trainIdx]+1).pt);
ptsObj = num2cell(ptsObj,2);

This makes sense: while debugging I saw that the size of m.trainIdx is greater than that of objPoints, i.e. I am accessing points that I should not, hence the "index exceeds" error. There is scant documentation on the use of imgIdx, so I would appreciate help from anyone who knows this subject. These are the images I used.

Image1

Image2

Image3

1st update after @Amro's response:

With the threshold set to 3.6 times the minimum distance, I get the following result.

For 3.6

With the threshold set to 1.6 times the minimum distance, I get the following result.

For 1.6

Solution

I think it is easier to explain with code, so here it goes :)

%% init
detector = cv.FeatureDetector('ORB');
extractor = cv.DescriptorExtractor('ORB');
matcher = cv.DescriptorMatcher('BruteForce-Hamming');

urls = {
    'http://i.imgur.com/8Pz4M9q.jpg?1'
    'http://i.imgur.com/1aZj0MI.png?1'
    'http://i.imgur.com/pYepuzd.jpg?1'
};

N = numel(urls);
train = struct('img',cell(N,1), 'pts',cell(N,1), 'feat',cell(N,1));

%% training
for i=1:N
    % read image
    train(i).img = imread(urls{i});
    if ~ismatrix(train(i).img)
        train(i).img = rgb2gray(train(i).img);
    end

    % extract keypoints and compute features
    train(i).pts = detector.detect(train(i).img);
    train(i).feat = extractor.compute(train(i).img, train(i).pts);

    % add to training set to match against
    matcher.add(train(i).feat);
end
% build index
matcher.train();

%% testing
% let's create a distorted query image from one of the training images
% (rotation+shear transformations)
t = -pi/3;    % -60 degrees angle
tform = [cos(t) -sin(t) 0; 0.5*sin(t) cos(t) 0; 0 0 1];
img = imwarp(train(3).img, affine2d(tform));    % try all three images here!

% detect features in query image
pts = detector.detect(img);
feat = extractor.compute(img, pts);

% match against training images
m = matcher.match(feat);

% keep only good matches
%hist([m.distance])
m = m([m.distance] < 3.6*min([m.distance]));

% sort by distances, and keep at most the first/best 200 matches
[~,ord] = sort([m.distance]);
m = m(ord);
m = m(1:min(200,numel(m)));

% naive classification (majority vote)
tabulate([m.imgIdx])    % how many matches each training image received
idx = mode([m.imgIdx]);

% matches with keypoints belonging to chosen training image
mm = m([m.imgIdx] == idx);

% estimate homography (used to locate object in query image)
ptsQuery = num2cell(cat(1, pts([mm.queryIdx]+1).pt), 2);
ptsTrain = num2cell(cat(1, train(idx+1).pts([mm.trainIdx]+1).pt), 2);
[H,inliers] = cv.findHomography(ptsTrain, ptsQuery, 'Method','Ransac');

% show final matches
imgMatches = cv.drawMatches(img, pts, ...
    train(idx+1).img, train(idx+1).pts, ...
    mm(logical(inliers)), 'NotDrawSinglePoints',true);

% apply the homography to the corner points of the training image
[h,w] = size(train(idx+1).img);
corners = permute([0 0; w 0; w h; 0 h], [3 1 2]);
p = cv.perspectiveTransform(corners, H);
p = permute(p, [2 3 1]);

% show where the training object is located in the query image
opts = {'Color',[0 255 0], 'Thickness',4};
imgMatches = cv.line(imgMatches, p(1,:), p(2,:), opts{:});
imgMatches = cv.line(imgMatches, p(2,:), p(3,:), opts{:});
imgMatches = cv.line(imgMatches, p(3,:), p(4,:), opts{:});
imgMatches = cv.line(imgMatches, p(4,:), p(1,:), opts{:});
imshow(imgMatches)

The result:

object_detection
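
To make the role of imgIdx concrete, here is a minimal sketch on hand-made match data, runnable without mexopencv. The values in `m` and the per-image keypoint counts in `nPts` are hypothetical and chosen only for illustration; the field layout (0-based queryIdx, trainIdx and imgIdx) is the one the code above relies on. It uses accumarray rather than tabulate so that no toolbox is needed.

```matlab
% Hypothetical matches: trainIdx is an index into the keypoints of
% training image imgIdx, and all indices are 0-based.
m = struct( ...
    'queryIdx', {0, 1, 2, 3, 4}, ...
    'trainIdx', {2, 7, 9, 4, 0}, ...
    'imgIdx',   {0, 2, 2, 2, 1}, ...
    'distance', {35, 10, 12, 15, 40});

% Hypothetical number of keypoints detected in each training image
nPts = [3 1 10];

% Count the matches each training image received
votes = accumarray([m.imgIdx]' + 1, 1, [numel(nPts) 1]);

% Majority vote gives the 0-based index of the winning training image
idx = mode([m.imgIdx]);

% Keep only the matches that refer to the winning image
mm = m([m.imgIdx] == idx);

% Taking the image from the first match and indexing its keypoints with
% every trainIdx can run past the end of that image's keypoint array
naiveImg = m(1).imgIdx;
naiveFails = any([m.trainIdx] + 1 > nPts(naiveImg + 1));

% After filtering by imgIdx, every trainIdx is valid for the chosen image
filteredOk = all([mm.trainIdx] + 1 <= nPts(idx + 1));
```

With this data the naive indexing would go out of bounds, while the filtered matches stay within the keypoints of the chosen image. That is the reason for the `mm = m([m.imgIdx] == idx)` step before estimating the homography.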

Note that since you did not post any test images (your code takes input from the webcam), I created one by distorting one of the training images and used it as the query image. I am using functions from certain MATLAB toolboxes (imwarp and such), but those are not essential to the demo, and you could replace them with equivalent OpenCV ones.
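
As a rough sketch of such a replacement, the imwarp call could be swapped for an affine warp done through OpenCV. This assumes that your mexopencv build exposes cv.warpAffine with a 'DSize' option given as [width height]; please check that against the mexopencv documentation. The random `src` image is a stand-in for a grayscale training image. Since affine2d works with row vectors while OpenCV expects a 2x3 matrix acting on column vectors, the matrix is transposed, and a translation is added so that the rotated image stays inside the output canvas (imwarp adjusts the output bounds on its own).

```matlab
% Stand-in for a grayscale training image (hypothetical data)
src = uint8(255 * rand(120, 160));
[h, w] = size(src);

% Same rotation+shear matrix as in the demo above
t = -pi/3;
tform = [cos(t) -sin(t) 0; 0.5*sin(t) cos(t) 0; 0 0 1];

% affine2d convention: [x y 1] * tform; OpenCV convention: M * [x; y; 1]
M = tform(:, 1:2)';

% Transform the image corners to find the bounding box of the result
corners = [0 0; w 0; w h; 0 h];
warped = [corners ones(4, 1)] * M';
lo = min(warped, [], 1);
hi = max(warped, [], 1);

% Shift the result so that its bounding box starts at the origin
M(:, 3) = M(:, 3) - lo';
dsize = ceil(hi - lo);    % [width height] of the output image

% Only call into mexopencv if it is on the path
if ~isempty(which('cv.warpAffine'))
    img = cv.warpAffine(src, M, 'DSize', dsize);
end
```

The output will not be pixel-identical to what imwarp produces, but for generating a distorted query image that difference should not matter.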

I must say that this approach is not the most robust one. Consider using other techniques such as the bag-of-words model, which OpenCV already implements.