matlab image-processing computer-vision sift vlfeat

Dsift (vl_feat) and Matlab

I have a (naïve probably) question, I just wanted to clarify this part. So, when I take a dsift on one image I generally get an 128xn matrix. Thing is, that n value, is not always the same across different images. Say image 1 gets an 128x10 matrix, while image 2 gets a 128x18 matrix. I am not quite sure why is this happening.

I think that each column of 128dimension represents a single image feature or a single patch detected from the image. So in the case of 128x18, we have extracted 18 patches and described them with 128 values each. If this is correct, why cant we have a fixed numbers of patches per image, say 20 patches, so every time our matrixes would be 128x20.

Cheers!

Solution

This is because the number of reliable features that are detected per image change. Just because you detect 10 features in one image does not mean that you will be able to detect the same number of features in the other image. What does matter is how close one feature from one image matches with another.

What you can do (if you like) is extract the, say, 10 most reliable features that are matched the best between the two images, if you want to have something constant. Choose a number that is less than or equal to the minimum of the number of patches detected between the two. For example, supposing you detect 50 features in one image, and 35 features in another image. After, when you try and match the features together, this results in... say... 20 best matched points. You can choose the best 10, or 15, or even all of the points (20) and proceed from there.

I'm going to show you some example code to illustrate my point above, but bear in mind that I will be using vl_sift and not vl_dsift. The reason why is because I want to show you visual results with minimal pre- and post-processing. Should you choose to use vl_dsift, you'll need to do a bit of work before and after you compute the features by dsift if you want to visualize the same results. If you want to see the code to do that, you can check out the vl_dsift help page here: http://www.vlfeat.org/matlab/vl_dsift.html. Either way, the idea about choosing the most reliable features applies to both sift and dsift.

For example, supposing that Ia and Ib are uint8 grayscale images of the same object or scene. You can first detect features via SIFT, then match the keypoints.

[fa, da] = vl_sift(im2single(Ia));
[fb, db] = vl_sift(im2single(Ib));
[matches, scores] = vl_ubcmatch(da, db);

matches contains a 2 x N matrix, where the first row and second row of each column denotes which feature index in the first image (first row) matched best with the second image (second row).

Once you do this, sort the scores in ascending order. Lower scores mean better matches as the default matching method between two features is the Euclidean / L₂ norm. As such:

numBestPoints = 10;
[~,indices] = sort(scores);

%// Get the numBestPoints best matched features
bestMatches = matches(:,indices(1:numBestPoints));

This should then return the 10 best matches between the two images. FWIW, your understanding about how the features are represented in vl_feat is spot on. These are stored in da and db. Each column represents a descriptor of a particular patch in the image, and it is a histogram of 128 entries, so there are 128 rows per feature.

Now, as an added bonus, if you want to display how each feature from one image matches to another image, you can do the following:

%// Spawn a new figure and show the two images side by side
figure;
imagesc(cat(2, Ia, Ib));

%// Extract the (x,y) co-ordinates of each best matched feature
xa = fa(1,bestMatches(1,:));

%// CAUTION - Note that we offset the x co-ordinates of the
%// second image by the width of the first image, as the second
%// image is now beside the first image.
xb = fb(1,bestMatches(2,:)) + size(Ia,2);
ya = fa(2,bestMatches(1,:));
yb = fb(2,bestMatches(2,:));

%// Draw lines between each feature
hold on;
h = line([xa; xb], [ya; yb]);
set(h,'linewidth', 1, 'color', 'b');

%// Use VL_FEAT method to show the actual features
%// themselves on top of the lines
vl_plotframe(fa(:,bestMatches(1,:)));
fb2 = fb; %// Make a copy so we don't mutate the original
fb2(1,:) = fb2(1,:) + size(Ia,2); %// Remember to offset like we did before
vl_plotframe(fb2(:,bestMatches(2,:)));
axis image off; %// Take out the axes for better display