How to use SIFT features/descriptors as input for SVM training?

I want to classify MRI images of brain tumors as benign or malignant using C++. I am using SIFT features, and the paper I am following clusters them with k-means before training the SVM classifier. What I don't understand is why that step is needed. From what I know, k-means only clusters the features; it doesn't change the size of the input.

I have read that the possible approaches are BoW and histograms. In the histogram approach, it just counts the number of features in each cluster, right? I don't think that will provide the information I need to classify benign and malignant tumors, because both can be small or large. As for the BoW approach, I didn't understand this link.

Basically, I don't know what to do with my SIFT features to use them as input for the SVM. Do I really have to create a dictionary of some sort? I'm begging you, please enlighten me. Thank you very much!

Solution

I'm not too familiar with OpenCV or SIFT features, but this should be general enough to be useful in any programming language. I will describe only the BoW approach below.

Let's assume we have N images. Each image i has F features, and each feature has D dimensions. We can put all the features into an array feats, so that it looks like this:

[1, 2, ..., D]
[..., ..., ..., D]
[N*F, ..., ..., D]

Each row of feats is a feature, with D dimensions, and we have a total of N*F features.
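As a rough sketch of that stacking step in plain C++ (no OpenCV): the names Descriptor, stackDescriptors and imageOfRow are made up for illustration, and the extra imageOfRow vector is my own addition so that each row can be traced back to its image after clustering.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One feature (e.g. one SIFT descriptor) with D values.
using Descriptor = std::vector<float>;

// Stacks the descriptors of all images into one list (the rows of feats).
// imageOfRow[r] records which image row r came from (hypothetical helper
// output, not part of the original answer).
std::vector<Descriptor> stackDescriptors(
    const std::vector<std::vector<Descriptor>>& perImage,
    std::vector<std::size_t>& imageOfRow) {
    std::vector<Descriptor> feats;
    imageOfRow.clear();
    for (std::size_t i = 0; i < perImage.size(); ++i) {
        for (const Descriptor& d : perImage[i]) {
            // Every descriptor must have the same dimension D.
            assert(feats.empty() || d.size() == feats.front().size());
            feats.push_back(d);
            imageOfRow.push_back(i);
        }
    }
    return feats;
}
```

Nothing here requires every image to contribute the same number of rows; the fixed F in the text just keeps the notation simple.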

In k-means, we take all these features and group them into k clusters, so every feature is assigned to exactly one cluster. Most k-means functions return a matrix C of size k x D, which holds the centroids of the clusters. This matrix C is the "codebook" or "dictionary" of the k-means algorithm. Some also return a vector of size N*F that shows which cluster each feature is assigned to (in OpenCV, this is the labels variable in this link: http://www.developerstation.org/2012/01/kmeans-clustering-in-opencv-with-c.html).
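To make the two outputs concrete, here is a minimal sketch of plain Lloyd-style k-means in standard C++. It is not OpenCV's implementation; in practice you would probably call cv::kmeans, which fills the same two outputs (centers and labels). The names KMeansResult, kMeans and sqDist are made up, and seeding the centroids with the first k rows is an assumption chosen only to keep the sketch deterministic.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;

struct KMeansResult {
    std::vector<Descriptor> centroids;  // k x D codebook (the matrix C)
    std::vector<int> labels;            // cluster index for each row of feats
};

// Squared Euclidean distance between two descriptors of equal dimension.
float sqDist(const Descriptor& a, const Descriptor& b) {
    float sum = 0.f;
    for (std::size_t j = 0; j < a.size(); ++j) {
        const float diff = a[j] - b[j];
        sum += diff * diff;
    }
    return sum;
}

// Assumption: the first k rows seed the centroids; real code would
// usually seed randomly (or with k-means++) and run several attempts.
KMeansResult kMeans(const std::vector<Descriptor>& feats, int k, int maxIters = 100) {
    assert(k > 0 && feats.size() >= static_cast<std::size_t>(k));
    const std::size_t dim = feats.front().size();
    KMeansResult result;
    result.centroids.assign(feats.begin(), feats.begin() + k);
    result.labels.assign(feats.size(), -1);

    for (int iter = 0; iter < maxIters; ++iter) {
        bool changed = false;
        // Assignment step: each feature goes to its nearest centroid.
        for (std::size_t r = 0; r < feats.size(); ++r) {
            int best = 0;
            float bestDist = std::numeric_limits<float>::max();
            for (int c = 0; c < k; ++c) {
                const float d = sqDist(feats[r], result.centroids[c]);
                if (d < bestDist) { bestDist = d; best = c; }
            }
            if (result.labels[r] != best) { result.labels[r] = best; changed = true; }
        }
        if (!changed) break;  // assignments are stable, so we have converged

        // Update step: each centroid becomes the mean of its members.
        std::vector<Descriptor> sums(k, Descriptor(dim, 0.f));
        std::vector<int> counts(k, 0);
        for (std::size_t r = 0; r < feats.size(); ++r) {
            const int c = result.labels[r];
            ++counts[c];
            for (std::size_t j = 0; j < dim; ++j) sums[c][j] += feats[r][j];
        }
        for (int c = 0; c < k; ++c) {
            if (counts[c] == 0) continue;  // empty cluster keeps its old centroid
            for (std::size_t j = 0; j < dim; ++j)
                result.centroids[c][j] = sums[c][j] / counts[c];
        }
    }
    return result;
}
```

Calling kMeans(feats, k) on the stacked features gives result.centroids (the k x D dictionary) and result.labels (one entry per row of feats).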

Since we already have the cluster assignment of every feature, the F features of each image i can be represented simply by the clusters they belong to. For example, if the original image was represented as

[1, 2, ..., D]
[..., ..., ..., D]
[F, ..., ..., D]

then the image can also be represented simply as a vector:

[1] % Assignment of feature 1
[...]
[F] % Assignment of feature F
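For training images these assignments come straight out of k-means. For an image that was not part of the clustering (a test image, say), the usual approach is to assign each of its features to the nearest centroid in the codebook. A minimal sketch, with made-up names (nearestCentroid, assignImage) and squared Euclidean distance assumed as the metric:

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

using Descriptor = std::vector<float>;

// Index of the codebook row (centroid) closest to the given feature.
int nearestCentroid(const Descriptor& feature,
                    const std::vector<Descriptor>& codebook) {
    assert(!codebook.empty());
    int best = 0;
    float bestDist = std::numeric_limits<float>::max();
    for (std::size_t c = 0; c < codebook.size(); ++c) {
        assert(codebook[c].size() == feature.size());
        float dist = 0.f;
        for (std::size_t j = 0; j < feature.size(); ++j) {
            const float diff = feature[j] - codebook[c][j];
            dist += diff * diff;
        }
        if (dist < bestDist) {
            bestDist = dist;
            best = static_cast<int>(c);
        }
    }
    return best;
}

// Replaces each of an image's descriptors by its cluster index,
// giving the assignment vector shown above.
std::vector<int> assignImage(const std::vector<Descriptor>& imageFeats,
                             const std::vector<Descriptor>& codebook) {
    std::vector<int> assignments;
    assignments.reserve(imageFeats.size());
    for (const Descriptor& f : imageFeats)
        assignments.push_back(nearestCentroid(f, codebook));
    return assignments;
}
```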

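Therefore, you can take this vector and form a histogram h of the clusters that are represented: h has k bins, and bin j counts how many of the image's features were assigned to cluster j. This histogram is the feature vector for the image, which you can later use in the SVM. Its length is always k, no matter how many features the image has, which is why the clustering step gives the SVM a fixed-size input.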
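A minimal sketch of the histogram step, again in plain C++. The function name bowHistogram is made up, and dividing by the number of features (so the bins sum to 1) is my own assumption rather than something the approach requires; it is a common choice so that images with different numbers of features stay comparable.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Builds a k-bin histogram from one image's cluster assignments.
// Assumption: bins are divided by the number of features so they sum to 1.
std::vector<float> bowHistogram(const std::vector<int>& assignments, int k) {
    assert(k > 0);
    std::vector<float> h(k, 0.f);
    for (int cluster : assignments) {
        assert(cluster >= 0 && cluster < k);
        h[cluster] += 1.f;  // count one more feature in this cluster
    }
    if (!assignments.empty()) {
        const float total = static_cast<float>(assignments.size());
        for (float& bin : h) bin /= total;
    }
    return h;
}
```

Each image then contributes one row of k values to the SVM training matrix, paired with its benign or malignant label.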

P.S. If you need any further clarification and/or an example, let me know!