opencv image-processing classification sift

SIFT clustering: converting SIFT features (128-dimensional vectors) into a vocabulary


How can I cluster the extracted SIFT descriptors? The aim of the clustering is to use the result for classification.


Solution

  • To cluster, stack the N×128 descriptor arrays (N is the number of descriptors extracted from each image) into a single M×128 array (M is the total number of descriptors from all images), then run the clustering on that array.

    For example:

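    import numpy as np
    from scipy.cluster import vq

    # Assumed value: initial guess for the number of descriptors per image.
    PRE_ALLOCATION_BUFFER = 1000

    def dict2numpy(features):
        """Stack per-image (N x 128) descriptor arrays into one (M x 128) array."""
        nkeys = len(features)
        array = np.zeros((nkeys * PRE_ALLOCATION_BUFFER, 128))
        pivot = 0
        for value in features.values():
            nelements = value.shape[0]
            # Double the buffer until the next block of descriptors fits.
            while pivot + nelements > array.shape[0]:
                padding = np.zeros_like(array)
                array = np.vstack((array, padding))
            array[pivot:pivot + nelements] = value
            pivot += nelements
        # Drop the unused rows at the end of the buffer.
        return array[:pivot]

    # all_features maps each image to its (N x 128) SIFT descriptor array.
    all_features_array = dict2numpy(all_features)
    nfeatures = all_features_array.shape[0]
    nclusters = 100
    # codebook holds the cluster centres, i.e. the visual vocabulary.
    codebook, distortion = vq.kmeans(all_features_array, nclusters)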
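
Since the goal is classification, a common next step is to describe each image by a histogram of how many of its descriptors fall into each cluster (a "bag of visual words"). The sketch below is only an illustration under stated assumptions: the descriptors are random stand-ins for real SIFT output, the image names and the helper `compute_histogram` are hypothetical, and `np.vstack` is used as a shorter alternative to the pre-allocating `dict2numpy` above. It relies on `scipy.cluster.vq.vq` to assign each descriptor to its nearest codebook entry.

```python
import numpy as np
from scipy.cluster import vq

rng = np.random.default_rng(0)

# Hypothetical stand-in for real SIFT output: image name -> (N x 128) array.
all_features = {
    "img_a": rng.random((40, 128)),
    "img_b": rng.random((55, 128)),
    "img_c": rng.random((30, 128)),
}

# Stack every image's descriptors into a single (M x 128) array.
all_features_array = np.vstack(list(all_features.values()))

# Build the vocabulary; a small cluster count keeps this toy example fast.
nclusters = 10
codebook, distortion = vq.kmeans(all_features_array, nclusters)


def compute_histogram(descriptors, codebook):
    """Count how many descriptors are assigned to each visual word."""
    # vq.vq returns, for each descriptor, the index of the nearest centroid.
    codes, _ = vq.vq(descriptors, codebook)
    return np.bincount(codes, minlength=codebook.shape[0])


# One fixed-length feature vector per image, regardless of its descriptor count.
histograms = {
    name: compute_histogram(descriptors, codebook)
    for name, descriptors in all_features.items()
}
```

Each histogram has one bin per codebook entry, so images with different numbers of descriptors end up as vectors of the same length that can be passed to a classifier. Note that `vq.kmeans` may return fewer centroids than requested, which is why the histogram length is taken from `codebook.shape[0]` rather than from `nclusters`.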