Converting OpenCV SURF features to float32 arrays in Python

I extract the features with the compute() function and add them to a list. I then try to convert all the features to float32 using NumPy so that they can be used with OpenCV for classification. The error I am getting is:

ValueError: setting an array element with a sequence.

Not really sure what I can do about this. I am following a book and doing the same steps, except that the book uses HOG to extract the features. My extraction returns matrices of inconsistent sizes, and I am not sure how to make them all equal. Related code (which might have minor syntax errors because I truncated it from the original code):

    import cv2
    import numpy as np

    def get_SURF_feature_vector(area_of_interest, surf):
        # Detect the key points in the image
        key_points = surf.detect(area_of_interest)
        # Create an array of zeros with the same shape and type as the input image
        image_key_points = np.zeros_like(area_of_interest)
        # Draw the key points on the image
        image_key_points = cv2.drawKeypoints(area_of_interest, key_points, image_key_points, flags=cv2.DRAW_MATCHES_FLAGS_DRAW_RICH_KEYPOINTS)
        # Compute the feature descriptors
        key_points, feature_descriptors = surf.compute(area_of_interest, key_points)
        # Plot the image with its key points
        # plt.imshow(image_key_points)
        # Return the computed feature descriptor matrix
        return feature_descriptors

    feature_list = []
    for x in range(len(data)):
        feature_list.append(get_SURF_feature_vector(area_of_interest[x], surf))
    # This is the line that raises the ValueError
    list_of_features = np.array(feature_list, dtype=np.float32)

Solution

The error isn't specific to OpenCV at all, just numpy.

Your list feature_list contains arrays of different lengths. You can't make a single 2D (or 3D) array out of arrays of different sizes.

For example, you can reproduce the error really simply:

>>> np.array([[1], [2, 3]], dtype=np.float32)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: setting an array element with a sequence.

I'm assuming that what you expect from the operation is to pass in [1] and [2, 3] and get back np.array([1, 2, 3]), i.e., concatenation (this turned out not to be what the OP wanted; see the comments under this post). You can use np.hstack() or np.vstack() for this, depending on the shape of your input. You can also use np.concatenate() with the axis argument, but the stacking functions are more explicit for 2D/3D arrays.

>>> a = np.array([1], dtype=np.float32)
>>> b = np.array([2, 3, 4], dtype=np.float32)
>>> np.hstack([a, b])
array([1., 2., 3., 4.], dtype=float32)

Each descriptor is a row of the matrix that compute() returns, though, so the per-image matrices should be stacked vertically, not horizontally as above. Thus you can simply do:

list_of_features = np.vstack(feature_list)

You don't need to specify dtype=np.float32, as the descriptors are np.float32 by default. (np.vstack only accepts a dtype argument from NumPy 1.24 onward; on older versions you'd have to convert after the stacking operation, e.g. with .astype(np.float32).)
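As a minimal sketch of the stacking step, the snippet below uses random arrays as stand-ins for real SURF descriptors, so it runs without OpenCV. The names kept and image_index are hypothetical, and the None entry is an assumption standing in for an image where no key points were found; such entries have to be dropped, since np.vstack cannot stack them.

```python
import numpy as np

# Stand-ins for per-image SURF descriptors: each is (n_keypoints, 64), float32.
rng = np.random.default_rng(0)
feature_list = [
    rng.random((500, 64), dtype=np.float32),
    rng.random((200, 64), dtype=np.float32),
    None,  # assumption: an image that produced no descriptors
    rng.random((400, 64), dtype=np.float32),
]

# Drop empty results, remembering which image each descriptor set came from.
kept = [(i, d) for i, d in enumerate(feature_list) if d is not None and len(d) > 0]

# Stack all descriptors into one (total_keypoints, 64) matrix.
stacked = np.vstack([d for _, d in kept])

# One image index per row, so each row can be traced back to its image.
image_index = np.repeat([i for i, _ in kept], [len(d) for _, d in kept])

print(stacked.shape, stacked.dtype)  # (1100, 64) float32
print(image_index.shape)             # (1100,)
```

Keeping an index array like this is only one option; it matters if you later need per-image groupings of the stacked rows.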

If you instead want a 3D array, then you need the same number of features across all images so that it's an evenly filled 3D array. You could just fill up your feature arrays with placeholder values, like 0s or np.nan, so that they're all the same length, and then you can group them together as you did originally.

>>> des1 = np.random.rand(500, 64).astype(np.float32)
>>> des2 = np.random.rand(200, 64).astype(np.float32)
>>> des3 = np.random.rand(400, 64).astype(np.float32)
>>> feature_descriptors = [des1, des2, des3]

So here each image's feature descriptors have a different number of features. You can find the largest one:

>>> max_des_length = max([len(d) for d in feature_descriptors])
>>> max_des_length
500

You can use np.pad() to pad each feature array with however many more values it needs to be the same size as your maximum size descriptor set.

Doing it all in one line is a bit dense, but it works:

>>> feature_descriptors = [np.pad(d, ((0, (max_des_length - len(d))), (0, 0)), 'constant', constant_values=np.nan) for d in feature_descriptors]

The awkward-looking argument ((0, (max_des_length - len(d))), (0, 0)) just says to pad with 0 elements on the top, max_des_length - len(d) elements on the bottom, 0 on the left, and 0 on the right.
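To make the pad-width argument concrete, here is a small sketch on a toy 2x3 array (the names d, target_rows and padded are only for illustration). Rows are added at the bottom only, and the columns are left untouched.

```python
import numpy as np

d = np.arange(6, dtype=np.float32).reshape(2, 3)  # 2 rows, 3 columns
target_rows = 4

# Pad widths are ((top, bottom), (left, right)): add rows at the bottom only.
padded = np.pad(d, ((0, target_rows - len(d)), (0, 0)),
                mode="constant", constant_values=np.nan)

print(padded.shape)  # (4, 3)
print(padded)        # the two original rows, followed by two rows of nan
```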

As you can see here, I'm adding np.nan values to the arrays. If you leave out the constant_values argument, it defaults to 0. Lastly, all you have to do is convert the list to a NumPy array:

>>> feature_descriptors = np.array(feature_descriptors)
>>> feature_descriptors.shape
(3, 500, 64)
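If you pad with np.nan, you will usually need to tell the real rows from the padding again later. The following is a sketch under the assumption that genuine descriptors never contain NaN; the names valid and recovered are hypothetical. It rebuilds the padded 3D array from random stand-in descriptors and then recovers one image's original rows with a boolean mask.

```python
import numpy as np

rng = np.random.default_rng(0)
lengths = [500, 200, 400]
descriptors = [rng.random((n, 64), dtype=np.float32) for n in lengths]

# Pad every descriptor set to the longest one and group them into a 3D array.
max_len = max(len(d) for d in descriptors)
padded = np.array([
    np.pad(d, ((0, max_len - len(d)), (0, 0)),
           mode="constant", constant_values=np.nan)
    for d in descriptors
])

# A row is padding if it contains NaN (assumes real descriptors are finite).
valid = ~np.isnan(padded).any(axis=2)  # shape (3, 500), True for real rows

print(padded.shape)       # (3, 500, 64)
print(valid.sum(axis=1))  # [500 200 400]

# Recover the original descriptors of the second image.
recovered = padded[1][valid[1]]
print(recovered.shape)    # (200, 64)
```

Padding with 0 instead would make such a mask ambiguous, since a real descriptor could in principle contain zeros, which is one reason to prefer np.nan as the placeholder.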