k-means clustering on ORB features

I have to do image classification based on k-means clustering of ORB features. If I understood the documentation correctly, a feature is essentially a keypoint and a descriptor. I'm not sure what I should pass as X to kmeans.fit(): the example here says that X_digits is a numpy array of Bunch objects, so I'm assuming that I should group each keypoint with its corresponding descriptor and use that as the X in kmeans.fit(X). Here's the code:

@dataclass
class BOVWFeaturizer(ImgFeaturizerABC):
    number_of_features_per_image: int = 100
    vocabulary_size: int = 8
    def fit(self, images: np.ndarray, labels=None):
        orb = cv.ORB_create(self.number_of_features_per_image)
        keypoints_orb = orb.detect(images, None)
        keypoints_orb, descriptors = orb.compute(images, keypoints_orb)
        kmeans = cluster.KMeans(n_clusters=2, random_state=0)
        """
        features = ## something that groups keypoints and descriptor
        """
        kmeans.fit(features, labels)

        return self

I have no prior knowledge of machine learning or computer vision, so sorry if this is a really basic question.

Edit: here's what I've tried:

features = [[kp, desc] for kp, desc in zip(keypoints_orb, descriptors)]
features = [(kp, desc) for kp, desc in zip(keypoints_orb, descriptors)]

In both cases, the output was:

TypeError: float() argument must be a string or a number, not 'cv2.KeyPoint'

I've tried converting it to an ndarray:

features = np.ndarray([(kp, desc) for kp, desc in zip(keypoints_orb, descriptors)])

Output:

ValueError: maximum supported dimension for an ndarray is 32, found 100

Am I supposed to compress both values (keypoint and descriptor) into a 1D ndarray?

Solution

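Do not pass the keypoints to k-means. The clustering should be based only on the descriptors, so feed just the descriptors to kmeans.fit() as the input features. A cv2.KeyPoint is an object rather than a number, which is why both attempts above raise the TypeError, whereas the descriptors returned by orb.compute() are already a numeric array with one row per keypoint.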
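
As a rough illustration of the advice above, here is a minimal sketch of a descriptors-only bag-of-visual-words pipeline. The helper names (orb_descriptors, fit_vocabulary, bovw_histogram) are hypothetical and not part of the original code, and the sketch assumes each image is a single grayscale uint8 array processed one at a time. It also treats the binary ORB descriptors as plain numeric vectors for k-means, which is a simplification.

```python
import numpy as np
from sklearn.cluster import KMeans


def orb_descriptors(image, n_features=100):
    """Return the ORB descriptors of one grayscale image, shape (n_keypoints, 32)."""
    import cv2 as cv  # imported lazily so the helpers below work without OpenCV

    orb = cv.ORB_create(nfeatures=n_features)
    # Keypoints are only needed to compute the descriptors; they are discarded here.
    _keypoints, descriptors = orb.detectAndCompute(image, None)
    # detectAndCompute returns None for the descriptors when nothing is detected.
    if descriptors is None:
        return np.empty((0, 32), dtype=np.uint8)
    return descriptors


def fit_vocabulary(descriptor_list, vocabulary_size=8, random_state=0):
    """Cluster the descriptors of all images into `vocabulary_size` visual words."""
    # Stack the per-image arrays into one (total_keypoints, 32) matrix: this is X.
    X = np.vstack(descriptor_list).astype(np.float64)
    kmeans = KMeans(n_clusters=vocabulary_size, n_init=10, random_state=random_state)
    return kmeans.fit(X)


def bovw_histogram(descriptors, kmeans):
    """Return the normalized count of descriptors assigned to each cluster."""
    n_words = kmeans.n_clusters
    if len(descriptors) == 0:
        return np.zeros(n_words)
    words = kmeans.predict(descriptors.astype(np.float64))
    counts = np.bincount(words, minlength=n_words).astype(float)
    return counts / counts.sum()
```

In this sketch, fit_vocabulary would be called once on the descriptors of all training images, and bovw_histogram would then turn each image into a fixed-length vector that a separate classifier can be trained on together with the labels. KMeans itself ignores the labels argument.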