Search code examples
pythonalgorithmcluster-analysisface-recognitiondlib

Why don't Face Clustering Algorithms use Distance matrices rather than clustering algorithms?


I was reading dlib's face clustering code and noticed that the process is like so:

  1. Convert faces to vector using trained network
  2. Use Chinese whisper clustering algorithm to compute groups based on distance

Chinese whisper clustering can take a pretty long time when trying to cluster a large number (>10,000) images.

In this pyimagesearch article the author uses DBSCAN, another clustering algorithm to group a number of images by person.

Since the vectors generated by the neural net can be used to calculate the similarity between two faces, wouldn't it be better to just calculate a euclidean distance matrix, then search for all the values that meet a confidence threshold (eg x < 0.3 for 70% confidence)?

Why use a clustering algorithm at all when you can just compare every face with every other face to determine which ones are the same person? both DBSCAN and chinese whisper clustering take a much longer time than calculating a distance matrix. With my dataset of 30,000 images the times are:

C-whisper - 5 minutes

distance matrix + search - 10-20 seconds


Solution

  • DBSCAN actually takes only marginally longer than computing a distance matrix (when implemented right, 99% of the computation is the distance computations) and with indexing can sometimes be much faster because it does not need every pairwise distance if the index can prune computations.

    But you can't just "read off" clusters from the distance matrix. The data there may be contradictory: the face detector may consider A and B similar and B similar to C, but A and C dissimilar! What do you do then? Clustering algorithms try to solve exactly such situations. For example single-Link, and to a lesser extend DBSCAN, would make A and C the same cluster, whereas complete linkage would decide upon either AB or BC.