machine-learning, cluster-analysis, weka, unsupervised-learning

Using Weka for unsupervised clustering


I have data in the following format:

X,Y,sim(X,Y)

That is, a list of triples, with:

  • X, the name of an object;
  • Y, the name of another object;
  • sim(X,Y), a real number expressing the similarity between the two objects.
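
For illustration, triples like these could be loaded into a symmetric matrix, which is the form most clustering tools expect. This is only a sketch under assumptions not stated above: the file is comma-separated with no header row, each pair appears once, and the measure is symmetric. `load_triples` is a hypothetical helper name.

```python
import csv
import io


def load_triples(lines):
    """Read rows of the form X,Y,value into a symmetric dense matrix."""
    names = []   # object names in first-seen order
    index = {}   # name -> row/column position
    rows = []
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        x, y, value = (field.strip() for field in row)
        for name in (x, y):
            if name not in index:
                index[name] = len(names)
                names.append(name)
        rows.append((index[x], index[y], float(value)))

    n = len(names)
    # Pairs that never appear in the input stay as None (unknown).
    matrix = [[None] * n for _ in range(n)]
    for i, j, value in rows:
        matrix[i][j] = value
        matrix[j][i] = value  # assumption: the measure is symmetric
    return names, matrix


# Made-up sample data, only to show the expected layout.
sample = io.StringIO("a,b,0.9\na,c,0.2\nb,c,0.1\n")
names, matrix = load_triples(sample)
```

Missing pairs are kept as `None` rather than filled with a default, since whether an absent pair means "unknown" or "unrelated" depends on the data.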

Now I want to apply an unsupervised clustering algorithm to this data. I had Weka in mind, but I would gladly consider alternatives too.


Solution

  • There are plenty of algorithms that can work with similarity matrices:

    • Hierarchical Linkage Clustering
    • DBSCAN
    • OPTICS
    • Affinity Propagation
    • Spectral Clustering

    to name just a few. As for software, I prefer ELKI; it offers many more clustering algorithms to choose from.
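
To make the first item concrete, here is a minimal pure-Python sketch of single-linkage hierarchical clustering run directly on a pairwise matrix. It is not how ELKI or Weka implement it, and a library implementation would normally be preferable. The function name `single_linkage`, the sample values, the cut-off `threshold`, and the conversion `1 - sim` (which assumes similarities lie in [0, 1]) are all assumptions made for the example.

```python
def single_linkage(names, dist, threshold):
    """Merge the two closest clusters until the closest pair exceeds threshold."""
    clusters = [[i] for i in range(len(names))]  # start with singletons

    def gap(a, b):
        # Single linkage: distance between the two closest members.
        return min(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        # Find the closest pair of clusters (naive search, fine for small inputs).
        d, p, q = min(
            (gap(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        if d > threshold:
            break  # remaining clusters are too far apart to merge
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(names[i] for i in c) for c in clusters]


# Made-up similarity matrix for four objects.
names = ["a", "b", "c", "d"]
sim = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
]
# Assumption: similarities lie in [0, 1], so 1 - sim behaves like a distance.
dist = [[1.0 - s for s in row] for row in sim]

clusters = single_linkage(names, dist, threshold=0.5)
print(clusters)
```

With these sample values, `a` and `b` end up in one cluster and `c` and `d` in another. The point of the sketch is that the algorithm never needs coordinates for the objects, only their pairwise values.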