I'm trying to use the k-means clustering functionality provided by the ELKI library.
This is what I came up with:
double[][] dblArray = new double[100][10]; // 100 10-dimensional data points
//populate array...
KMeansInitialization<NumberVector<Double>> kinit = new FirstKInitialMeans<>();
KMeansLloyd<NumberVector<Double>, DoubleDistance> kmeans
= new KMeansLloyd<NumberVector<Double>, DoubleDistance>(EuclideanDistanceFunction.STATIC, K, KMEANSMAXITER, kinit);
DatabaseConnection dbc = new ArrayAdapterDatabaseConnection(dblArray);
Database d = new StaticArrayDatabase(dbc, null);
kmeans.run(d);
ELKI gives me:
de.lmu.ifi.dbs.elki.data.type.NoSupportedDataTypeException: No data type found satisfying: NumberVector,field AND NumberVector
Available types:
    at de.lmu.ifi.dbs.elki.database.AbstractDatabase.getRelation(Unknown Source)
    at de.lmu.ifi.dbs.elki.algorithm.AbstractAlgorithm.run(Unknown Source)
Don't forget to initialize your database:
d.initialize();
At this point, the data is fetched from the database connection and any indexes are built.
If you forget to initialize your database, it remains empty, which is why the exception lists no available types: there is no relation for the algorithm to run on.
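
To illustrate why the call order matters, here is a minimal sketch of a lazily loaded database in plain Java. It is not ELKI code: ToyDatabase, its initialize() and size() methods, and the run() helper are hypothetical stand-ins, written on the assumption that the constructor only stores the data source and that loading happens in initialize(), as described above.

```java
import java.util.ArrayList;
import java.util.List;

class LazyDatabaseDemo {
  /** Hypothetical stand-in for a lazily loaded database. Not part of ELKI. */
  static class ToyDatabase {
    private final double[][] source;
    private final List<double[]> rows = new ArrayList<>();
    private boolean initialized = false;

    ToyDatabase(double[][] source) {
      // The constructor only remembers where the data will come from.
      this.source = source;
    }

    /** Copies the rows out of the source; nothing is loaded before this call. */
    void initialize() {
      if (initialized) {
        return;
      }
      for (double[] row : source) {
        rows.add(row.clone());
      }
      initialized = true;
    }

    int size() {
      return rows.size();
    }
  }

  /** Hypothetical algorithm entry point that needs at least one vector. */
  static int run(ToyDatabase db) {
    if (db.size() == 0) {
      throw new IllegalStateException("No data available; was initialize() called?");
    }
    return db.size();
  }

  public static void main(String[] args) {
    ToyDatabase db = new ToyDatabase(new double[100][10]);
    try {
      run(db); // fails: the database is still empty
    } catch (IllegalStateException e) {
      System.out.println("Before initialize(): " + e.getMessage());
    }
    db.initialize();
    System.out.println("After initialize(): " + run(db) + " rows");
  }
}
```

In the code from the question, the equivalent fix is to call d.initialize() after constructing the StaticArrayDatabase and before kmeans.run(d).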