python, scikit-learn, blaze

Blaze with Scikit Learn K-Means


I am trying to fit a Blaze data object with scikit-learn's KMeans.

from blaze import *
from sklearn.cluster import KMeans
data_numeric = Data('data.csv')
data_cluster = KMeans(n_clusters=5)
data_cluster.fit(data_numeric)

Data Sample:

A  B  C
1  32 34
5  57 92
89 67 21

It throws an error:

[image: error screenshot]

I have been able to do this with a pandas DataFrame. Is there any way to feed a Blaze object to this function?


Solution

  • I think you need to convert your Blaze data object into a NumPy array before you fit.

    from blaze import *
    import numpy

    from sklearn.cluster import KMeans

    # Data(...) as in the question; some Blaze releases name this constructor data(...)
    data_numeric = numpy.array(Data('data.csv'))
    data_cluster = KMeans(n_clusters=5)
    data_cluster.fit(data_numeric)
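
To check what KMeans expects once the conversion has happened, here is a minimal sketch that leaves Blaze out entirely and builds the array by hand from the three sample rows in the question. The `rows` list, `n_clusters=2` (the sample has only three rows, so the question's 5 would not fit), and `random_state=0` are assumptions made for illustration only.

```python
import numpy as np
from sklearn.cluster import KMeans

# Sample rows from the question (columns A, B, C).
rows = [(1, 32, 34), (5, 57, 92), (89, 67, 21)]

# KMeans expects a 2-D numeric array of shape (n_samples, n_features).
X = np.asarray(rows, dtype=float)

# n_clusters cannot exceed the number of rows, so 2 is used here
# instead of the question's 5; random_state is fixed for repeatability.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(X)

print(X.shape)                       # (n_samples, n_features)
print(model.cluster_centers_.shape)  # (n_clusters, n_features)
```

If the array you get from the Blaze object does not have this plain 2-D numeric shape, the fit will likely still fail, so printing `data_numeric.shape` and `data_numeric.dtype` before calling `fit` is a cheap sanity check.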