python, label, cluster-analysis, k-means, attributeerror

AttributeError: 'numpy.ndarray' object has no attribute 'cost_'


I am trying to run the K-Prototypes clustering algorithm. Whether I fit the model or try to plot the cost graph as shown below, I always get a 'no attribute' error for the labels_ and cost_ attributes. I have checked examples on several websites, and my code looks the same as theirs. What can I do? Thank you for your help.

1)

from kmodes.kmodes import KModes

from kmodes.kprototypes import KPrototypes

kproto1 = KPrototypes(n_clusters=15, init='Cao').fit_predict(data, categorical=[23])
labels = kproto1.labels_

**AttributeError: 'numpy.ndarray' object has no attribute 'labels_'**

2)

cost = []
range_cluster=[5,8,10,15,20,25,30,35,40,45,50,55,70,85,100]

for num_clusters in range_cluster:
    kproto = KPrototypes(n_clusters=num_clusters, init='Cao').fit_predict(data, categorical=[23])
    cost.append(kproto.cost_)

plt.plot(cost)

Solution

  • fit_predict does not return the fitted model. As the error message shows, it returns a NumPy array of cluster labels, and that array has no labels_ or cost_ attribute. There are two ways to get what you need:
    The result of fit_predict is already the labels, so use it directly:

    labels = KPrototypes(n_clusters=15, init='Cao').fit_predict(data, categorical=[23])

    Or call fit instead, which returns the fitted model, and read the attributes from it:

    kproto1 = KPrototypes(n_clusters=15, init='Cao').fit(data, categorical=[23])
    labels = kproto1.labels_
    cost = kproto1.cost_
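
The cost loop from the question can be handled the same way as the fit-based method: call fit for each cluster count and read cost_ from the fitted model. Below is a minimal sketch of that idea, not a definitive implementation. The names elbow_costs and make_model are hypothetical helpers introduced here for illustration, and the commented usage assumes that data is the asker's array with column 23 as the categorical one.

```python
def elbow_costs(make_model, data, cluster_counts, **fit_kwargs):
    """Fit one model per cluster count and collect each fitted model's cost_.

    make_model is a hypothetical factory: it takes a cluster count and
    returns an unfitted estimator whose fit() returns the fitted estimator.
    """
    costs = []
    for k in cluster_counts:
        # Use fit(), not fit_predict(): fit() gives back the fitted model,
        # which carries cost_, while fit_predict() gives back only labels.
        model = make_model(k).fit(data, **fit_kwargs)
        costs.append(model.cost_)
    return costs


# Possible usage with kmodes (assumes `data` from the question is defined):
#
# import matplotlib.pyplot as plt
# from kmodes.kprototypes import KPrototypes
#
# range_cluster = [5, 8, 10, 15, 20, 25, 30]
# cost = elbow_costs(
#     lambda k: KPrototypes(n_clusters=k, init='Cao'),
#     data,
#     range_cluster,
#     categorical=[23],
# )
# plt.plot(range_cluster, cost)  # x axis shows the actual cluster counts
# plt.show()
```

Passing range_cluster as the x values to plt.plot keeps the axis in units of cluster count rather than list position, which matters here because the counts in the question are not evenly spaced.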