I'm running this code to identify the number of clusters I need for K-Prototypes clustering, and I'm getting this error:
PlotnineError: "Could not evaluate the 'x' mapping: 'Cluster' (original error: name 'Cluster' is not defined)"
# Choose optimal K using Elbow method
cost = []
for cluster in range(1, 10):
    try:
        kprototype = KPrototypes(n_jobs = -1, n_clusters = cluster, init = 'Huang', random_state = 0)
        kprototype.fit_predict(dfMatrix, categorical = catColumnsPos)
        cost.append(kprototype.cost_)
        print('Cluster initiation: {}'.format(cluster))
    except:
        break
# Converting the results into a dataframe and plotting them
a = {'Cluster':range(1, 6), 'Cost':cost}
df_cost = pd.DataFrame.from_dict(a, orient='index')
df_cost.transpose()
# Data viz
plotnine.options.figure_size = (8, 4.8)
(
    ggplot(data = df_cost)+
    geom_line(aes(x = 'Cluster',
                  y = 'Cost'))+
    geom_point(aes(x = 'Cluster',
                   y = 'Cost'))+
    geom_label(aes(x = 'Cluster',
                   y = 'Cost',
                   label = 'Cluster'),
               size = 10,
               nudge_y = 1000) +
    labs(title = 'Optimal number of cluster with Elbow Method')+
    xlab('Number of Clusters k')+
    ylab('Cost')+
    theme_minimal()
)
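You have an oversight in your data transformation code. DataFrame.transpose() returns a new DataFrame and leaves df_cost unchanged, so 'Cluster' and 'Cost' remain row labels rather than column names, and plotnine cannot find a column called 'Cluster'.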
This line
df_cost.transpose()
should be
df_cost = df_cost.transpose()
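To see the difference without running the clustering, here is a minimal sketch using pandas only. The cost values are made-up placeholders standing in for kprototype.cost_, and deriving the Cluster range from len(cost) is my own assumption, not part of the original code; it just keeps both lists the same length so no NaN padding appears.

```python
import pandas as pd

# Placeholder costs standing in for kprototype.cost_ (made up for illustration).
cost = [500.0, 320.0, 210.0, 150.0, 120.0]

# Assumption: build the cluster numbers from len(cost) so both lists match in length.
a = {'Cluster': list(range(1, len(cost) + 1)), 'Cost': cost}
df_cost = pd.DataFrame.from_dict(a, orient='index')

# Calling transpose() without assigning the result leaves df_cost untouched:
# 'Cluster' and 'Cost' are still row labels, and the columns are integers.
df_cost.transpose()
index_before = list(df_cost.index)
cols_before = list(df_cost.columns)

# Assigning the result back turns 'Cluster' and 'Cost' into columns.
df_cost = df_cost.transpose()
cols_after = list(df_cost.columns)

# Optional: make the cluster numbers integers so plot labels do not show decimals.
df_cost['Cluster'] = df_cost['Cluster'].astype(int)

print(index_before)
print(cols_before)
print(cols_after)
```

Once 'Cluster' and 'Cost' are actual columns, the aes(x = 'Cluster', y = 'Cost') mappings in the plotting code should resolve.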