I am currently working on clustering categorical attributes that come from a bank marketing dataset from Kaggle. I have created the three clusters with kmodes:
Output: cluster_df
Now I want to visualize each row of a cluster as a projection or point so that I get some kind of image:
I am having a hard time with this. Euclidean distance isn't meaningful for categorical data, right? Does that mean there is no way to create this kind of visualization?
A common way to visualize clusters is PCA. You can use PCA to reduce the multi-dimensional data to 2 dimensions so that you can plot it and hopefully understand the data better. To use it, see the following code:
import pandas as pd
from sklearn.decomposition import PCA

# Reduce the numeric feature matrix x to its first two principal components
pca = PCA(n_components=2)
principalComponents = pca.fit_transform(x)
principalDf = pd.DataFrame(data=principalComponents,
                           columns=['principal component 1', 'principal component 2'])
Here x is the fitted and transformed (numerically encoded) version of the data you clustered, since PCA only accepts numeric input. Now you can easily visualize your clustered data, since it is 2-dimensional.
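To make the step from categorical columns to a 2-D plot concrete, here is a minimal sketch. It assumes one-hot encoding (via `pd.get_dummies`) as the way to get a numeric x, which is one possible choice and not something the question specifies. The toy DataFrame, the cluster labels, and the helper names `project_clusters` and `plot_clusters` are hypothetical stand-ins for your own `cluster_df` and kmodes output.

```python
import pandas as pd
from sklearn.decomposition import PCA


def project_clusters(df, labels, random_state=0):
    """One-hot encode every column of df and project the rows to 2-D with PCA."""
    # PCA needs numbers, so each category becomes its own 0/1 indicator column.
    encoded = pd.get_dummies(df, columns=list(df.columns)).astype(float)
    pca = PCA(n_components=2, random_state=random_state)
    components = pca.fit_transform(encoded)
    projected = pd.DataFrame(
        components,
        columns=["principal component 1", "principal component 2"],
        index=df.index,
    )
    # Keep the cluster label next to each projected row for colouring later.
    projected["cluster"] = list(labels)
    return projected


def plot_clusters(projected, path="clusters.png"):
    """Scatter the projected rows, one colour per cluster, and save to path."""
    # Imported here so the projection step also works without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for cluster, group in projected.groupby("cluster"):
        ax.scatter(
            group["principal component 1"],
            group["principal component 2"],
            label=f"cluster {cluster}",
            alpha=0.7,
        )
    ax.set_xlabel("principal component 1")
    ax.set_ylabel("principal component 2")
    ax.legend()
    fig.savefig(path)
    return path


# Hypothetical stand-in for the bank marketing data and the kmodes labels.
toy_df = pd.DataFrame({
    "job": ["admin", "technician", "admin", "services", "technician", "services"],
    "marital": ["married", "single", "married", "single", "single", "married"],
    "education": ["secondary", "tertiary", "secondary", "primary", "tertiary", "primary"],
})
toy_labels = [0, 1, 0, 2, 1, 2]

projected = project_clusters(toy_df, toy_labels)
print(projected)
```

Calling `plot_clusters(projected)` writes the scatter plot to an image file. Rows with identical categories land on exactly the same point, so with real data you may want some transparency (as with `alpha` above) or jitter to see how many rows overlap.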