I have the data set shown below, which comes from the Pokemon dataset:
https://elitedatascience.com/wp-content/uploads/2022/07/Pokemon.csv
I want to plot a heatmap like the one shown below:
# Calculate correlations
corr = stats_df.corr()
# Heatmap
plt.figure(figsize=(9,8))
sns.heatmap(corr)
But I get the error below. How can I solve it?
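To compute the (Pearson) correlation, every column passed to corr() must be numeric. This dataset also contains text columns (such as the Pokemon names and types), so select the numeric columns first.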
Try:
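import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Keep only the numeric stat columns before computing the correlation
df = pd.read_csv('Pokemon.csv', encoding='latin1', index_col='#')
corr = df.select_dtypes('number').drop(columns=['Total', 'Generation']).corr()
sns.heatmap(data=corr)
plt.tight_layout()
plt.show()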
Output:
>>> corr
               HP    Attack   Defense   Sp. Atk   Sp. Def     Speed
HP       1.000000  0.422386  0.239622  0.362380  0.378718  0.175952
Attack   0.422386  1.000000  0.438687  0.396362  0.263990  0.381240
Defense  0.239622  0.438687  1.000000  0.223549  0.510747  0.015227
Sp. Atk  0.362380  0.396362  0.223549  1.000000  0.506121  0.473018
Sp. Def  0.378718  0.263990  0.510747  0.506121  1.000000  0.259133
Speed    0.175952  0.381240  0.015227  0.473018  0.259133  1.000000
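If it is unclear which columns are causing the problem, inspecting the dtypes usually makes it obvious. The sketch below uses a small made-up frame (the rows and the name `toy` are hypothetical stand-ins, not values read from Pokemon.csv) to show how text columns are filtered out by `select_dtypes('number')` before calling `corr()`:

```python
import pandas as pd

# Hypothetical toy rows standing in for a slice of the real CSV
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Type 1": ["Grass", "Fire", "Water", "Bug"],
    "HP": [45, 39, 44, 45],
    "Attack": [49, 52, 48, 30],
    "Legendary": [False, False, False, True],
})

# Text columns show up with a non-numeric dtype (object or string)
print(toy.dtypes)

# 'number' keeps integer and float columns; text and bool columns are left out
numeric = toy.select_dtypes("number")
corr = numeric.corr()
print(corr)
```

Only `HP` and `Attack` survive the selection here, so `corr` is a 2x2 matrix. The same check on the real DataFrame should show which columns need to be excluded.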