I need help computing a correlation matrix for two different datasets and plotting it as a heatmap.
Here is the sample data:
Expression | PR | Metrics |
---|---|---|
Engagement | 0.33 | 0.70 |
Excitement | 0.33 | 0.15 |
Focus | 0.33 | 0.36 |
Interest | 0.67 | 0.47 |
Relaxation | 0.55 | 0.20 |
Stress | 0.44 | 0.40 |
Because the data will need to be modified in the future, it is not imported from a CSV file but built directly as a DataFrame, and the values are converted to float with astype(float).
This is how I created the DataFrame and converted the types:
import pandas as pd

data = {
    'Expression': ['Engagement', 'Excitement', 'Focus', 'Interest', 'Relaxation', 'Stress'],
    'PR': ['0.33', '0.33', '0.33', '0.67', '0.55', '0.44'],
    'Metrics': ['0.70', '0.15', '0.36', '0.47', '0.20', '0.40']
}
df = pd.DataFrame(data)  # build the DataFrame from the dict
df['PR'] = df['PR'].astype(float)  # convert object dtype to float
df['Metrics'] = df['Metrics'].astype(float)  # convert object dtype to float
After that, if I call df.corr(), it only gives the correlation between the two columns, as shown:
PR Metrics
PR 1.000000 -0.048189
Metrics -0.048189 1.000000
However, what I would like to generate is a correlation matrix that shows the correlation between each expression across PR and Metrics, as in the snipped image, with Metrics and PR included.
How should I go about this?
If there is any error in the code above, please point it out as well.
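A short sketch of why df.corr() returns only a 2x2 result: it correlates columns with each other, and this frame has just two numeric columns. The snippet below rebuilds the sample data and selects the numeric columns explicitly, on the assumption that the text column should be excluded by hand rather than relying on pandas to drop it. The variable names `numeric` and `corr` are illustrative and not part of the original post.

```python
import pandas as pd

data = {
    'Expression': ['Engagement', 'Excitement', 'Focus', 'Interest', 'Relaxation', 'Stress'],
    'PR': ['0.33', '0.33', '0.33', '0.67', '0.55', '0.44'],
    'Metrics': ['0.70', '0.15', '0.36', '0.47', '0.20', '0.40'],
}
df = pd.DataFrame(data)

# Keep only the numeric columns and convert both from strings to floats at once
numeric = df[['PR', 'Metrics']].astype(float)

# corr() compares columns pairwise, so two columns give a 2x2 matrix
corr = numeric.corr()
print(corr)
```

The six expressions are rows here, so they act as observations rather than as variables, which is why they never appear on the axes of this result.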
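Use DataFrame.dot with the transposed DataFrame to get an expression-by-expression matrix (each cell is the dot product of two expressions' [PR, Metrics] rows, not a Pearson coefficient), then plot it with seaborn.heatmap: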
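import seaborn as sb

# Index by expression so the rows and columns of the result are labelled
df1 = df.set_index('Expression')[['PR','Metrics']]
# (6x2) times (2x6) gives a 6x6 matrix of pairwise dot products
df = df1.dot(df1.T).rename_axis(index='name1', columns='name2')
print(df)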
name2 Engagement Excitement Focus Interest Relaxation Stress
name1
Engagement 0.5989 0.2139 0.3609 0.5501 0.3215 0.4252
Excitement 0.2139 0.1314 0.1629 0.2916 0.2115 0.2052
Focus 0.3609 0.1629 0.2385 0.3903 0.2535 0.2892
Interest 0.5501 0.2916 0.3903 0.6698 0.4625 0.4828
Relaxation 0.3215 0.2115 0.2535 0.4625 0.3425 0.3220
Stress 0.4252 0.2052 0.2892 0.4828 0.3220 0.3536
sb.heatmap(df, annot=True)
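To make the difference between this dot-product matrix and a true pairwise correlation concrete, here is a minimal sketch using only pandas and NumPy. It assumes the same sample data as the question. The names `gram` and `pairwise_corr` are illustrative. With only two values per expression (PR and Metrics), a Pearson correlation between any two expressions is a line through two points, so it comes out as +1 or -1 and carries little information. That is one reason the dot product may be the more useful quantity to plot here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Expression': ['Engagement', 'Excitement', 'Focus', 'Interest', 'Relaxation', 'Stress'],
    'PR': [0.33, 0.33, 0.33, 0.67, 0.55, 0.44],
    'Metrics': [0.70, 0.15, 0.36, 0.47, 0.20, 0.40],
})
df1 = df.set_index('Expression')[['PR', 'Metrics']]

# Pairwise dot products: cell (a, b) = PR_a * PR_b + Metrics_a * Metrics_b
gram = df1.dot(df1.T)

# Pearson correlation between expressions: transpose so expressions become columns.
# Each column then has only two observations, so every entry is +1 or -1.
pairwise_corr = df1.T.corr()

print(gram.round(4))
print(pairwise_corr.round(4))
```

Either 6x6 frame can be passed to sb.heatmap in the same way as above.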