I am trying to train a k-means model and am currently checking the correlation between the features in my feature vectors.
When I run a Pearson correlation against my feature vector column, I am unable to see the results for all of my features.
The code I am running is:
import org.apache.spark.ml.stat.Correlation
val cor = Correlation.corr(scoringDf, "features")
cor.show(false)
The correlation runs fine, but when I try to view the results with the show method (Correlation.corr returns a DataFrame), they are displayed as:
|1.0 0.18047211468479446 0.08002566273874058 ... (5 total)
0.18047211468479446 1.0 0.02926796076983553 ...
0.08002566273874058 0.02926796076983553 1.0 ...
0.30256416877032244 0.15974389490583188 0.054692657400425136 ...
0.3408783412055776 0.13008391583866225 0.04241296238931376 ...|
Is there a way to see the hidden columns?
I have also tried the following code, but the results are the same:
import org.apache.spark.ml.linalg.Matrix
import org.apache.spark.sql.Row
val Row(coeff1: Matrix) = Correlation.corr(scoringDf, "features").head
println(s"Pearson correlation matrix:\n $coeff1")
Edit:
Here is the schema of the cor DataFrame:
root
|-- pearson(features): matrix (nullable = false)
I was finally able to get the output the way I want by passing explicit maxLines and maxLineWidth arguments to the matrix's toString method. My code now looks like this:
val Row(coeff1: Matrix) = Correlation.corr(scoringDf, "features").head
println(s"Pearson correlation matrix:\n " + coeff1.toString(10, 100000))
The output is displayed as shown below:
Pearson correlation matrix:
1.0 0.1804721146847944 0.08002566273874055 0.3025641687703226 0.34087834120557725
0.1804721146847944 1.0 0.02926796076983553 0.15974389490583193 0.13008391583866233
0.08002566273874055 0.02926796076983553 1.0 0.05469265740042514 0.042412962389313726
0.3025641687703226 0.15974389490583193 0.05469265740042514 1.0 0.241118490251708
0.34087834120557725 0.13008391583866233 0.042412962389313726 0.241118490251708 1.0
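For anyone who would rather not guess a line width, here is a rough sketch of an alternative: walk the matrix row by row with rowIter and format each value yourself. The object name FullCorrelationPrinter, the helper formatRows, the four-decimal format, and the small sample data are all made up for illustration and are not from my real job; the sketch also assumes a local SparkSession.

```scala
import java.util.Locale

import org.apache.spark.ml.linalg.{Matrix, Vectors}
import org.apache.spark.ml.stat.Correlation
import org.apache.spark.sql.{Row, SparkSession}

// Hypothetical helper object, for illustration only.
object FullCorrelationPrinter {

  // Turn every row of the matrix into one line of text, with no truncation.
  // Locale.ROOT keeps the decimal separator a dot regardless of system locale.
  def formatRows(m: Matrix): List[String] =
    m.rowIter
      .map(_.toArray.map(v => "%.4f".formatLocal(Locale.ROOT, v)).mkString("  "))
      .toList

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is good enough for a quick check.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("full-correlation-printer")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample vectors with five features each.
    val data = Seq(
      Vectors.dense(1.0, 0.5, 3.0, 2.0, 5.0),
      Vectors.dense(4.0, 5.0, 0.0, 3.0, 1.0),
      Vectors.dense(6.0, 7.0, 2.0, 8.0, 4.0),
      Vectors.dense(9.0, 1.0, 5.0, 1.0, 7.0)
    )
    val df = data.map(Tuple1.apply).toDF("features")

    // Same pattern as in the question: pull the single Matrix out of the result.
    val Row(coeff: Matrix) = Correlation.corr(df, "features").head

    println("Pearson correlation matrix:")
    formatRows(coeff).foreach(println)

    spark.stop()
  }
}
```

With this approach the number of printed columns depends only on the matrix itself, so it should keep working if more features are added, at the cost of a few extra lines compared with toString(maxLines, maxLineWidth).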