I have a square PySpark RowMatrix that looks like this:
>>> row_mat.numRows()
100
>>> row_mat.numCols()
100
>>> row_mat.rows.first()
SparseVector(100, {0: 0.0, 1: 0.0018, 2: 0.1562, 3: 0.0342...})
I would like to run pyspark.ml.feature.PCA, but its fit() method only takes a DataFrame. Is there a way to convert this RowMatrix into a DataFrame? Or is there a better way to do it?
Wrap each row vector in a one-element tuple so that toDF() can infer a single-column schema:
row_mat.rows.map(lambda x: (x, )).toDF()