In Spark's MLlib, why are the computational interfaces provided for the different distributed matrices inconsistent? For example, RowMatrix and IndexedRowMatrix provide the computeSVD method, while CoordinateMatrix and BlockMatrix do not. Why is this?
This is because the SVD algorithm needs a row-oriented (or column-oriented) matrix format. If CoordinateMatrix and BlockMatrix exposed a computeSVD method, it would have to trigger a (potentially expensive) conversion under the hood.
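
In practice you can do that conversion yourself, which at least makes the cost visible at the call site. Below is a minimal sketch, assuming a local SparkSession and a made-up 3 x 3 matrix; the object name, app name, entry values, and k = 2 are illustrative choices, not anything prescribed by MLlib.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.mllib.linalg.distributed.{CoordinateMatrix, MatrixEntry}

// Hypothetical driver object, for illustration only.
object CoordinateSvdSketch {
  def main(args: Array[String]): Unit = {
    // Local session; app name and master are assumptions for this sketch.
    val spark = SparkSession.builder()
      .appName("coordinate-svd-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // A small 3 x 3 matrix in coordinate (row, column, value) form.
    // The values are made up.
    val entries = sc.parallelize(Seq(
      MatrixEntry(0L, 0L, 1.0),
      MatrixEntry(0L, 2L, 2.0),
      MatrixEntry(1L, 1L, 3.0),
      MatrixEntry(2L, 0L, 4.0),
      MatrixEntry(2L, 2L, 5.0)
    ))
    val coordMat = new CoordinateMatrix(entries)

    // Explicit conversion to a row-oriented type. Entries are regrouped by
    // row index, which involves a shuffle; this is the step a computeSVD
    // on CoordinateMatrix would otherwise have to hide.
    val indexedRowMat = coordMat.toIndexedRowMatrix()

    // computeSVD is available once the data is row-oriented.
    val svd = indexedRowMat.computeSVD(2, computeU = true)

    println(s"Singular values: ${svd.s}")

    spark.stop()
  }
}
```

The same pattern should work for a BlockMatrix, which also has a toIndexedRowMatrix() method. If you don't need the row indices, CoordinateMatrix.toRowMatrix() gives you a plain RowMatrix instead.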