Tags: python, apache-spark, pyspark, apache-spark-mllib

How to display the result of a BlockMatrix multiplication in PySpark?


This sounds like a simple question, but I can't figure out how to display the contents of a PySpark BlockMatrix on the console. Which methods should I call on it to actually see my result?


Solution

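  • You can call toLocalMatrix(), which collects the distributed matrix onto the driver as a local DenseMatrix, so it only suits results small enough to fit in driver memory. I used an example matrix from the Spark MLlib Python API docs to illustrate: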

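    from pyspark.mllib.linalg import Matrices
    from pyspark.mllib.linalg.distributed import BlockMatrix

    # `sc` is assumed to be an active SparkContext (e.g. the one in the pyspark shell).
    # Example BlockMatrix as in the MLlib docs: two 3x2 blocks stacked vertically.
    blocks = sc.parallelize([((0, 0), Matrices.dense(3, 2, [1, 2, 3, 4, 5, 6])),
                             ((1, 0), Matrices.dense(3, 2, [7, 8, 9, 10, 11, 12]))])
    mat1 = BlockMatrix(blocks, 3, 2)

    mat = mat1.toLocalMatrix()

    # which returns a DenseMatrix (values are listed in column-major order)
    # DenseMatrix(6, 2, [1.0, 2.0, 3.0, 7.0, 8.0, 9.0, 4.0, 5.0, 6.0, 10.0, 11.0, 12.0], 0)

    # and could be further converted to a NumPy array using `.toArray()`: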
    np_mat = mat.toArray()
    # array([[ 1.,  4.],
    #        [ 2.,  5.],
    #        [ 3.,  6.],
    #        [ 7., 10.],
    #        [ 8., 11.],
    #        [ 9., 12.]])