python, numpy, matrix, sparse-matrix, lapack

Multiplication of two huge dense matrices Hadamard-multiplied by a sparse matrix


I have two dense matrices A and B, each of size 3e5x100, and a sparse binary matrix C of size 3e5x3e5. I want to compute C ∘ (AB'), where ∘ is the Hadamard (i.e., element-wise) product and B' is the transpose of B. Explicitly calculating AB' would require a crazy amount of memory (~500GB). Since the end result does not need all of AB', it is sufficient to calculate only the products A_iB_j' where C_ij != 0, where A_i is row i of matrix A and C_ij is the element at location (i,j) of matrix C. A suggested approach would be the algorithm below:

result = sparse.lil_matrix(C.shape)   # empty sparse result, same shape as C
for i, j in zip(*C.nonzero()):        # iterate over the nonzero indices of C
    result[i, j] = A[i] @ B[j]        # dot product of row i of A with row j of B

This algorithm, however, takes too much time. Is there any way to improve it using LAPACK/BLAS algorithms? I am coding in Python, so I think NumPy can serve as a more human-friendly wrapper for LAPACK/BLAS.


Solution

  • You can do this computation using the following, assuming C is stored as a scipy.sparse matrix:

    C = C.tocoo()
    result_data = C.data * (A[C.row] * B[C.col]).sum(1)
    result = sparse.coo_matrix((result_data, (C.row, C.col)), shape=C.shape)
    
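    The key line relies on NumPy integer-array (fancy) indexing: `A[C.row]` and `B[C.col]` each gather one row per nonzero of C, so the element-wise product summed along axis 1 yields the dot product A_iB_j' for every nonzero (i, j) at once. A minimal sketch with made-up toy data (the arrays `rows`/`cols` stand in for `C.row`/`C.col`):

    ```python
    import numpy as np

    # Toy data: 4x3 matrices and three hypothetical nonzero positions (i, j).
    A = np.arange(12.0).reshape(4, 3)
    B = np.arange(12.0, 24.0).reshape(4, 3)
    rows = np.array([0, 2, 3])
    cols = np.array([1, 1, 0])

    # Vectorized: gather the needed rows, multiply element-wise, sum each row.
    vectorized = (A[rows] * B[cols]).sum(1)

    # Equivalent loop: dot product of row i of A with row j of B.
    looped = np.array([A[i] @ B[j] for i, j in zip(rows, cols)])

    assert np.allclose(vectorized, looped)
    ```

    The whole batch of dot products is thus delegated to a single BLAS-backed element-wise pass instead of a Python-level loop over nonzeros.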

    Here we show that the result matches the naive algorithm for some smaller inputs:

    import numpy as np
    from scipy import sparse
    
    N = 300
    M = 10
    
    def make_C(N, nnz=1000):
      data = np.random.rand(nnz)
      row = np.random.randint(0, N, nnz)
      col = np.random.randint(0, N, nnz)
      return sparse.coo_matrix((data, (row, col)), shape=(N, N))
    
    
    A = np.random.rand(N, M)
    B = np.random.rand(N, M)
    C = make_C(N)
    
    def f_naive(C, A, B):
      return C.multiply(np.dot(A, B.T))
    
    def f_efficient(C, A, B):
      C = C.tocoo()
      result_data = C.data * (A[C.row] * B[C.col]).sum(1)
      return sparse.coo_matrix((result_data, (C.row, C.col)), shape=C.shape)
    
    np.allclose(
        f_naive(C, A, B).toarray(),
        f_efficient(C, A, B).toarray()
    )
    # True
    

    And here we see that it works for the full input size:

    N = 300000
    M = 100
    
    A = np.random.rand(N, M)
    B = np.random.rand(N, M)
    C = make_C(N)
    
    out = f_efficient(C, A, B)
    
    print(out.shape)
    # (300000, 300000)
    
    print(out.nnz)
    # 1000
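    One caveat: the temporaries `A[C.row]` and `B[C.col]` each have shape (nnz, M), so if C has very many nonzeros the gather itself can get large. A possible variant, sketched here with an arbitrarily chosen chunk size, processes the nonzeros in slices to bound that intermediate memory:

    ```python
    import numpy as np
    from scipy import sparse

    def f_chunked(C, A, B, chunk=100_000):
        # Process the nonzeros of C in slices to bound the size of the
        # temporary (chunk, M) arrays created by the fancy indexing.
        C = C.tocoo()
        out = np.empty_like(C.data)
        for start in range(0, C.nnz, chunk):
            sl = slice(start, start + chunk)
            out[sl] = C.data[sl] * (A[C.row[sl]] * B[C.col[sl]]).sum(1)
        return sparse.coo_matrix((out, (C.row, C.col)), shape=C.shape)
    ```

    This returns the same result as `f_efficient`, trading a short Python loop over chunks for a flat memory profile.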