I have two dense matrices A
and B
, and each of them has a size fo 3e5x100
. Another sparse binary matrix, C
, with size 3e5x3e5
. I want to find the following quantity: C ∘ (AB')
, where ∘
is Hadamard product (i.e., element wise) and B'
is the transpose of B
. Explicitly calculating AB'
will ask for crazy amount of memory (~500GB). Since the end result won't need the whole AB'
, it is sufficient to only calculate the multiplication A_iB_j'
where C_ij != 0
, where A_i
is the column i
of matrix A
and C_ij
is the element at location (i,j)
of the matrix C
. A suggested approach would be like the algorithm below:
result = numpy.initalize_sparse_matrix(shape = C.shape)
while True:
(i,j) = C_ij.pop_nonzero_index() #prototype function returns the nonzero index and then points to the next nonzero index
if (i,j) is empty:
break
result(i,j) = A_iB_j'
This algorithm however takes too much time. Is there anyway to improve it using LAPACK/BLAS
algorithms? I am coding in Python so I think numpy
can be more human friendly wrapper for LAPACK/BLAS
.
You can do this computation using the following, assuming C
is stored as a scipy.sparse
matrix:
C = C.tocoo()
result_data = C.data * (A[C.row] * B[C.col]).sum(1)
result = sparse.coo_matrix((result_data, (row, col)), shape=C.shape)
Here we show that the result matches the naive algorithm for some smaller inputs:
import numpy as np
from scipy import sparse
N = 300
M = 10
def make_C(N, nnz=1000):
data = np.random.rand(nnz)
row = np.random.randint(0, N, nnz)
col = np.random.randint(0, N, nnz)
return sparse.coo_matrix((data, (row, col)), shape=(N, N))
A = np.random.rand(N, M)
B = np.random.rand(N, M)
C = make_C(N)
def f_naive(C, A, B):
return C.multiply(np.dot(A, B.T))
def f_efficient(C, A, B):
C = C.tocoo()
result_data = C.data * (A[C.row] * B[C.col]).sum(1)
return sparse.coo_matrix((result_data, (C.row, C.col)), shape=C.shape)
np.allclose(
f_naive(C, A, B).toarray(),
f_efficient(C, A, B).toarray()
)
# True
And here we see that it works for the full input size:
N = 300000
M = 100
A = np.random.rand(N, M)
B = np.random.rand(N, M)
C = make_C(N)
out = f_efficient(C, A, B)
print(out.shape)
# (300000, 300000)
print(out.nnz)
# 1000