Tags: python, scipy, sparse-matrix

Converting row of sparse matrix to dense leaks memory


My program uses two scipy.sparse.csr_matrix objects: one has a single row and the other is large. In each iteration I add and subtract rows of these matrices, and eventually I need to call .todense() on the single-row matrix. I noticed that just calling this method makes memory usage grow for no apparent reason. I need to run many iterations and cannot afford this memory leak.

I was able to write a simple program that illustrates my problem:

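import numpy as np
from scipy import sparse

# A single-row matrix and a larger square matrix, both in CSR format.
a = sparse.csr_matrix(np.matrix(np.random.random((1, 250))))
b = sparse.csr_matrix(np.matrix(np.random.random((250, 250))))

for i in range(10000000):
    a = a - b[4]     # subtract one row of b from the single-row matrix
    c = a.todense()  # the dense conversion after which memory use grows
    print(i)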

When I run the program above, I see that after a certain point the memory used keeps growing and never levels off.
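
One way to put numbers on the growth, rather than watching a system monitor, is to sample the process's peak resident memory every so often. The following is only a rough sketch: the helper name peak_rss_samples and the iteration counts are made up for illustration, and it relies on the standard-library resource module, which is available on Unix-like systems only.

```python
import resource


def peak_rss_samples(step, iterations, sample_every):
    """Call step() repeatedly and record the process's peak RSS at intervals.

    ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    """
    samples = []
    for i in range(iterations):
        step()
        if (i + 1) % sample_every == 0:
            usage = resource.getrusage(resource.RUSAGE_SELF)
            samples.append(usage.ru_maxrss)
    return samples


if __name__ == "__main__":
    try:
        import numpy as np
        from scipy import sparse
    except ImportError:
        print("NumPy/SciPy not installed; skipping the demo")
    else:
        # Same shapes as in the question; the counts below are arbitrary.
        state = {"a": sparse.csr_matrix(np.random.random((1, 250)))}
        b = sparse.csr_matrix(np.random.random((250, 250)))

        def step():
            # One iteration of the loop from the question.
            state["a"] = state["a"] - b[4]
            state["a"].todense()

        print(peak_rss_samples(step, 5000, 1000))
```

Because ru_maxrss is a high-water mark, the samples can never decrease. A sequence that keeps climbing over a long run would be consistent with the behaviour described above, while a sequence that flattens out would suggest ordinary allocator warm-up.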


Solution

  • This is a bug that will be fixed in SciPy 0.18.0. There is no workaround.
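
To see whether a given installation is new enough to include the fix, the version string can be compared against 0.18. This is a minimal sketch: has_fix is a hypothetical helper, not part of SciPy, and it only looks at the major and minor numbers, so it would also report development or pre-release builds of 0.18 as fixed, which may not be accurate.

```python
import re

# The answer above names 0.18.0 as the release that contains the fix.
FIXED_IN = (0, 18)


def has_fix(version):
    """Return True if a version string such as '0.17.1' is at least 0.18."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognised version string: %r" % (version,))
    major_minor = (int(match.group(1)), int(match.group(2)))
    return major_minor >= FIXED_IN


if __name__ == "__main__":
    try:
        import scipy
    except ImportError:
        print("SciPy is not installed")
    else:
        print(scipy.__version__, has_fix(scipy.__version__))
```

Passing scipy.__version__ to the helper, as in the guarded block at the end, reports whether the running interpreter picks up a release at or after 0.18.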