Memory error when sorting a numpy memmap array


I have a numpy memmap matrix S of size 12 GB, and I'm trying to argsort each row. To do that, I have defined another memmap array, first_k, to hold the result. The problem is that a memory error is raised.

Here is the code:

first_k = np.memmap('first_k', dtype='float32', mode='w+', shape=S.shape)
first_k[:] = np.memmap.argsort(S, axis=1)

Any possible solutions? I am thinking of processing it in slices ...

Thanks in advance


Solution

  • In the end, I processed the memmap in slices, so that only one slice's argsort result is held in memory at a time. Here is the code:

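    import numpy as np

    # S is the existing memmap from the question; this code assumes it is square (N x N).
    N = S.shape[0]

    # Create the output file at its full size, then drop the mapping.
    # Indices are stored as float32 here, which is exact only below 2**24.
    first_k = np.memmap('first_k', dtype='float32', mode='w+', shape=(N, N))
    del first_k

    step = 1000  # rows per slice
    for row in range(0, N, step):
        size = min(step, N - row)
        # Map only the rows of this slice; offset is in bytes (4 bytes per float32).
        first_k = np.memmap('first_k',
                            dtype='float32',
                            mode='r+',
                            shape=(size, N),
                            offset=4 * N * row)

        # argsort builds its result in RAM, so sort one slice at a time.
        first_k[:] = np.argsort(S[row:row + size], axis=1)
        first_k.flush()
        del first_k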
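
The same slicing idea can probably be written without reopening the file at a byte offset for every slice. The following is only a minimal sketch, not the code I actually ran: the helper name `argsort_rows_in_chunks`, its parameters `out_path` and `step`, and the `int64` output dtype are assumptions made for illustration. It also assumes a 64-bit system, where mapping the whole output file at once is not a problem, and it does not require S to be square.

```python
import numpy as np


def argsort_rows_in_chunks(S, out_path, step=1000):
    """Write the row-wise argsort of the 2-D array S to a memmap at out_path.

    Hypothetical helper: the name, signature, and int64 output dtype are
    assumptions for illustration, not part of the original answer.
    """
    n_rows, n_cols = S.shape
    # One mapping for the whole output file; indices are stored as integers.
    out = np.memmap(out_path, dtype='int64', mode='w+', shape=(n_rows, n_cols))
    for start in range(0, n_rows, step):
        stop = min(start + step, n_rows)
        # Only this slice's argsort result is materialised in RAM.
        out[start:stop] = np.argsort(S[start:stop], axis=1)
    out.flush()  # push pending writes to disk
    return out
```

Usage would look like `first_k = argsort_rows_in_chunks(S, 'first_k', step=1000)`. Note that an `int64` result takes 8 bytes per element, so for a float32 S the output file would be twice the size of the input; `int32` should be enough whenever the number of columns is below 2**31.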