Tags: python, arrays, numpy, complex-numbers, memory-efficient

How to efficiently work with large complex numpy arrays?


For my research I am working with large numpy arrays consisting of complex data.

import numpy as np

arr = np.empty((15000, 25400), dtype='complex128')
np.save('array.npy', arr)

When stored, they are about 3 GB each. Loading these arrays is a time-consuming process, which made me wonder whether there are ways to speed it up.

One of the things I was thinking of was splitting the array into its real and imaginary parts:

arr_real = arr.real 
arr_im = arr.imag

and saving each part separately (a sketch of what I tried is below). However, this didn't seem to improve processing speed significantly. There is some documentation about working with large arrays, but I haven't found much information on working with complex data. Are there smart(er) ways to work with large complex arrays?
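For completeness, saving and reloading the two parts separately looked roughly like this (the filenames are just placeholders):

arr_real = arr.real
arr_im = arr.imag

# Store the two float64 halves as separate files
np.save('array_real.npy', arr_real)
np.save('array_imag.npy', arr_im)

# Recombine after loading both halves
arr = np.load('array_real.npy') + 1j * np.load('array_imag.npy')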


Solution

  • If you only need parts of the array in memory, you can load it using memory mapping:

    arr = np.load('array.npy', mmap_mode='r')
    

    From the docs:

    A memory-mapped array is kept on disk. However, it can be accessed and sliced like any ndarray. Memory mapping is especially useful for accessing small fragments of large files without reading the entire file into memory.
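
    As a rough illustration (the slice indices below are arbitrary), only the portion of the array you actually index is read from disk; the rest of the ~3 GB file is never loaded:

    import numpy as np

    # Open the file without reading it into memory
    arr = np.load('array.npy', mmap_mode='r')

    # Only this block is pulled from disk
    block = np.asarray(arr[1000:1010, :500])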