Tags: python, numpy, multidimensional-array, mask, array-merge

Efficient way of merging two numpy masked arrays


I have two numpy masked arrays which I want to merge. I'm using the following code:

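import numpy as np
import matplotlib.pyplot as plt

# First array: 1 in the top-left quadrant, masked wherever it is 0
a = np.zeros((10000, 10000), dtype=np.int16)
a[:5000, :5000] = 1
am = np.ma.masked_equal(a, 0)

# Second array: 2 in the central square, masked wherever it is 0
b = np.zeros((10000, 10000), dtype=np.int16)
b[2500:7500, 2500:7500] = 2
bm = np.ma.masked_equal(b, 0)

# Stack both arrays along a third axis, then multiply along that axis
arr = np.ma.array(np.dstack((am, bm)), mask=np.dstack((am.mask, bm.mask)))
arr = np.prod(arr, axis=2)
plt.imshow(arr)
plt.show()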

[Figure: plot of the resulting merged array]

The problem is that the np.prod() operation is very slow (about 4 seconds on my computer). Is there a more efficient way to get the same merged array?


Solution

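  • Replace the dstack() and prod() lines with a single elementwise product:

    arr = np.ma.array(am.filled(1) * bm.filled(1), mask=(am.mask & bm.mask))

    Filling the masked entries with 1 (the multiplicative identity) leaves the product unchanged, and AND-ing the two masks keeps an element masked only where both inputs are masked, which is what prod() does along the stacked axis. You no longer need prod() at all, and you avoid allocating the 3D array entirely.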
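    To check that the shortcut matches the original dstack()/prod() approach, here is a minimal sketch on small 4x4 arrays. The helper names merge_masked and merge_masked_stacked are hypothetical, introduced only for this comparison, and np.ma.getmaskarray is used as an assumption-free way to get a full boolean mask even when an input has no masked entries.

```python
import numpy as np


def merge_masked(am, bm):
    """Elementwise product; an element stays masked only where both inputs are masked."""
    # getmaskarray always returns a full boolean array, even if nothing is masked
    mask = np.ma.getmaskarray(am) & np.ma.getmaskarray(bm)
    # Fill masked entries with 1 so they do not affect the product
    data = am.filled(1) * bm.filled(1)
    return np.ma.array(data, mask=mask)


def merge_masked_stacked(am, bm):
    """Reference version: stack along a third axis and take the masked product."""
    stacked = np.ma.array(
        np.dstack((np.ma.getdata(am), np.ma.getdata(bm))),
        mask=np.dstack((np.ma.getmaskarray(am), np.ma.getmaskarray(bm))),
    )
    return stacked.prod(axis=2)


# Small stand-ins for the 10000 x 10000 arrays in the question
a = np.zeros((4, 4), dtype=np.int16)
a[:2, :2] = 1
am = np.ma.masked_equal(a, 0)

b = np.zeros((4, 4), dtype=np.int16)
b[1:3, 1:3] = 2
bm = np.ma.masked_equal(b, 0)

merged = merge_masked(am, bm)
reference = merge_masked_stacked(am, bm)

# Same mask and same unmasked values in both versions
same_mask = np.array_equal(np.ma.getmaskarray(merged), np.ma.getmaskarray(reference))
same_data = np.array_equal(merged.filled(0), reference.filled(0))
print(same_mask, same_data)
```

    One difference to be aware of: the elementwise product keeps the int16 dtype of the inputs, whereas prod() typically promotes small integer types to the platform default integer. With values like 1 and 2 this does not matter, but larger values could overflow int16 in the fast version.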