Tags: python, numpy, masked-array

numpy masked_array mask changes type


I am a bit surprised that np.ma.masked_equal (or masked_values) does not create a full mask of False values when the value is not in the array, but sets the mask to a scalar instead.

Example:

y = np.arange(10)
yy = np.ma.masked_equal(y,0)

yields a masked array whose mask is an array of 10 boolean values (True for the zero entry, False elsewhere), while

y = np.arange(1,10) 
yy = np.ma.masked_equal(y,0)

yields a masked array with the mask set to the scalar False. Since my code does not know beforehand whether the masked value matches any entry in the array, I am forced to check explicitly:

yy=np.ma.masked_values(y,0)
if np.isscalar(yy.mask):
    yy.mask=np.zeros(y.shape,dtype=bool)

This seems like unnecessary work to me. What is the reason for this behavior, and is there a way to avoid it?


Solution

  • You can simply create the MaskedArray yourself:

    yy = np.ma.MaskedArray(y, mask=(y==0))
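
As a quick check of the line above, here is a minimal sketch using the array from the question that contains no zeros. It also shows the shrink=False option of np.ma.masked_values as a possible alternative; the variable names direct and unshrunk are only illustrative.

```python
import numpy as np

y = np.arange(1, 10)  # contains no zeros, so nothing gets masked

# An explicit boolean mask is kept as a full-size array, even if all False.
direct = np.ma.MaskedArray(y, mask=(y == 0))

# masked_values takes shrink=False, which skips collapsing an all-False mask.
unshrunk = np.ma.masked_values(y, 0, shrink=False)

print(direct.mask.shape, unshrunk.mask.shape)
```

Both masks should have the same shape as y, so no extra scalar check is needed afterwards.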
    

    As for the reason: it seems that NumPy tries to reduce memory use and speed up computations when nothing in the MaskedArray is masked. The documentation describes the constant used for this:

    numpy.ma.nomask

    Value indicating that a masked array has no invalid entry. nomask is used internally to speed up computations when the mask is not needed.

    If you check:

    >>> np.ma.nomask
    False
    

    So the single False represents "no mask". You could therefore also check maskedarr.mask is np.ma.nomask (it's a guaranteed constant):

    yy = some_operation_that_creates_a_masked_array
    if yy.mask is np.ma.nomask:
        yy.mask = np.zeros(yy.shape, dtype=bool)
    

    That carries a bit more context than np.isscalar.
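
If all you need is a full-size boolean mask, np.ma.getmaskarray may save you the explicit check altogether. The sketch below assumes the same example array as in the question; full_mask is just an illustrative name.

```python
import numpy as np

y = np.arange(1, 10)
yy = np.ma.masked_equal(y, 0)  # no element equals 0

# getmaskarray returns a boolean array with the same shape as yy;
# it is all False when yy has no masked entries. yy is not modified.
full_mask = np.ma.getmaskarray(yy)
print(full_mask.shape, full_mask.any())

# Optionally write the expanded mask back, mirroring the check above
# but without the branch.
yy.mask = full_mask
print(yy.mask.shape)
```

The difference from the nomask check is that getmaskarray works the same way whether or not anything is masked, so the calling code does not have to branch.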