Inverting the "numpy.ma.compressed" operation

I want to create an array from a compressed masked array and a corresponding mask. It's easier to explain this with an example:

>>> x = np.ma.array(np.arange(4).reshape((2,2)), mask = [[True,True],[False,False]])
>>> y=x.compressed()
>>> y
array([ 2,  3])

Now I want to create an array in the same shape as x where the masked values get a standard value (for example -1) and the rest is filled up with a given array. It should work like this:

>>> z = decompress(y, mask=[[True,True],[False,False]], default=-1)
>>> z
array([[-1, -1],
       [ 2,  3]])

The question is: is there any method like "decompress", or do I need to code it myself? In Fortran this is done by the intrinsic functions "pack" and "unpack". Thanks for any suggestions.


  • While I've answered a number of np.ma questions, I'm by no means an expert with it. But I'll explore the issue.

    Let's generalize your array a bit:

    In [934]: x = np.ma.array(np.arange(6).reshape((2,3)), mask = [[True,True,False],[False,False,True]])
    In [935]: x
    masked_array(data =
     [[-- -- 2]
     [3 4 --]],
                 mask =
     [[ True  True False]
     [False False  True]],
           fill_value = 999999)
    In [936]: y=x.compressed()
    In [937]: y
    Out[937]: array([2, 3, 4])

    y has no information about x except a subset of its values. Note that it is 1d.

    x stores its values in 2 arrays (actually these are properties that access underlying ._data, ._mask attributes):

    In [938]: x.data
    array([[0, 1, 2],
           [3, 4, 5]])
    In [939]: x.mask
    array([[ True,  True, False],
           [False, False,  True]], dtype=bool)

    My guess is that to de-compress we need to make an empty masked array with the correct dtype, shape and mask, and copy the values of y into its data. But what values should be put into the masked elements of data?

    Or another way to put the problem - is it possible to copy values from y back onto x?

    A possible solution is to copy the new values to x[~x.mask]:

    In [957]: z=2*y
    In [958]: z
    Out[958]: array([4, 6, 8])
    In [959]: x[~x.mask]=z
    In [960]: x
    masked_array(data =
     [[-- -- 4]
     [6 8 --]],
                 mask =
     [[ True  True False]
     [False False  True]],
           fill_value = 999999)
    In [961]: x.data
    array([[0, 1, 4],
           [6, 8, 5]])

    Or, to make a new array with the same shape and mask as x:

    In [975]: w=np.zeros_like(x)
    In [976]: w[~w.mask]=y
    In [977]: w
    masked_array(data =
     [[-- -- 2]
     [3 4 --]],
                 mask =
     [[ True  True False]
     [False False  True]],
           fill_value = 999999)
    In [978]: w.data
    array([[0, 0, 2],
           [3, 4, 0]])

    Another approach is to make a regular array filled with the invalid (default) value, copy y into the unmasked positions with the same boolean indexing, and turn the whole thing into a masked array. It's possible that there is a masked array constructor that lets you specify only the valid values along with the mask, but I'd have to dig into the docs for that.
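
    The regular-array approach can be sketched as a small helper. The name `decompress` and its `default` parameter come from the question; it is a hypothetical function, not part of NumPy, and the sketch assumes the values were produced in row-major order, as `compressed()` does:

```python
import numpy as np

def decompress(values, mask, default=-1):
    """Hypothetical helper: scatter 1-D `values` into the unmasked cells of `mask`."""
    mask = np.asarray(mask, dtype=bool)
    values = np.asarray(values)
    if values.size != np.count_nonzero(~mask):
        raise ValueError("need exactly one value per unmasked cell")
    # Start from a regular array holding the default everywhere.
    out = np.full(mask.shape, default, dtype=values.dtype)
    # Boolean assignment fills the unmasked cells in row-major order,
    # which is the order in which compressed() emitted them.
    out[~mask] = values
    return out

y = np.array([2, 3])
mask = [[True, True], [False, False]]
z = decompress(y, mask, default=-1)   # z is [[-1, -1], [2, 3]]

# Optionally wrap the result as a masked array again.
zm = np.ma.array(z, mask=mask)
```

    Compressing `zm` again should give back `y`, which is a quick way to check the round trip.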


    Another sequence of operations that will do this, using np.place to set the values:

    In [1011]: w=np.empty_like(x)
    In [1014]: np.place(w.data,w.mask,999)
    In [1015]: np.place(w.data,~w.mask,[1,2,3])
    In [1016]: w
    masked_array(data =
     [[-- -- 1]
     [2 3 --]],
                 mask =
     [[ True  True False]
     [False False  True]],
           fill_value = 999999)
    In [1017]: w.data
    array([[999, 999,   1],
           [  2,   3, 999]])
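
    One difference from boolean assignment is worth knowing when choosing between the two. The sketch below uses a plain array and a separate `mask` (both made up for illustration) to show that `np.place` silently cycles through a value list that is too short, while boolean assignment raises:

```python
import numpy as np

mask = np.array([[True, True, False],
                 [False, False, True]])

# Same two-step fill as above, on a plain ndarray.
data = np.empty(mask.shape, dtype=int)
np.place(data, mask, 999)          # default value in the masked cells
np.place(data, ~mask, [1, 2, 3])   # payload in the unmasked cells
w = np.ma.array(data, mask=mask)

# np.place repeats a too-short value list instead of complaining...
cycled = np.zeros(mask.shape, dtype=int)
np.place(cycled, ~mask, [7, 8])    # three cells, two values: 7, 8, 7

# ...while boolean assignment refuses a length mismatch.
strict = np.zeros(mask.shape, dtype=int)
try:
    strict[~mask] = [7, 8]
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
```

    For a decompress-style operation the strict behaviour is probably the safer default, since a length mismatch usually means the mask and the values do not belong together.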


    Look at the numpy.ma source (numpy/ma/core.py):
    class _MaskedBinaryOperation:

    This class is used to implement masked binary ufuncs. It evaluates the ufunc at the valid (non-masked) cells and returns a new masked array with the new valid values, leaving the data under the mask unchanged from the original.

    For example, with a simple masked array, +1 does not change the masked value.

    In [1109]: z=np.ma.masked_equal([1,0,2],0)
    In [1110]: z
    masked_array(data = [1 -- 2],
                 mask = [False  True False],
           fill_value = 0)
    In [1111]: z.data
    Out[1111]: array([1, 0, 2])
    In [1112]: z+1
    masked_array(data = [2 -- 3],
                 mask = [False  True False],
           fill_value = 0)
    In [1113]: (z+1).data
    Out[1113]: array([2, 0, 3])
    In [1114]: z.compressed()+1
    Out[1114]: array([2, 3])

    _MaskedUnaryOperation might be simpler to follow, since it only has to work with 1 masked array.

    Example, regular log has problems with the masked 0 value:

    In [1115]: np.log(z)
    /usr/local/bin/ipython3:1: RuntimeWarning: divide by zero encountered in log
    masked_array(data = [0.0 -- 0.6931471805599453],
                 mask = [False  True False],
           fill_value = 0)

    but the masked log skips the masked entry:

    In [1117]: np.ma.log(z)
    masked_array(data = [0.0 -- 0.6931471805599453],
                 mask = [False  True False],
           fill_value = 0)
    In [1118]: np.ma.log(z).data
    Out[1118]: array([ 0.        ,  0.        ,  0.69314718])

    Oops: _MaskedUnaryOperation might not be that useful here. It evaluates the ufunc at all values, inside an errstate context to block warnings. It then uses the mask to copy the original values of the masked cells onto the result (np.copyto(result, d, where=m)).
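
    That evaluate-everywhere-then-restore pattern can be reproduced by hand. This is only an illustration of the idea described above, not the actual numpy.ma source; the names `d`, `m` and `result` follow the `np.copyto` call quoted in the text:

```python
import numpy as np

z = np.ma.masked_equal([1.0, 0.0, 2.0], 0.0)
d = z.data   # underlying values, including the masked 0.0
m = z.mask   # True where the value is masked

# Evaluate the ufunc on every cell, silencing the warning from log(0).
with np.errstate(divide="ignore", invalid="ignore"):
    result = np.log(d)            # the masked cell temporarily holds -inf

# Put the original data back wherever the mask is set.
np.copyto(result, d, where=m)

out = np.ma.array(result, mask=m)
```

    After the `copyto` call the masked cell of `result` holds its original 0.0 again rather than -inf, and the unmasked cells hold the log values.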