python, numpy, median, masked-array

How to get a single median element in a numpy masked array with an even number of entries


I have a numpy masked nd-array and need to find the median along a specific axis. In some cases the number of unmasked elements along that axis is even, and numpy.ma.median then returns the average of the two middle elements. I don't want the average; I want one of the two middle elements, and either one is fine. How do I get this?

MWE:

>>> import numpy
>>> data=numpy.arange(-5,10).reshape(3,5)
>>> mdata=numpy.ma.masked_where(data<=0,data)
>>> numpy.ma.median(mdata, axis=0)
masked_array(data=[5.0, 3.5, 4.5, 5.5, 6.5],
             mask=[False, False, False, False, False],
       fill_value=1e+20)

As you can see, in the second column it averages the two unmasked values (1 and 6) and returns a fractional value (3.5). I want either 1 or 6.


Solution

    • numpy.percentile(array, 50) gives the median value.
    • numpy.percentile has an option to set the interpolation to nearest, which returns an actual element instead of an average.
    • However, this function is not available in the numpy.ma module.
    • The trick used in this answer can be used here.

    The idea is to fill the masked values with NaN and use numpy.nanpercentile(), which ignores NaN, with nearest interpolation.

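    >>> # Cast to float (NaN needs a float dtype), then replace masked entries with NaN
    >>> mdata1 = numpy.ma.filled(mdata.astype('float'), numpy.nan)
    >>> # On NumPy >= 1.22, pass method='nearest' instead of the deprecated interpolation=
    >>> numpy.nanpercentile(mdata1, 50, axis=0, interpolation='nearest')
    array([5., 1., 2., 3., 4.])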
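
If you want to control which of the two middle elements is returned, or need the same code to run on both old and new NumPy versions, the approach above can be wrapped in a small helper. The following is only a sketch: `single_median` and its `pick` argument are hypothetical names made up for this example, not part of NumPy. It assumes that "lower" and "higher" are acceptable alternatives to "nearest", and it tries the newer `method` keyword first before falling back to `interpolation`.

```python
import numpy as np


def single_median(marr, axis=None, pick="nearest"):
    """Median of a masked array that is always an actual element.

    Hypothetical helper for illustration. `pick` is forwarded to
    numpy.nanpercentile and may be "nearest", "lower" or "higher".
    """
    # Replace masked entries with NaN so nanpercentile skips them.
    filled = np.ma.filled(marr.astype(float), np.nan)
    try:
        # NumPy >= 1.22 names the option `method`.
        result = np.nanpercentile(filled, 50, axis=axis, method=pick)
    except TypeError:
        # Older NumPy only accepts the `interpolation` keyword.
        result = np.nanpercentile(filled, 50, axis=axis, interpolation=pick)
    # Slices that were fully masked come back as NaN; mask them again.
    return np.ma.masked_invalid(result)


data = np.arange(-5, 10).reshape(3, 5)
mdata = np.ma.masked_where(data <= 0, data)

lower = single_median(mdata, axis=0, pick="lower")
higher = single_median(mdata, axis=0, pick="higher")
nearest = single_median(mdata, axis=0)

print(lower.tolist())   # [5.0, 1.0, 2.0, 3.0, 4.0]
print(higher.tolist())  # [5.0, 6.0, 7.0, 8.0, 9.0]
```

With "lower" or "higher" the choice between the two middle elements is explicit, whereas "nearest" leaves the tie-breaking to NumPy's rounding. Returning a masked array keeps the result in the same family as the input, so a column with no valid data stays masked instead of silently becoming NaN.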