python numpy floating-point precision median

Numpy median precision issues at scale


In my experiment, I have a large 2D np.ndarray X of type float64 with shape 25x431080. I want to compute the column-wise median along axis 0, which gives an array of 431080 values. Suppose I distort one row of the original array in a way that I expect not to affect the median, e.g., by setting every element of that row to a value outside the range of the original elements. My problem is that the median computation does not return exactly the same array as before.

I am wondering whether this is a typical precision issue. Is there any way around it, perhaps with another type or function?

I am attaching a randomly generated example so that one can reproduce the issue:

import numpy as np
x = np.random.uniform(-1,1,(25,431080))
med1 = np.median(x, axis = 0)
x[13,:] = -100*np.ones(x.shape[1]) # distort one row to -100
med2 = np.median(x, axis = 0)
np.array_equal(med1, med2) # returns False

Note: re-computation of the median on the same array gives exactly the same result so there is no precision loss or any other change across different runs of the program.


Solution

  • I am not sure that your assumption is correct. Why should changing a value of the array to -100 not also change the median?

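    import numpy as np

    # Draw random arrays until replacing one element with -100 changes the median.
    while True:
        x1 = np.round(np.random.uniform(-1, 1, 10), 2)
        x2 = x1.copy()
        x2[3] = -100  # distort a single element

        m1 = np.median(x1)
        m2 = np.median(x2)

        if m1 != m2:
            print(x1)
            print(x2)
            print(m1, m2)
            break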

    Or maybe even simpler: take the array [1, 2, 3], whose median is 2. Replacing one of its values with -100 changes the median in general, but sometimes you are lucky. If you replace a value smaller than the median with -100, the median stays the same; if you replace a value greater than or equal to the median, the median changes.

    x1 = [1,    2,    3] -> 2
    x2 = [-100, 2,    3] -> 2
    x3 = [1, -100,    3] -> 1
    x4 = [1,    2, -100] -> 1
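
The same rule can be checked on data shaped like the question's. The following is a minimal sketch, not the asker's exact setup: the seeded generator, the reduced column count (1000 instead of 431080), and the names `row`, `old_row`, `changed`, and `was_at_or_above` are assumptions made for illustration. It tests whether the columns whose median changed are exactly those where the overwritten value was greater than or equal to the original median, which would mean the difference comes from the data rather than from floating-point precision.

```python
import numpy as np

# Seeded generator and a smaller column count are assumptions for a quick run.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, (25, 1000))
row = 13  # the row that gets distorted, as in the question

med1 = np.median(x, axis=0)
old_row = x[row].copy()      # remember the values before overwriting them
x[row] = -100                # distort the whole row
med2 = np.median(x, axis=0)

# Columns where the median moved after the distortion.
changed = med1 != med2
# Columns where the overwritten value was at or above the original median.
was_at_or_above = old_row >= med1

# If the rule holds, the two boolean masks are identical.
print(np.array_equal(changed, was_at_or_above))
# Fraction of affected columns and the size of the largest change.
print(changed.mean(), np.abs(med1 - med2).max())
```

With 25 distinct values per column, the median is the 13th smallest value. Replacing one of the 12 smaller values with -100 leaves it in place, while replacing any of the other 13 values shifts the median down to what was the 12th smallest. Roughly half of the columns should therefore change, and the size of the change reflects the gap between neighbouring values rather than rounding error.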