python pytorch median torch

PyTorch median: is it a bug or am I using it wrong?


I am trying to get the median of each row of a 2D torch.tensor, but the result is not what I expect when compared with a plain Python list or NumPy:

import torch
import numpy as np
from statistics import median

print(torch.__version__)
# 0.4.1

# Two identical rows with an even number of elements (6 each)
y = [[1, 2, 3, 5, 9, 1], [1, 2, 3, 5, 9, 1]]
print(median(y[0]))
# 2.5

print(np.median(y, axis=1))
# [2.5 2.5]

yt = torch.tensor(y, dtype=torch.float32)
# median(dim) returns (values, indices); [0] selects the values
print(yt.median(1)[0])
# tensor([2., 2.])

Solution

  • This appears to be the intended behaviour of Torch, as discussed in these threads:

    https://github.com/pytorch/pytorch/issues/1837
    https://github.com/torch/torch7/pull/182

    The reasoning, as given in the links above:

    "Median returns 'middle' element in case of odd-many elements, otherwise one-before-middle element (could also do the other convention to take mean of the two around-the-middle elements, but that would be twice more expensive, so I decided for this one)."
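
The two conventions described in the quote can be sketched in plain Python. This is only an illustration of the rule as quoted, not Torch's actual implementation; the helper names lower_median and averaged_median are hypothetical and do not come from any library.

```python
from statistics import median, median_low


def lower_median(values):
    """'One-before-middle' convention: for an even count, return the
    lower of the two middle values; for an odd count, the middle one."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


def averaged_median(values):
    """Convention used by statistics.median and np.median: for an even
    count, return the mean of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


row = [1, 2, 3, 5, 9, 1]          # sorted: [1, 1, 2, 3, 5, 9]
print(lower_median(row))          # 2, matching the torch result above
print(averaged_median(row))       # 2.5, matching statistics.median and np.median
print(median_low(row), median(row))  # standard-library equivalents: 2 2.5
```

If the averaged value is needed in PyTorch, one option is to sort each row with torch.sort and average the two middle columns. More recent PyTorch releases also provide torch.quantile, which interpolates linearly by default, so q=0.5 along dim=1 should give the NumPy-style result.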