I tried searching, but it did not turn up anything relevant. Let's say we have a Series with an even number of elements, and we want to calculate its median:
pd.Series([4, 6, 8, 10]).median()
Since we have an even number of elements, there's no element that is exactly in the middle, so instead the method performs the calculation: (6 + 8) / 2 = 7. However, for my purposes it is very important that the median is a number that already exists in the Series; it can't be something calculated from scratch. So I'd rather pick either 6 or 8 than use 7.
One possible solution is to detect that there is an even number of elements and, in such cases, add another element that is guaranteed to be the largest or the smallest, and then delete it after I get the median. But this solution seems rather clumsy even for a single Series. And if we're dealing with a SeriesGroupBy object instead, where such a median has to be calculated for each group separately, I can't even begin to imagine how to implement that.
It looks like there's no parameter in the median() method that makes it select one of the two nearest elements instead of averaging them, and I can't find any alternative to the median() method that can do that either. Is implementing my own median function my only choice?
Instead of median(), you should probably use the quantile() method (its default, q=0.5, is the median) and set the interpolation parameter to 'higher', 'lower', or 'nearest'.
E.g.
>>> pd.Series([4, 6, 8, 10]).quantile(q=0.5, interpolation='nearest')
8
>>> pd.Series([4, 6, 8, 10]).quantile(q=0.5, interpolation='higher')
8
>>> pd.Series([4, 6, 8, 10]).quantile(q=0.5, interpolation='lower')
6