Search code examples
mysqlmedian

SQL - Calculating median without sorting


I have an SQL table with one column of integer values and I would like to find the median of those values without any sorting. I first tried finding the numbers in the table that are greater than and less than each value:

SELECT DISTINCT d1.v,       
  (SELECT COUNT(d2.v)
   FROM Values d2  
   WHERE d2.v > d1.v
   ) AS greater_than,
   (SELECT COUNT(d2.v)
   FROM Values d2  
   WHERE d2.v < d1.v
   ) AS less_than
FROM Values d1

enter image description here

I'm not sure how to proceed. I believe I want the values in the above table where greater_than and less_than are both equal to num_entries / 2, but that would only work for a table with an even number of entries. What's a good way to get the median from what I have above?


Solution

  • You could do it like this:

    SELECT (MIN(v)+MAX(v))/2
    FROM   (
            SELECT  CASE WHEN
                            LEAST((SELECT COUNT(*) FROM tbl WHERE v <= d1.v),
                                  (SELECT COUNT(*) FROM tbl WHERE v >= d1.v))
                               >= (SELECT COUNT(*)/2 FROM tbl) 
                         THEN v
                    END as v
            FROM    tbl d1
           ) as sub
    

    The inner query is guaranteed to return 1 or 2 distinct non-null values among potentially many null values. The non-null values may repeat, but by taking the minimum and maximum those two values can be used to calculate the median.

    NB: Don't name your table Values: it is a reserved word.