I have an SQL table with one column of integer values and I would like to find the median of those values without any sorting. I first tried finding the numbers in the table that are greater than and less than each value:
SELECT DISTINCT d1.v,
(SELECT COUNT(d2.v)
FROM Values d2
WHERE d2.v > d1.v
) AS greater_than,
(SELECT COUNT(d2.v)
FROM Values d2
WHERE d2.v < d1.v
) AS less_than
FROM Values d1
I'm not sure how to proceed. I believe I want the values in the above table where greater_than
and less_than
are both equal to num_entries / 2
, but that would only work for a table with an even number of entries. What's a good way to get the median from what I have above?
You could do it like this:
SELECT (MIN(v)+MAX(v))/2
FROM (
SELECT CASE WHEN
LEAST((SELECT COUNT(*) FROM tbl WHERE v <= d1.v),
(SELECT COUNT(*) FROM tbl WHERE v >= d1.v))
>= (SELECT COUNT(*)/2 FROM tbl)
THEN v
END as v
FROM tbl d1
) as sub
The inner query is guaranteed to return 1 or 2 distinct non-null
values among potentially many null
values. The non-null
values may repeat, but by taking the minimum and maximum those two values can be used to calculate the median.
NB: Don't name your table Values
: it is a reserved word.