Search code examples
sqlmysqlstatisticsmedian

Simple way to calculate median with MySQL


What's the simplest (and hopefully not too slow) way to calculate the median with MySQL? I've used AVG(x) for finding the mean, but I'm having a hard time finding a simple way of calculating the median. For now, I'm returning all the rows to PHP, doing a sort, and then picking the middle row, but surely there must be some simple way of doing it in a single MySQL query.

Example data:

id | val
--------
 1    4
 2    7
 3    2
 4    2
 5    9
 6    8
 7    3

Sorting on val gives 2 2 3 4 7 8 9, so the median should be 4, versus SELECT AVG(val) which == 5.


Solution

  • In MariaDB / MySQL:

    SELECT AVG(dd.val) as median_val
    FROM (
    SELECT d.val, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
      FROM data d, (SELECT @rownum:=0) r
      WHERE d.val is NOT NULL
      -- put some where clause here
      ORDER BY d.val
    ) as dd
    WHERE dd.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) );
    

    Steve Cohen points out, that after the first pass, @rownum will contain the total number of rows. This can be used to determine the median, so no second pass or join is needed.

    Also AVG(dd.val) and dd.row_number IN(...) is used to correctly produce a median when there are an even number of records. Reasoning:

    SELECT FLOOR((3+1)/2),FLOOR((3+2)/2); -- when total_rows is 3, avg rows 2 and 2
    SELECT FLOOR((4+1)/2),FLOOR((4+2)/2); -- when total_rows is 4, avg rows 2 and 3
    

    Finally, MariaDB 10.3.3+ contains a MEDIAN function