Search code examples
mysqlmedian

mysql median value per month


I am doing some queries from a database to make graphs, and a graph that may be needed is the median of a set of values per month. I know how to get the median of the whole set of our data, but for some reason I can't get GROUP BY MONTH(..) to separate the months so it returns the median per month.

Here is the data that I have and I'd like for every month to get the median value for each month regardless of build type.

'Development', 1013.0164, 'June'
'Development', 1170.8999, 'July'
'Development', 671.2837, 'August'
'Flash Assets', 2961.3832, 'June'
'Flash Assets', 6662.2335, 'July'
'Flash Assets', 3902.5000, 'August'
'Release', 54.5499, 'June'
'Release', 62.4832, 'July'
'Release', 398.8500, 'August'
'Repackage', 1360.0834, 'June'
'Repackage', 6286.8505, 'July'
'Repackage', 1274.7833, 'August'
'Component', 16378.0161, 'June'
'Component', 6063.5482, 'July'
'Component', 23663.2496, 'August'
'Source Diff', 1503.8834, 'June'
'Source Diff', 1051.4500, 'July'
'Source Diff', 73.7002, 'August'

I'd like to end up with this,

June, XXXX
July, XXXX
August, XXXX

Thanks.

EDIT: Current Query

This is the query that I am using right now to get the overall median, I am not sure how to translate this to getting it per month.

 SELECT t.Data AS 'Median' FROM
 (SELECT CEIL(COUNT(*)/2.0) as 'Middle', s.Data as 'Data' FROM
 (SELECT bt.name as 'Labels', 
 SUM(TIME_TO_SEC(TIMEDIFF(bs.eventtime, b.submittime))/60.0) 
 AS 'Data', MONTHNAME(b.submittime) FROM builds b 
 JOIN buildstatuses bs ON bs.buildid = b.id 
 JOIN buildtypes bt ON bt.id = b.buildtype 
 WHERE MONTH(b.submittime) BETWEEN MONTH(CURDATE())-2 AND MONTH(CURDATE()) 
 AND bs.status LIKE 'Started HANDLER' 
 GROUP BY b.buildtype, MONTH(b.submittime) ORDER BY 'Data' ) s )t;

Solution

  • if your value is always in float ... you can try this --

    assume Data is the value column

    select mth,
    substring_index
    (
      substring_index
      (
        group_concat(Data order by Data), ',', 
        count(*)/2+1
      ), 
      ',', -1
    )
    from your_tables
    group by mth;
    
    • group_concat will sort the values for month in ascending order
    • then use of first substring_index will return from first value to middle value +1 more position
    • then next substring_index will help you to ge tthe right most value (which is the median)