Search code examples
mysqlsqlsubquerymedian

MySQL median subquery in select


This is what I have at the moment: http://sqlfiddle.com/#!9/30a97c/10

Schema:

CREATE TABLE IF NOT EXISTS `test` (
  `t_id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `t_city` varchar(64) NOT NULL,
  `t_category` enum('cat1','cat2') NOT NULL,
  `t_type` enum('type1','type2') NOT NULL,
  `t_num` int(8) NOT NULL,
  PRIMARY KEY (`t_id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1;

INSERT INTO `test` (`t_id`, `t_city`, `t_category`, `t_type`, `t_num`) VALUES
(1, 'New York', 'cat1', 'type1', 1056),
(2, 'New York', 'cat1', 'type1', 3756),
(3, 'London', 'cat1', 'type2', 3456),
(4, 'New York', 'cat1', 'type2', 5756),
(5, 'London', 'cat1', 'type2', 3777),
(6, 'New York', 'cat2', 'type1', 3756),
(7, 'New York', 'cat2', 'type1', 3756),
(8, 'London', 'cat2', 'type1', 3777),
(9, 'New York', 'cat2', 'type2', 4556),
(10, 'Berlin', 'cat1', 'type2', 1556),
(11, 'Berlin', 'cat2', 'type2', 9756),
(12, 'London', 'cat1', 'type2', 1756);

Query:

SELECT "cat1" as st_category, "type2" as st_type, t_city as st_name, 
(
SELECT (ROUND(AVG(dd.t_num) / 10) * 10) as median
FROM (
SELECT d.t_num, @rownum:=@rownum+1 as `row_number`, @total_rows:=@rownum
  FROM test as d, (SELECT @rownum:=0) as r
  WHERE d.t_num is NOT NULL
  AND `t_category` = "cat1" AND `t_type` = "type2" AND `t_city` = "XXXXX"  
  ORDER BY d.t_num
) as dd
WHERE dd.row_number IN ( FLOOR((@total_rows+1)/2), FLOOR((@total_rows+2)/2) )
) as st_median,
COUNT(t_num) as st_count
FROM `test` WHERE `t_category` = "cat1" AND `t_type` = "type2"
AND t_city in ("London", "New York")
GROUP BY t_city
ORDER BY st_name ASC

I don't know what to write in place of XXX (in the subquery) so that the median values of "t_num" in the output lines are not NULL. No median function in my MySQL.

OUTPUT NOW:

+-------------+---------+----------+-----------+----------+
| st_category | st_type |  st_name | st_median | st_count |
+-------------+---------+----------+-----------+----------+
|        cat1 |   type2 |   London |    (null) |        3 |
|        cat1 |   type2 | New York |    (null) |        1 |
+-------------+---------+----------+-----------+----------+

REQUIRED OUTPUT:

+-------------+---------+----------+-----------+----------+
| st_category | st_type |  st_name | st_median | st_count |
+-------------+---------+----------+-----------+----------+
|        cat1 |   type2 |   London |      3460 |        3 |
|        cat1 |   type2 | New York |      5760 |        1 |
+-------------+---------+----------+-----------+----------+

Solution

  • Thank you all for your help! This is the right solution:

    SET SESSION GROUP_CONCAT_MAX_LEN = 1000000;
    SELECT `t_category` AS st_category,
           `t_type`     AS st_type,
           `t_city`     AS st_name,
           CASE ( COUNT(*) % 2 )
             WHEN 1 THEN 
                ROUND(SUBSTRING_INDEX(SUBSTRING_INDEX(GROUP_CONCAT(`t_num` 
                    ORDER BY `t_num` SEPARATOR ','), ',', ( COUNT(*) + 1 ) / 2), ',', -1)
                / 10) * 10
             ELSE 
                ROUND((SUBSTRING_INDEX( SUBSTRING_INDEX( GROUP_CONCAT(`t_num`
                    ORDER BY `t_num` SEPARATOR ','), ',', COUNT(*) / 2), ',', -1)
                     + SUBSTRING_INDEX( SUBSTRING_INDEX( GROUP_CONCAT(`t_num`
                    ORDER BY `t_num` SEPARATOR ','), ',', (COUNT(*) + 1) / 2), ',', -1) ) / 2
                / 10) * 10
           END               st_median,
           COUNT(`t_num`) AS st_count
    FROM   `test`
    WHERE  `t_category` = "cat1"
           AND `t_type` = "type2"
           AND `t_city` IN ( "London", "New York" )
    GROUP  BY `t_city`
    ORDER  BY st_name ASC; 
    

    SQL Fiddle: http://sqlfiddle.com/#!9/30a97c/13

    The GROUP_CONCAT_MAX_LEN setting is only required if you are working a lot of data. Maximum value is 18,446,744,073,709,551,615 on 64-bit platforms. On 32-bit platforms: 4,294,967,295.

    https://dev.mysql.com/doc/refman/5.7/en/server-system-variables.html#sysvar_group_concat_max_len