Search code examples
sqlmysqlaggregate-functionsdatetime-generation

MYSQL select count of rows that fall in a month for every month of the year


With the table:

id   date_from    date_to
---------------------------
1    2010-01-01   2010-03-01
2    2010-02-07   2010-05-01
3    2010-07-05   2010-07-10

I am trying to return a result which has one row for each month of the year that lets me know how many rows were active in that period.

So for the above I would want the result

2010-01-01 1
2010-02-01 2
2010-03-01 2
2010-04-01 1
2010-05-01 1
2010-06-01 0
2010-07-01 1
2010-08-01 0
2010-09-01 0
etc...

I've tried grouping my MONTH(date_from) , but that doesn't return the rows with no results


Solution

  • MySQL doesn't have recursive functionality, so you're left with using the NUMBERS table trick -

    1. Create a table that only holds incrementing numbers - easy to do using an auto_increment:

      DROP TABLE IF EXISTS `example`.`numbers`;
      CREATE TABLE  `example`.`numbers` (
        `id` int(10) unsigned NOT NULL auto_increment,
         PRIMARY KEY  (`id`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
      
    2. Populate the table using:

      INSERT INTO NUMBERS
        (id)
      VALUES
        (NULL)
      

      ...for as many values as you need.

    3. Use DATE_ADD to construct a list of times, increasing the months based on the NUMBERS.id value:

      SELECT x.*
        FROM (SELECT DATE_FORMAT(DATE_ADD('2010-01-01', INTERVAL n.id - 1 MONTH), '%Y-%m-%d') 
                FROM numbers n) x
      
    4. LEFT JOIN onto your table of data based on the time portion:

         SELECT x.ts AS timestamp,
                SUM(CASE WHEN x.ts BETWEEN y.date_from AND y.date_to THEN 1 ELSE 0 END) AS cnt
           FROM (SELECT DATE_FORMAT(DATE_ADD('2010-01-01', INTERVAL n.id - 1 MONTH), '%Y-%m-%d') AS ts
                   FROM numbers n) x
      LEFT JOIN TABLE y ON x.ts BETWEEN y.date_from AND y.date_to
       GROUP BY x.ts