
MySQL: how to fill missing dates in a range?


I have a table with two columns, date and score. It has at most 30 entries, one for each of the last 30 days.

date      score
-----------------
1.8.2010  19
2.8.2010  21
4.8.2010  14
7.8.2010  10
10.8.2010 14

My problem is that some dates are missing - I want to see:

date      score
-----------------
1.8.2010  19
2.8.2010  21
3.8.2010  0
4.8.2010  14
5.8.2010  0
6.8.2010  0
7.8.2010  10
...

What I need from a single query is: 19,21,0,14,0,0,10,0,0,14... That is, the missing dates are filled with 0.

I know how to fetch all the values and fill in the blanks by iterating through the dates in a server-side language. But is it possible to do this in MySQL, so that the result is sorted by date and includes the missing dates?

EDIT: The table has another column named UserID. I have 30,000 users, and some of them have scores in this table. Every day I delete the rows older than 30 days, because I only need the last 30 days of scores for each user. The reason is that I am plotting a chart of user activity over the last 30 days, and to plot it I need the 30 values separated by commas. So I want to ask the query for the activity of USERID=10203 and get back 30 scores, one for each of the last 30 days. I hope that is clearer now.


Solution

  • MySQL versions before 8.0 don't have recursive functionality (recursive common table expressions were added in 8.0), so on those versions you're left with using the NUMBERS table trick -

    1. Create a table that only holds incrementing numbers - easy to do using an auto_increment:

      DROP TABLE IF EXISTS `example`.`numbers`;
      CREATE TABLE  `example`.`numbers` (
        `id` int(10) unsigned NOT NULL auto_increment,
         PRIMARY KEY  (`id`)
      ) ENGINE=InnoDB DEFAULT CHARSET=latin1;
      
    2. Populate the table using:

      INSERT INTO `example`.`numbers`
        ( `id` )
      VALUES
        ( NULL )
      

      ...repeated for as many values as you need (at least one row per day in the date range).

    3. Use DATE_ADD to construct a list of dates, increasing the days based on the NUMBERS.id value. Replace "2010-06-06" and "2010-06-14" with your respective start and end dates (but use the same format, YYYY-MM-DD) -

      SELECT `x`.*
        FROM (SELECT DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) AS `ts`
                FROM `numbers` `n`
               WHERE DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) <= '2010-06-14' ) x
      
    4. LEFT JOIN onto your table of data based on the date:

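         -- `your_table` is a placeholder for the table holding `date` and `score`.
         -- `ts` stays in the default YYYY-MM-DD form so the join compares dates,
         -- not formatted strings. STR_TO_DATE is only needed if `date` is stored
         -- as a 'd.m.Y' string; for a DATE column compare `y`.`date` directly.
         SELECT `x`.`ts` AS `timestamp`,
                COALESCE(`y`.`score`, 0) AS `cnt`
           FROM (SELECT DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) AS `ts`
                   FROM `numbers` `n`
                  WHERE DATE_ADD('2010-06-06', INTERVAL `n`.`id` - 1 DAY) <= '2010-06-14') x
      LEFT JOIN `your_table` `y` ON STR_TO_DATE(`y`.`date`, '%d.%m.%Y') = `x`.`ts`
       ORDER BY `x`.`ts`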
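
For the per-user chart described in the question's EDIT, the same join can be narrowed to one user and collapsed into a single comma-separated string. This is only a sketch under assumptions that are not in the original post: the data table is called `scores`, its `date` column is a real DATE, there is at most one row per user per day, and `numbers` holds at least the ids 1 to 30 without gaps.

```sql
-- Hypothetical names: `scores` (table), `scores_csv` (output column).
-- Builds the last 30 calendar days ending today, oldest first,
-- and fills days without a row for the user with 0.
SELECT GROUP_CONCAT(COALESCE(`y`.`score`, 0)
                    ORDER BY `x`.`ts`
                    SEPARATOR ',') AS `scores_csv`
  FROM (SELECT DATE_SUB(CURDATE(), INTERVAL `n`.`id` - 1 DAY) AS `ts`
          FROM `numbers` `n`
         WHERE `n`.`id` <= 30) `x`
  LEFT JOIN `scores` `y`
         ON `y`.`date` = `x`.`ts`
        AND `y`.`UserID` = 10203;
```

The user filter sits in the ON clause on purpose. Moving it to a WHERE clause would discard the generated days that have no matching row, which are exactly the days that should show up as 0.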
      

    If you want to maintain the date format, use the DATE_FORMAT function:

    DATE_FORMAT(`x`.`ts`, '%d.%m.%Y') AS `timestamp`
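
Step 2 leaves the repetition of the INSERT to the reader. One possible way to fill the table in a single statement is to cross join two small digit lists and insert explicit ids. The sketch below assumes `example`.`numbers` is still empty; the aliases `ones`, `tens` and `d` are made up for the illustration.

```sql
-- Inserts the ids 1..100 explicitly, so the sequence has no gaps
-- regardless of how auto_increment would have numbered the rows.
INSERT INTO `example`.`numbers` (`id`)
SELECT `tens`.`d` * 10 + `ones`.`d` + 1
  FROM (SELECT 0 AS `d` UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
        UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) `ones`
 CROSS JOIN
       (SELECT 0 AS `d` UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3
        UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6
        UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) `tens`;

-- Sanity check: should report 100 rows with ids from 1 to 100.
SELECT COUNT(*) AS `row_count`, MIN(`id`) AS `min_id`, MAX(`id`) AS `max_id`
  FROM `example`.`numbers`;
```

One hundred rows is more than the 30-day range needs; the surplus is harmless because the date subquery only keeps rows that fall inside the requested range.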
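
On MySQL 8.0 or later the NUMBERS table is not required, because the date list can be generated with a recursive common table expression. The following is a sketch of that alternative, not part of the original answer; it assumes a hypothetical table `scores` whose `date` column is a real DATE, and it reuses the example range from step 3.

```sql
-- Requires MySQL 8.0+. `dates`, `d` and `scores` are illustrative names.
WITH RECURSIVE `dates` (`ts`) AS (
    -- Anchor row: the first day of the range, typed as DATE.
    SELECT CAST('2010-06-06' AS DATE)
    UNION ALL
    -- Recursive step: add one day until the end date is reached.
    SELECT `ts` + INTERVAL 1 DAY
      FROM `dates`
     WHERE `ts` < '2010-06-14'
)
SELECT `d`.`ts` AS `timestamp`,
       COALESCE(`y`.`score`, 0) AS `cnt`
  FROM `dates` `d`
  LEFT JOIN `scores` `y` ON `y`.`date` = `d`.`ts`
 ORDER BY `d`.`ts`;
```

For long ranges, keep in mind that MySQL caps recursion depth through the `cte_max_recursion_depth` system variable, so a range of many years may need that limit raised.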