Search code examples
mysqlconflictdate-range

MySQL - Exclude conflicting date ranges


Imagine this list of date ranges

> SELECT * FROM range_table;
+----+------------+------------+
| id | start      | end        |
+----+------------+------------+
|  1 | 2014-01-01 | 2014-01-15 | /* -- Not conflicting */
|  2 | 2014-01-15 | 2014-01-16 | /* -- Conflicting     */
|  3 | 2014-01-15 | 2014-01-20 | /* |                  */
|  4 | 2014-01-15 | 2014-01-19 | /* /                  */
|  5 | 2014-01-24 | 2014-01-26 | /* -- Conflicting     */
|  6 | 2014-01-21 | 2014-01-25 | /* /                  */
+----+------------+------------+

I'm trying to remove the conflicting ranges and keep only the first occurrence of each conflicting set.

Here is what I should have at the end:

+----+------------+------------+
| id | start      | end        |
+----+------------+------------+
|  1 | 2014-01-01 | 2014-01-15 |
|  2 | 2014-01-15 | 2014-01-16 |
|  5 | 2014-01-24 | 2014-01-26 |
+----+------------+------------+

Here is the fiddle


Solution

  • Here is my possible solution:

    SELECT `id`, `start`, `end`
    FROM (
      SELECT
        r1.id
      , r1.start
      , r1.end
      , COUNT(DISTINCT r2.id) AS "conflicts"
      , MD5(GROUP_CONCAT(DISTINCT r2.id ORDER BY r2.id)) AS "group_chksum"
      FROM range_table AS r1
      LEFT JOIN range_table AS r2
        ON (r1.end > r2.start AND r1.start < r2.end)
      GROUP BY r1.id
    ) AS tmp
    GROUP BY group_chksum
    ;
    

    The idea is to group the result by set of ranges and benefit of the tolerance of MySQL to take the first of each.

    I'm sure there is simpler