Search code examples
sql-serversql-server-2012group-concatgaps-and-islands

query that groups by different columns including time difference


I have this simple query that returns a weeks worth of soccer games.

select game_group, game_name, game_date, game_StartTime, game_EndTime
from Games
where game_date between '10/1/2018' and '10/7/2018'

With results like this:

 Junior   | blue_red     | 10/1 | 8:00  | 8:30
 Junior   | blue_red     | 10/1 | 8:30  | 9:00
 Junior   | blue_green   | 10/1 | 9:00  | 10:30
 Freshman | brown_purple | 10/1 | 12:00 | 12:30
 Freshman | black_white  | 10/1 | 12:30 | 1:00
 Freshman | yellow_pink  | 10/3 | 3:45  | 5:00
 Senior   | blue_orange  | 10/4 | 7:00  | 7:30
 Sophmore | pink_red     | 10/4 | 7:30  | 8:30
 Sophmore | green_purple | 10/4 | 8:30  | 9:00

Is there a way to manipulate the query, so that it groups the games by date and times? The logic would be, combine games that are in the same group, on the same date and within 30 minutes or less of each other...so the grouping factors would be game_group, game_date, and the 30 minute difference between game_StartTime and game_EndTime...

like this:

  • schedule # 1 is: Junior, blue_red;blue_red;blue_green, 10/1, 8:00 to 10:30
  • schedule # 2 is: Freshman, brown_purple;black_white; 10/1, 12:00 to 1:00
  • schedule # 3 is: Freshman, yellow_pink, 10/3, 3:45 to 5:00
  • schedule # 4 is: Senior, blue_orange, 10/4, 7:00 to 7:30
  • schedule # 5 is: Sophmore, pink_red; green_purple, 10/4, 7:30 to 9:00

Solution

  • What you actually need to do is to use a combination of approaches to common problems.

    You first need to use a "Gaps and Islands" approach to get a CTE or derived table that looks like this (using abbreviations and partial sample data to keep it short):

    Group   Game     Date   Start   End    GroupSequence
    Fr      B-P      10/1   12:00   12:30  1
    Fr      B-W      10/1   12:30    1:00  1
    Fr      Y-P      10/1    3:45    5:00  2
    

    GroupSequence is an artificial column that you calculate in the CTE using a "Gaps and Islands" solution. All the games in the same group, on the same day, that are less than 30 minutes from the previous end to the subsequent start will share the same GroupSequence.

    Then you select from that CTE, grouping by Group, Date, and GroupSequence, using the GROUP_CONCAT technique to build the Games column, and building your Time Range columns with the MIN(Start) and the MAX(End).