Search code examples
sqlamazon-redshiftgaps-and-islands

Redshift: Find MAX in list disregarding non-incremental numbers


I work for a sports film analysis company. We have teams with unique team IDs and I would like to find the number of consecutive weeks they have uploaded film to our site moving backwards from today. Each upload also has its own row in a separate table that I can join on teamid and has a unique date of when it was uploaded. So far I put together a simple query that pulls each unique DATEDIFF(week) value and groups on teamid.

Select teamid, MAX(weekdiff)
(Select teamid, DATEDIFF(week, dateuploaded, GETDATE()) as weekdiff 
from leroy_events
group by teamid, weekdiff)

What I am given is a list of teamIDs and unique weekly date differences. I would like to then find the max for each teamID without breaking an increment of 1. For example, if my data set is:

Team     datediff
11453    0
11453    1
11453    2
11453    5
11453    7
11453    13

I would like the max value for team: 11453 to be 2.

Any ideas would be awesome.


Solution

  • I have simplified your example assuming that I already have a table with weekdiff column. That would be what you're doing with DATEDIFF to calculate it.

    First, I'm using LAG() window function to assign previous value (in ordered set) of a weekdiff to the current row.

    Then, using a WHERE condition I'm retrieving max(weekdiff) value that has a previous value which is current_value - 1 for consecutive weekdiffs.

    Data:

    create table leroy_events ( teamid int, weekdiff int);
    insert into leroy_events values (11453,0),(11453,1),(11453,2),(11453,5),(11453,7),(11453,13);
    

    Code:

    WITH initial_data AS (
    Select 
        teamid, 
        weekdiff,
        lag(weekdiff,1) over (partition by teamid order by weekdiff) as lag_weekdiff
    from
        leroy_events
    )
    SELECT
      teamid,
      max(weekdiff) AS max_weekdiff_consecutive
    FROM
      initial_data
    WHERE weekdiff = lag_weekdiff + 1 -- this insures retrieving max() without breaking your consecutive increment
    GROUP BY 1
    

    SQLFiddle with your sample data to see how this code works.

    Result:

    teamid  max_weekdiff_consecutive
    11453   2