Search code examples
sqlpostgresqldata-analysiswindow-functions

How to classify groups based on changes from the previous period in PostgreSQL?


Good day to everyone! I have a table with some teams and their rating for the period. Data looks like this:

team date rating
team_1 2022-09 7.5
team_1 2022-12 8
team_1 2023-03 8.5
team_2 2022-09 8
team_2 2022-12 7
team_2 2023-03 8
team_3 2022-09 8
team_3 2022-12 8
team_3 2023-03 7
team_4 2022-09 9
team_4 2022-12 8
team_4 2023-03 7

As we can see, rating of the team may increase or decrease comparing to the last period, or it can change for two consecutive periods. Thus, team may be classified to one of following groups:

  1. Increasing for two consecutive periods
  2. Increased compared to the last period
  3. Decreased compared to the last period
  4. Decreasing for two consecutive periods

In this sample, the result should be like this:

team status
team_1 Increasing for two consecutive periods
team_2 Increased compared to the last period
team_3 Decreased compared to the last period
team_4 Decreasing for two consecutive periods

My question is how can I get such result? I suspect there should be some window function, but I'm not sure how exactly to use it.


Solution

  • Yes, window functions, like here:

    select team, case sgn when  1 then 'increasing ' 
                          when -1 then 'decreasing '
                          else 'stable '
                 end ||'for '||cnt||' period(s)' status
    from (
      select team, dt, sgn, rn, 
             count(1) over (partition by team, sgn order by dt) cnt
      from ( 
        select team, dt, rating, 
               sign(rating - lag(rating) over (partition by team order by dt)) sgn, 
               row_number() over (partition by team order by dt desc) rn 
          from teams ) t1 ) t2
    where rn = 1
    

    dbfiddle demo

    Compare ratings for each row using lag(), assign sign to get values -1, 0 or 1. Then count these consecutive values for each team and take last row, found with row_number(). Use case ... when to present values as you wanted.


    Edit:

    In the above query I assumed, that there can be maximum three enntries for each team, because these were possible team statuses. In case there are more entries gap-n-islands problem need to be solved at first, so query has to be modified like:

    select team, case sgn when  1 then 'increasing ' 
                          when -1 then 'decreasing '
                          else 'stable '
                 end ||'for '||cnt||' period(s)' status
    from (
      select team, dt, sgn, rn, count(1) over (partition by team, grp, sgn order by dt) cnt
      from (
        select team, dt, sgn, rn, sum(grp) over (partition by team order by dt) grp
        from (
          select team, dt, sgn, rn,  
                 case when sgn <> lag(sgn) over (partition by team order by dt) 
                 then 1 else 0 end grp
          from ( 
            select team, dt, rating, 
                   sign(rating - lag(rating) over (partition by team order by dt)) sgn, 
                   row_number() over (partition by team order by dt desc) rn 
            from teams ) t1) t2 ) t3 ) t4 
      where rn = 1
    

    so it counts only last consecutive series.

    dbfiddle demo