Search code examples
sqlpostgresqlpgadminwindow-functionsgreenplum

Rank partition by the same value


I am working with GreenPlum engine, I am using pgAdmin to query the data, I have only read permissions so I can't make a function or a procedure, even I don't know why I can't work with variables.

This is my data and the desired result:

Pais    Campaña     Representante   Actividad   Racha   **Desired value**
96      20150302    758593197           1       1           1
96      20150303    758593197           1       2           2
96      20150304    758593197           1       3           3
96      20150305    758593197           0       1           1
96      20150306    758593197           1       4           1
96      20150307    758593197           0       2           1
96      20150308    758593197           0       3           2
96      20150309    758593197           1       5           1
96      20150310    758593197           0       4           1
96      20150311    758593197           0       5           2
96      20150312    758593197           0       6           3
96      20150313    758593197           0       7           4
96      20150314    758593197           1       6           1

This is one of my attempts:

Select 
    Pais,Campaña,Representante,Actividad,
    rank() over(partition by Pais,Representante,Actividad
                order by Pais,Campaña,Representante) as Racha
From TEMP20151109
Order By Campaña;

The desired value is a count of the consecutiveness of the Actividad value, I need to reset the count when Actividad is 0.


Solution

  • The trick is to form groups of consecutive rows with the same Actividad, then you can easily compute the row number (rn):

    SELECT Pais, Campana, Representante, Actividad
         , row_number() OVER (PARTITION BY Pais, Representante, Actividad, grp
                              ORDER BY Campana) AS rn
    FROM  (
       SELECT Pais, Campana, Representante, Actividad
            , row_number() OVER (PARTITION BY Pais, Representante ORDER BY Campana)
           -  row_number() OVER (PARTITION BY Pais, Representante, Actividad
                                 ORDER BY Campana) AS grp
       FROM   tbl
       ) sub
    ORDER  BY Campana;
    

    You have to repeat columns (or expressions) in the PARTITION BY clause in the outer query in addition to grp. It's comparatively cheap to repeat the same order of rows in the outer window function and just append another column, thus building on pre-sorted data.

    Some of the columns might be dropped from the PARTITION BY clause depending on the missing table definition (which columns are unique?) and WHERE conditions.

    SQL Fiddle with extended test case.

    Very similar case:

    Related answer with detailed explanation on dba.SE:

    BTW, it never makes sense to repeat columns in the ORDER BY clause of a window function that are used in the PARTITION clause (Pais and Representante in your original query). That's just noise without effect.