Search code examples
mysqlsqlwindow-functions

How to get repeatable partition as new one after another partition?


"Partition by" in OVER clause groups all of the values as unique, just like "Distinct" or "Group by" do.

This is how it works in my query with row_number():

    id st t  row_number
    -------------------
    1  1  1  1
    1  1  2  2
    1  1  3  3
    2  1  3  1
    1  2  4  1
    1  1  10 4

This is what I want:

    id st t  uniq_row_number
    ------------------
    1  1  1  1
    1  1  2  2
    1  1  3  3
    2  1  3  1
    1  2  4  1
    1  1  10 1

No matter if new string already was before, it's read as new partition after every change of partition. If partition repeats, so uniq_row_number gets +1. If new partition comes with new string: boom, it gets uniq_row_number 1.

My SQL query:

    SELECT id, st, t,
    row_number() OVER (PARTITION BY id, st ORDER BY id, st) cat_num,
    min(t) over (PARTITION BY id, st) min_t,
    max(t) over (PARTITION BY id, st) max_t
    FROM tabl ORDER BY t;

SQL code is here: http://sqlfiddle.com/#!18/d4290/2


Solution

  • This is called a "gaps-and-islands" problem. You need to define a group for each "island" of similar values. Then you can use row_number().

    The difference of row numbers is a convenient way to define the islands:

    select t.*,
           row_number() over (partition by id, seqnum_t - seqnum_it
                              order by t
                             ) as uniq_row_number
    from (select t.*,
                 row_number() over (order by t) as seqnum_t,
                 row_number() over (partition by id order by t) as seqnum_it,
          from t
         ) t;
    

    The best way to understand how this works is to look at the results of the subquery. You should be able to see how the difference of row numbers defines the groups that you care about.