Search code examples
sqlhive

Rank in SQL special case


Write Hive SQL query to get the output from the input shown below.
I am trying to add rank column which is ordered by event_ts. But if there is same item_id for any consecutive rows, they should get the same rank. Partition by item_id will not work here as it will rank each partition window starting from 1. But I want the rank to monotonically increase.

NOTE: dense_rank() will not solve this

Input:

event_ts item_id
T1 AA
T2 XX
T3 CC
T4 CC
T5 CC
T6 AA

Output:

event_ts item_id rank
T1 AA 1
T2 XX 2
T3 CC 3
T4 CC 3
T5 CC 3
T6 AA 4

Solution

  • This should give you what you're looking for. It will keep the same rank if the row above has the same value, then will continue incrementally ranking.

    Select
    event_ts,
    item_id,
    sum(inc) over(order by event_ts) as rank_order
    from (
      select *,
        case when lag(item_id) over (order by event_ts) = item_id
          then 0 else 1 end as inc   
      from table1
    ) table2
    

    table1

    event_ts item_id
    T1 AA
    T2 XX
    T3 CC
    T4 CC
    T5 CC
    T6 AA

    Results

    event_ts item_id rank_order
    T1 AA 1
    T2 XX 2
    T3 CC 3
    T4 CC 3
    T5 CC 3
    T6 AA 4