Search code examples
sqlwindow-functionsimpala

ROW_NUMBER( ) OVER in impala


I have a use case where I need to use ROW_NUMBER() over PARTITION: Something like:

SELECT
  Column1 , Column 2
  ROW_NUMBER() OVER (
    PARTITION BY ACCOUNT_NUM
    ORDER BY FREQ, MAN, MODEL) as LEVEL
FROM
  TEST_TABLE

I need a workaround for this in Impala. Unfortunately Impala does not support sub queries and also does not support ROW_NUMBER() OVER functionality. Thank you for your help.


Solution

  • Impala is rather limited for this type of query. With some assumptions, this query is possible:

    • The four columns in the partitioning clause are never NULL
    • The four columns in the partitioning clause uniquely identify a row

    The query is rather ugly and expensive:

    select tt.column1, tt.column2, count(*) as level
    from test_table tt join
         test_table tt2
         on tt.account_num = tt2.account_num and
            (tt2.freq < tt.freq or
             tt2.freq = tt.freq and tt2.man < t.man or
             tt2.freq = tt.freq and tt2.man = t.man and tt2.model <= t.model
            )
    group by tt.column1, tt.column2, tt.account_num, tt.freq, tt.man, tt.model;