hive, hiveql, top-n

Picking the latest 2 records from a table in Hive


Team, I have a scenario here: I need to pick the 2 latest records per employee using HQL.

I have tried row_number, but I do not seem to be getting the expected output.

select
    A.emp_ref_i,
    A.last_updt_d,
    A.start_date,
    case when A.last_updt_d = max(A.last_updt_d) over (partition by A.emp_ref_i)
          and A.start_date = max(A.start_date) over (partition by A.emp_ref_i)
         then 'Y' else 'N' end as Valid_f,
    A.CHANGE
from
(
    select distinct
        emp_ref_i,
        last_updt_d,
        start_date,
        CHANGE
    from PR
) A

The current output is:

    EMP_REF_I   LAST_UPDT_D   start_date   Valid_f   CHANGE
1   123         3/29/2020     2/3/2019     Y         CHG3
2   123         3/30/2019     2/4/2018     N         CHG2
3   123         3/29/2019     2/4/2018     N         CHG1

But the required output is:

    EMP_REF_I   LAST_UPDT_D   start_date   Valid_f   CHANGE
1   123         3/29/2020     2/3/2019     Y         CHG3
2   123         3/30/2019     2/4/2018     N         CHG2

Solution

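  • Use row_number() and filter on it. The max(...) over (...) expressions in the question only flag the newest row; they never remove rows, which is why all three come back. Ranking the rows per emp_ref_i and keeping rn <= 2 does the filtering: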

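    select s.emp_ref_i,
           s.last_updt_d,
           s.start_date,
           case when s.rn = 1 then 'Y' else 'N' end as Valid_f,  -- newest row per employee
           s.change
    from
    (
        select
            A.*,
            -- rank each employee's rows from newest to oldest
            row_number() over (partition by A.emp_ref_i order by A.last_updt_d desc, A.start_date desc) as rn
        from (...) A
    ) s
    where s.rn <= 2;  -- keep only the two latest rows per employee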
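
To sanity-check the ranking logic without a Hive cluster, here is a minimal sketch that runs the same query shape against an in-memory SQLite database from Python. Several things here are assumptions and not part of the original answer: SQLite (3.25 or newer, for window functions) stands in for Hive, the sample dates are rewritten as ISO yyyy-mm-dd strings so that they sort chronologically, the elided (...) subquery is filled in with the select distinct from the question, and the helper name latest_two is hypothetical.

```python
import sqlite3

# Sample rows from the question. Dates are rewritten as ISO strings
# (an assumption) so that ordering them as text is chronological.
ROWS = [
    (123, "2020-03-29", "2019-02-03", "CHG3"),
    (123, "2019-03-30", "2018-02-04", "CHG2"),
    (123, "2019-03-29", "2018-02-04", "CHG1"),
]

# Same shape as the answer's query. The inner "select distinct" is taken
# from the question, since the answer elides it as (...).
QUERY = """
select s.emp_ref_i,
       s.last_updt_d,
       s.start_date,
       case when s.rn = 1 then 'Y' else 'N' end as valid_f,
       s.change
from (
    select a.*,
           row_number() over (partition by a.emp_ref_i
                              order by a.last_updt_d desc, a.start_date desc) as rn
    from (select distinct emp_ref_i, last_updt_d, start_date, change from pr) a
) s
where s.rn <= 2
order by s.emp_ref_i, s.rn
"""


def latest_two(rows):
    """Load rows into a throwaway table named pr and return the two newest per employee."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "create table pr (emp_ref_i integer, last_updt_d text, start_date text, change text)"
    )
    conn.executemany("insert into pr values (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_two(ROWS):
        print(row)
```

With the three sample rows for employee 123, this returns only the CHG3 row (flagged Y) and the CHG2 row (flagged N); the CHG1 row is dropped because its rank is 3. If the date columns in the real table are strings in M/d/yyyy form, they would likely need converting to a date type before ordering, since text in that format does not sort chronologically.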