Search code examples
sqlhivewindow-functions

Populate the last no null value to the next null rows until the next available non null value in Hive


I am having the input data as below and new MD_flag is the out column based on md_flag column vaue .

sellingseason|item|location|wk|md_flag|New md_flag
Q1-23|BP608WZ271812|LBDI|Mar W5-23|TRUE|TRUE
Q1-23|BP608WZ271812|LBDI|Apr W1-23|TRUE|TRUE
Q1-23|BP608WZ271812|LBDI|Apr W2-23|TRUE|TRUE
Q1-23|BP608WZ271812|LBDI|Apr W3-23|TRUE|TRUE
Q1-23|BP608WZ271812|LBDI|Apr W4-23|TRUE|TRUE
Q2-23|BP608WZ271812|LBDI|May W1-23|TRUE|TRUE
Q2-23|BP608WZ271812|LBDI|May W2-23|TRUE|TRUE
Q2-23|BP608WZ271812|LBDI|May W3-23|TRUE|TRUE
Q2-23|BP608WZ271812|LBDI|May W4-23|TRUE|TRUE
Q2-23|BP608WZ271812|LBDI|Jun W1-23|FALSE|FALSE
Q2-23|BP608WZ271812|LBDI|Jun W2-23|TRUE|TRUE
Q2-23|BP608WZ271812|LBDI|Jun W3-23|NULL|TRUE
Q2-23|BP608WZ271812|LBDI|Jun W4-23|TRUE|TRUE
Q2-23|BP608WZ271812|LBDI|Jun W5-23|NULL|TRUE
Q2-23|BP608WZ271812|LBDI|Jul W1-23|NULL|TRUE
Q2-23|BP608WZ271812|LBDI|Jul W2-23|FALSE|FALSE
Q2-23|BP608WZ271812|LBDI|Jul W3-23|NULL|FALSE
Q2-23|BP608WZ271812|LBDI|Jul W4-23|NULL|FALSE
Q3-23|BP608WZ271812|LBDI|Aug W1-23|NULL|FALSE
Q3-23|BP608WZ271812|LBDI|Aug W2-23|NULL|FALSE

The logic for New md_flag is when ever Md_flag is Null we need to populate the prior non null value until the next non null value .

I tried using the Hive windows function but not able to achieve the required logic .Can some one please help me with the logic .

I see a similar post in the community but correct answer is not provided . How to populate null column values w/ next non-null value


Solution

  • You can use WINDOW function last_value with another argument inside it as TRUE to ignore nulls as follows:

    select t.*,
           coalesce(md_flag,
                    last_value(md_flag, true) 
                       over (partition by item, location order by wk) 
                   )   as  md_flag
    from your_table t