I've been trying to find a solution to this since some days ago. I have the following dataset.
|id|order|certain_event|order_of_occurrence|
|--|-----|-------------|-------------------|
|a |1 |NULL |NULL |
|a |2 |NULL |NULL |
|a |3 |NULL |NULL |
|a |4 |NULL |NULL |
|a |5 |4 |1 |
|a |6 |NULL |NULL |
|a |7 |NULL |NULL |
|a |8 |4 |2 |
|a |9 |NULL |NULL |
The desired output consists in replacing the null values from the order_of_occurrence column with the next non-null value. Like this:
|id|order|certain_event|order_of_occurrence|
|--|-----|-------------|-------------------|
|a |1 |NULL |1 |
|a |2 |NULL |1 |
|a |3 |NULL |1 |
|a |4 |NULL |1 |
|a |5 |4 |1 |
|a |6 |NULL |2 |
|a |7 |NULL |2 |
|a |8 |4 |2 |
|a |9 |NULL |NULL |
I've tried using a subquery for retrieving the non-null values from the order of occurrence column, but I get more than one value returned. Like the following:
SELECT a.*,
CASE
WHEN a.order_of_occurrence IS NOT NULL THEN a.order_of_occurence
WHEN a.order_of_occurence IS NULL THEN (SELECT B.ORDER_OF_OCCURENCE FROM dataset AS B
WHERE B.ORDER_OF_OCCURRENCE IS NOT NULL)
END AS corrected_order
FROM dataset AS a
Thanks!
This is a simple task for the IGNORE NULLS option in FIRST/LAST_VALUE:
last_value(order_of_occurrence IGNORE NULLS)
over (partition by id
order by "order" DESC
rows unbounded preceding)