I want to use the FIRST_VALUE() function in order to have the first "updated_at" value. But what I want is to have the first value since the stage has changed.
with values as (
select 1 as deal, 2 as stage, '2020-11-10' as updated_at, '2020-11-10' as first_updated_at
union all
select 1 as deal, 2 as stage, '2020-11-11' as updated_at, '2020-11-10' as first_updated_at
union all
select 1 as deal, 3 as stage, '2020-11-12' as updated_at, '2020-11-12' as first_updated_at
union all
select 1 as deal, 4 as stage, '2020-11-13' as updated_at, '2020-11-13' as first_updated_at
union all
select 1 as deal, 4 as stage, '2020-11-14' as updated_at, '2020-11-13' as first_updated_at
union all
select 1 as deal, 2 as stage, '2020-11-15' as updated_at, '2020-11-15' as first_updated_at
union all
select 1 as deal, 2 as stage, '2020-11-16' as updated_at, '2020-11-15' as first_updated_at
)
select * from values
I am trying to use the FIRST VALUE function like this:
FIRST_VALUE(updated_at) OVER(PARTITION BY deal, stageORDER BY updated_at ASC
Am I missing something here or what I want is not possible?
Thanks in advance
You can do this using window functions. First, look at the previous stage to see if it is different from the previous row. Then use a cumulative max to take the updated_at
when this occurs:
with values as (
select 1 as deal, 2 as stage, '2020-11-10' as updated_at, '2020-11-10' as first_updated_at
union all
select 1 as deal, 2 as stage, '2020-11-11' as updated_at, '2020-11-10' as first_updated_at
union all
select 1 as deal, 3 as stage, '2020-11-12' as updated_at, '2020-11-12' as first_updated_at
union all
select 1 as deal, 4 as stage, '2020-11-13' as updated_at, '2020-11-13' as first_updated_at
union all
select 1 as deal, 4 as stage, '2020-11-14' as updated_at, '2020-11-13' as first_updated_at
union all
select 1 as deal, 2 as stage, '2020-11-15' as updated_at, '2020-11-15' as first_updated_at
union all
select 1 as deal, 2 as stage, '2020-11-16' as updated_at, '2020-11-15' as first_updated_at
)
select v.*,
max(case when stage <> prev_stage or prev_stage is null then updated_at end) over (partition by deal order by updated_at) as imputed_first_updated_at
from (select v.*,
lag(stage) over (partition by deal order by updated_at) as prev_stage
from values v
) v