Search code examples
sqldatabasepostgresqlrdbms

Replace first and last row having null values or missing values with previous/next available value in Postgresql12


I am a newbiew to postgresql. I want to replace my first and last row of table,T which has null or missing values, with next/previous available values. Also, if there are missing values in the middle, it should be replaced with previous available value. For example:

id   value   EXPECTED
1              1
2    1         1
3    2         2
4              2
5    3         3
6              3

I am aware that there are many similar threads, but none seems to address this problem where the start and end also have missing values (including some missing in the middle rows). Also some of the concepts such as first_row ,partition by, top 1(which does not work for postgres) are very hard to grasp as a newbie.

So far i have referred to the following threads: value from previous row and Previous available value

Could someone kindly direct me in the right direction to address this problem? Thank you


Solution

  • Unfortunately, Postgres doesn't have the ignore nulls option on lead() and lag(). In your example, you only need to borrow from the next row. So:

    select t.*,
           coalesce(value, lag(value) over (order by id), lead(value) over (order by id)) as expected
    from t;
    

    If you had multiple NULLs in a row, then this is trickier. One solution is to define "groups" based on when a value starts or stops. You can do this with a cumulative count of the values -- ascending and descending:

    select t.*,
           coalesce(value,
                    max(value) over (partition by grp_before),
                    max(value) over (partition by grp_after)
                   ) as expected
    from (select t.*,
                 count(value) over (order by id asc) as grp_before,
                 count(value) over (order by id desc) as grp_after
          from t
         ) t;
    

    Here is a db<>fiddle.