Search code examples
sqlpostgresqlwindow-functions

Column based on a range in animal dataset- sql


Table:

num       value      category      name
503       8978       bird          woodpecker
502       7812       animal        
502       7812       animal        
501       7812       animal        
500       7812       animal        panther
499       7812       animal        
498       7812       animal     
467       7812       animal        elephant           

Within partition by of columns value and category, output column should be created like below:
When name is not null, output column takes up the value of name and same value to be filled within plus 2 and minus two range of num column.
Example, 500 has name not null, 500-2=498 and 500+2=502, within range of 498 to 502, output is filled with panther

Output:

  num       value      category      name             output
  503       8978       bird          woodpecker       
  502       7812       animal                         panther
  502       7812       animal                         panther
  501       7812       animal                         panther
  500       7812       animal        panther          panther
  499       7812       animal                         panther
  498       7812       animal                         panther
  467       7812       animal        elephant         elephant 

Solution

  • You can use a range window frame:

    select t.*,
           coalesce(name,
                    max(name) over (partition by category
                                    order by num
                                    range between 2 preceding and 2 following
                                   )
                   ) as imputed_name
    from t;
    

    Here is a db<>fiddle.

    EDIT:

    The support for "preceding" and "following" is relatively recent for range window frames in Postgres. In older versions, a lateral join is perhaps the simplest method:

    select t.*,
           coalesce(t.name, t2.name) as imputed_name
    from t left join lateral
         (select t2.name
          from t t2
          where t2.category = t.category and
                t2.name is not null and
                t2.num between t.num - 2 and t.num + 2
          limit 1
         ) t2
         on 1=1
    order by num desc;
    

    Here is a db<>fiddle for this version.