How to find 'not null' data in polars

How can I find the rightmost non-null value in each row?

import polars as pl

data_df = pl.from_repr("""
┌──────┬──────┬──────┬──────┐
│ d1   ┆ d2   ┆ d3   ┆ d4   │
│ ---  ┆ ---  ┆ ---  ┆ ---  │
│ i64  ┆ i64  ┆ i64  ┆ i64  │
╞══════╪══════╪══════╪══════╡
│ 20   ┆ 37   ┆ 48   ┆ 50   │
│ 31   ┆ 15   ┆ 4    ┆ null │
│ 56   ┆ 27   ┆ null ┆ null │
│ 44   ┆ 36   ┆ 88   ┆ 9    │
│ 10   ┆ null ┆ null ┆ null │
│ null ┆ null ┆ null ┆ null │
└──────┴──────┴──────┴──────┘
""")

Desired result:

┌──────┬──────┬──────┬──────┬────────┐
│ d1   ┆ d2   ┆ d3   ┆ d4   ┆ result │
│ ---  ┆ ---  ┆ ---  ┆ ---  ┆ ---    │
│ i64  ┆ i64  ┆ i64  ┆ i64  ┆ i64    │
╞══════╪══════╪══════╪══════╪════════╡
│ 20   ┆ 37   ┆ 48   ┆ 50   ┆ 50     │
│ 31   ┆ 15   ┆ 4    ┆ null ┆ 4      │
│ 56   ┆ 27   ┆ null ┆ null ┆ 27     │
│ 44   ┆ 36   ┆ 88   ┆ 9    ┆ 9      │
│ 10   ┆ null ┆ null ┆ null ┆ 10     │
│ null ┆ null ┆ null ┆ null ┆ null   │
└──────┴──────┴──────┴──────┴────────┘

Solution

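pl.coalesce returns the first non-null value across the expressions it is given, evaluated row by row.

Pass the columns in reversed order and it returns the last (rightmost) non-null value instead. Here df is the frame from the question (named data_df there).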

df.with_columns(result = pl.coalesce(reversed(df.columns)))

shape: (6, 5)
┌──────┬──────┬──────┬──────┬────────┐
│ d1   ┆ d2   ┆ d3   ┆ d4   ┆ result │
│ ---  ┆ ---  ┆ ---  ┆ ---  ┆ ---    │
│ i64  ┆ i64  ┆ i64  ┆ i64  ┆ i64    │
╞══════╪══════╪══════╪══════╪════════╡
│ 20   ┆ 37   ┆ 48   ┆ 50   ┆ 50     │
│ 31   ┆ 15   ┆ 4    ┆ null ┆ 4      │
│ 56   ┆ 27   ┆ null ┆ null ┆ 27     │
│ 44   ┆ 36   ┆ 88   ┆ 9    ┆ 9      │
│ 10   ┆ null ┆ null ┆ null ┆ 10     │
│ null ┆ null ┆ null ┆ null ┆ null   │
└──────┴──────┴──────┴──────┴────────┘

You can also implement it "manually": gather the columns into a list, drop the nulls, and take the last element.

df.with_columns( 
   pl.concat_list(pl.all())
     .list.drop_nulls()
     .list.last()
     .alias("result")
)

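Another option is pl.reduce, which folds the columns from left to right. At each step, right.fill_null(left) keeps the newer value unless it is null, in which case it falls back to the value accumulated so far.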

df.with_columns(
   pl.reduce(lambda left, right: right.fill_null(left), pl.all())
     .alias("result")
)