I have a table as follows:
a b c d e
0 1 0 1 0
1 0 0 0 0
I want to create a column RESULT that is a concatenation of the column names for which the row has a value of 1:
a b c d e RESULT
0 1 0 1 0 bd
1 0 0 0 0 a
What's the most efficient way of doing this with polars?
I can do this via map_elements, but I wonder if there is a more efficient way.
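The general approach is usually to use when/then and loop over .columns.
In this case you want the column name or an empty string. A when/then without an otherwise yields null where the condition is false, hence the fill_null("") in the code below.
You can pass these expressions directly to pl.concat_str() to combine the result.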
import polars as pl

df = pl.DataFrame({
    'a': [0, 1], 'b': [1, 0], 'c': [0, 0], 'd': [1, 0], 'e': [0, 0]
})

# One expression per column: its name where the value is 1, else "".
df.with_columns(
    RESULT=pl.concat_str(
        pl.when(pl.col(col) == 1).then(pl.lit(col)).fill_null("")
        for col in df.columns
    )
)
shape: (2, 6)
┌─────┬─────┬─────┬─────┬─────┬────────┐
│ a   ┆ b   ┆ c   ┆ d   ┆ e   ┆ RESULT │
│ --- ┆ --- ┆ --- ┆ --- ┆ --- ┆ ---    │
│ i64 ┆ i64 ┆ i64 ┆ i64 ┆ i64 ┆ str    │
╞═════╪═════╪═════╪═════╪═════╪════════╡
│ 0   ┆ 1   ┆ 0   ┆ 1   ┆ 0   ┆ bd     │
│ 1   ┆ 0   ┆ 0   ┆ 0   ┆ 0   ┆ a      │
└─────┴─────┴─────┴─────┴─────┴────────┘