I would like to calculate the row-wise standard deviation across the columns 'foo' and 'bar' of a DataFrame.
I am able to compute the min, max and mean, but not the std.
import polars as pl

df = pl.DataFrame(
    {
        "foo": [1, 2, 3],
        "bar": [6, 7, 8],
        "ham": ["a", "b", "c"],
    }
)

# there are _horizontal functions for sum, min, max
df = df.with_columns(
    pl.sum_horizontal('foo', 'bar')
    .round(2)
    .alias('sum')
)
However, there is no std_horizontal function:
df = df.with_columns(
    pl.std_horizontal('foo', 'bar')
    .round(2)
    .alias('std')
)
# AttributeError: module 'polars' has no attribute 'std_horizontal'
Is there a better way to compute the standard deviation in this scenario?
Until a dedicated std_horizontal is added, another way to get a "row" or "horizontal" context is to use the List API:
df.with_columns(
    sum=pl.concat_list("foo", "bar").list.sum(),
    std=pl.concat_list("foo", "bar").list.std(),
)
shape: (3, 5)
┌─────┬─────┬─────┬─────┬──────────┐
│ foo ┆ bar ┆ ham ┆ sum ┆ std      │
│ --- ┆ --- ┆ --- ┆ --- ┆ ---      │
│ i64 ┆ i64 ┆ str ┆ i64 ┆ f64      │
╞═════╪═════╪═════╪═════╪══════════╡
│ 1   ┆ 6   ┆ a   ┆ 7   ┆ 3.535534 │
│ 2   ┆ 7   ┆ b   ┆ 9   ┆ 3.535534 │
│ 3   ┆ 8   ┆ c   ┆ 11  ┆ 3.535534 │
└─────┴─────┴─────┴─────┴──────────┘
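
As a quick sanity check on the std column, the same per-row values can be reproduced with Python's standard library. This is only a sketch using the `statistics` module rather than Polars, and the `rows` list is copied by hand from the example frame. The comparison suggests that the values shown are the sample standard deviation (dividing by n - 1), not the population one.

```python
import statistics

# Row values copied by hand from the 'foo' and 'bar' columns above.
rows = [(1, 6), (2, 7), (3, 8)]

# Sample standard deviation: divides the squared deviations by n - 1.
sample_std = [round(statistics.stdev(r), 6) for r in rows]

# Population standard deviation: divides by n, shown for comparison.
population_std = [statistics.pstdev(r) for r in rows]

print(sample_std)      # matches the std column in the table
print(population_std)  # smaller, because the divisor is n rather than n - 1
```

With only two columns per row the two definitions differ noticeably, so it is worth checking which one you need before relying on the default.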