How to calculate horizontal median


How can I calculate a horizontal (row-wise) median across the numerical columns?

import polars as pl

df = pl.DataFrame({"ABC": ["foo", "bar", "foo"], "A": [1, 2, 3], "B": [2, 1, None], "C": [1, 2, 3]})
print(df)

shape: (3, 4)
┌─────┬─────┬──────┬─────┐
│ ABC ┆ A   ┆ B    ┆ C   │
│ --- ┆ --- ┆ ---  ┆ --- │
│ str ┆ i64 ┆ i64  ┆ i64 │
╞═════╪═════╪══════╪═════╡
│ foo ┆ 1   ┆ 2    ┆ 1   │
│ bar ┆ 2   ┆ 1    ┆ 2   │
│ foo ┆ 3   ┆ null ┆ 3   │
└─────┴─────┴──────┴─────┘

I want to achieve the same as with pl.mean_horizontal below, but get the median instead of the mean. I did not find an existing expression for this.

print(df.with_columns(pl.mean_horizontal(pl.col(pl.Int64)).alias("Horizontal Mean")))

shape: (3, 5)
┌─────┬─────┬──────┬─────┬─────────────────┐
│ ABC ┆ A   ┆ B    ┆ C   ┆ Horizontal Mean │
│ --- ┆ --- ┆ ---  ┆ --- ┆ ---             │
│ str ┆ i64 ┆ i64  ┆ i64 ┆ f64             │
╞═════╪═════╪══════╪═════╪═════════════════╡
│ foo ┆ 1   ┆ 2    ┆ 1   ┆ 1.333333        │
│ bar ┆ 2   ┆ 1    ┆ 2   ┆ 1.666667        │
│ foo ┆ 3   ┆ null ┆ 3   ┆ 3.0             │
└─────┴─────┴──────┴─────┴─────────────────┘

Solution

  • There's no median_horizontal() at the moment, but you can concatenate the columns into a list column and take the median of each list:

    df.with_columns(
        pl.concat_list(pl.col(pl.Int64)).list.median().alias("Horizontal Median")
    )
    
    shape: (3, 5)
    ┌─────┬─────┬──────┬─────┬───────────────────┐
    │ ABC ┆ A   ┆ B    ┆ C   ┆ Horizontal Median │
    │ --- ┆ --- ┆ ---  ┆ --- ┆ ---               │
    │ str ┆ i64 ┆ i64  ┆ i64 ┆ f64               │
    ╞═════╪═════╪══════╪═════╪═══════════════════╡
    │ foo ┆ 1   ┆ 2    ┆ 1   ┆ 1.0               │
    │ bar ┆ 2   ┆ 1    ┆ 2   ┆ 2.0               │
    │ foo ┆ 3   ┆ null ┆ 3   ┆ 3.0               │
    └─────┴─────┴──────┴─────┴───────────────────┘
    

    Or you can use the NumPy integration (though this will probably be slower):

    import numpy as np
    
    df.with_columns(
        pl.Series("Horizontal Median", np.nanmedian(df.select(pl.col(pl.Int64)), axis=1))
    )
    
    shape: (3, 5)
    ┌─────┬─────┬──────┬─────┬───────────────────┐
    │ ABC ┆ A   ┆ B    ┆ C   ┆ Horizontal Median │
    │ --- ┆ --- ┆ ---  ┆ --- ┆ ---               │
    │ str ┆ i64 ┆ i64  ┆ i64 ┆ f64               │
    ╞═════╪═════╪══════╪═════╪═══════════════════╡
    │ foo ┆ 1   ┆ 2    ┆ 1   ┆ 1.0               │
    │ bar ┆ 2   ┆ 1    ┆ 2   ┆ 2.0               │
    │ foo ┆ 3   ┆ null ┆ 3   ┆ 3.0               │
    └─────┴─────┴──────┴─────┴───────────────────┘