How does one create a cross tab table?


I would like to count unique combinations in two Polars columns.

In R

df <- data.frame(a = c(2,0,1,0,0,0), b = c(1,1,1,0,0,1))
table(df) 
    0 1
  0 2 2
  1 0 1
  2 0 1

In Pandas

import numpy as np
import pandas as pd
a = np.array([2,0,1,0,0,0])
b = np.array([1,1,1,0,0,1])
pd.crosstab(a, b)
    0   1       
0   2   2
1   0   1
2   0   1

In Polars

Is this the proper way?

import polars as pl

df = pl.DataFrame(
    {
        "a": [2,0,1,0,0,0],
        "b": [1,1,1,0,0,1]
    }
)
df.pivot(on="a", index="b", values="a", aggregate_function="len").fill_null(0)

Solution

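  • I think you want to swap "a" and "b" in your pivot: use "a" as the index (the rows) and "b" for on (the columns). You can also pass sort_columns=True and add a .sort("a") at the end so the columns and rows come out in the same order as the R and Pandas output.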

    df.pivot(on='b',index='a',values='b',aggregate_function='len',sort_columns=True).fill_null(0).sort('a')
    shape: (3, 3)
    ┌─────┬─────┬─────┐
    │ a   ┆ 0   ┆ 1   │
    │ --- ┆ --- ┆ --- │
    │ i64 ┆ u32 ┆ u32 │
    ╞═════╪═════╪═════╡
    │ 0   ┆ 2   ┆ 2   │
    │ 1   ┆ 0   ┆ 1   │
    │ 2   ┆ 0   ┆ 1   │
    └─────┴─────┴─────┘