I would like to count the unique combinations of values in two Polars columns. In R, I would do:
df <- data.frame(a = c(2,0,1,0,0,0), b = c(1,1,1,0,0,1))
table(df)
    0 1
  0 2 2
  1 0 1
  2 0 1
And in pandas:
import numpy as np
import pandas as pd
a = np.array([2,0,1,0,0,0])
b = np.array([1,1,1,0,0,1])
pd.crosstab(a, b)
    0 1
  0 2 2
  1 0 1
  2 0 1
Is this the proper way?
import polars as pl

df = pl.DataFrame(
    {
        "a": [2, 0, 1, 0, 0, 0],
        "b": [1, 1, 1, 0, 0, 1],
    }
)
df.pivot(on="a", index="b", values="a", aggregate_function="len").fill_null(0)
I think you want to swap "a" and "b" in your pivot. You can also use the sort_columns parameter, along with a .sort at the end, to get the same output as R and pandas:
df.pivot(on='b',index='a',values='b',aggregate_function='len',sort_columns=True).fill_null(0).sort('a')
shape: (3, 3)
┌─────┬─────┬─────┐
│ a   ┆ 0   ┆ 1   │
│ --- ┆ --- ┆ --- │
│ i64 ┆ u32 ┆ u32 │
╞═════╪═════╪═════╡
│ 0   ┆ 2   ┆ 2   │
│ 1   ┆ 0   ┆ 1   │
│ 2   ┆ 0   ┆ 1   │
└─────┴─────┴─────┘
└─────┴─────┴─────┘
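To make it clearer what the pivot is computing, here is a rough plain-Python sketch of the same counting logic. It is only an illustration, not how Polars works internally; the names pair_counts, row_keys, col_keys and table are made up for this example.

```python
from collections import Counter

a = [2, 0, 1, 0, 0, 0]
b = [1, 1, 1, 0, 0, 1]

# Count how often each (a, b) pair occurs; this plays the role of the "len" aggregation.
pair_counts = Counter(zip(a, b))

# Sorted distinct values stand in for sort_columns=True and the final .sort("a").
row_keys = sorted(set(a))
col_keys = sorted(set(b))

# Pairs that never occur are absent from the Counter and read back as 0,
# which mirrors what .fill_null(0) does after the pivot.
table = {r: [pair_counts[(r, c)] for c in col_keys] for r in row_keys}

for r, counts in table.items():
    print(r, counts)
# 0 [2, 2]
# 1 [0, 1]
# 2 [0, 1]
```

The rows match the pivot output above: one row per distinct value of "a", one count per distinct value of "b", with zeros where a combination does not appear in the data.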