I have a Polars dataframe:
import numpy as np
import polars as pl

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "A": [True, True, False, False, False],
    }
)
How can I compute the ratio of the number of rows where A==True to the number of rows where A==False? Note that A is always True or False. I found a solution, but it seems a bit clunky:
ntrue = df.filter(pl.col('A')==1).shape[0]
ratio = ntrue/(df.shape[0]-ntrue)
You can leverage Polars' expression API as follows:
df.select(pl.col("A").sum() / pl.col("A").not_().sum()).item()
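Summing works because A is a boolean column: each True counts as 1 and each False as 0, so the sum is the number of True rows. If the column is not boolean, replace pl.col("A") with a boolean expression that encodes the condition you want to count.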