python, python-3.x, dataframe, count, python-polars

Compute the ratio of the number of rows where A=True to the number of rows where A=False


I have a Polars dataframe:

import numpy as np
import polars as pl

df = pl.DataFrame(
    {
        "nrs": [1, 2, 3, None, 5],
        "names": ["foo", "ham", "spam", "egg", None],
        "random": np.random.rand(5),
        "A": [True, True, False, False, False],
    }
)

How can I compute the ratio of the number of rows where A==True to the number of rows where A==False? Note that A is always True or False. I found a solution, but it seems a bit clunky:

ntrue = df.filter(pl.col('A')==1).shape[0]
ratio = ntrue/(df.shape[0]-ntrue)

Solution

  • You can leverage Polars' expression API as follows:

    df.select(pl.col("A").sum() / pl.col("A").not_().sum()).item()
    

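    Summing works because A is a boolean column: each True counts as 1 and each False as 0, so the sum is the number of True rows. If your column is not boolean, replace pl.col("A") with a boolean expression that encodes the condition you want to count.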