python python-3.x pandas numpy broadcast

Row-wise sum of boolean arrays in a pandas DataFrame


Given the dataframe df:

import numpy as np
import pandas as pd
df = pd.DataFrame({'A': [np.array([True, False, True, False]), np.array([True, True, False, False])]})

                            A
0  [True, False, True, False]
1  [True, True, False, False]

How can I sum the arrays across rows, position by position, treating each boolean as an integer? For example:

cmd(A) = [2, 1, 1, 0]

What command can do this?


Solution

  • Another way (probably faster, since it avoids converting to DataFrames and lists):

    (df.A.values+0).sum(0)
    #[2 1 1 0]
    

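    The +0 converts each boolean array to integers (adding boolean arrays directly would act as a logical OR, so True + True would stay True), and sum(0) sums along axis 0, that is, across the rows, position by position.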
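
To make the role of the +0 concrete, here is a minimal sketch that rebuilds the question's DataFrame and compares three things: adding the boolean arrays directly, the +0 approach from the answer, and an alternative based on np.stack. The np.stack variant and the variable names bool_sum, int_sum and stacked_sum are illustrative assumptions, not part of the original answer, and the sketch assumes every array in column A has the same length.

```python
import numpy as np
import pandas as pd

# Same data as in the question: one column holding a boolean array per row.
df = pd.DataFrame({'A': [np.array([True, False, True, False]),
                         np.array([True, True, False, False])]})

# Adding boolean arrays directly behaves like a logical OR,
# so positions where both rows are True do not count as 2.
bool_sum = df['A'].iloc[0] + df['A'].iloc[1]

# The answer's approach: +0 turns each boolean array into integers
# before the arrays are added together along axis 0.
int_sum = (df['A'].values + 0).sum(0)

# Alternative sketch (an assumption, not from the answer): stack the rows
# into a 2-D boolean array, then sum down axis 0. Summing booleans with
# .sum() counts the True values per position.
stacked_sum = np.stack(df['A'].tolist()).sum(axis=0)

print(bool_sum)
print(int_sum)
print(stacked_sum)
```

The stacked version only works when all arrays share one length, because np.stack needs a rectangular result. The +0 version has the same requirement in practice, since arrays of different lengths cannot be added position by position.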