Search code examples
pythondataframecounter

How to add counter columns to get the frequency of items in dataframe


i have such dataframe:

product1 product2 product3 product4 product5 product6
straws orange melon chair bread milk
melon milk book coffee cake tea
bread bananas juice chair book straws

I wish to add 6 counting items columns along the index and counts per rows the number of times they appear in the dataframe.

CountProduct1 CountProduct2 CountProduct3 Countproduct4 Countproduct5
1 1 1 1 1
2 2 1 1 1
2 1 1 2 1

Thanks for your help.


Solution

  • Given the modified explanation, here is a way to do it.

    1. cumulative counts

    cnt = df.apply(lambda g: g.value_counts(), axis=1).fillna(0).astype(int).cumsum()
    >>> cnt
       bananas  book  bread  cake  chair  coffee  juice  melon  milk  orange  straws  tea
    0  0        0     1      0     1      0       0      1      1     1       1       0  
    1  0        1     1      1     1      1       0      2      2     1       1       1  
    2  1        2     2      1     2      1       1      2      2     1       2       1  
    

    2. replace each item by the corresponding count, row by row

    out = pd.DataFrame([
        r.map(c) for (_, r), (_, c) in zip(df.T.items(), cnt.T.items())
    ]).set_axis([f'Count{k}' for k in df.columns], axis=1)
    
    >>> out
       Countproduct1  Countproduct2  Countproduct3  Countproduct4  Countproduct5  Countproduct6
    0  1              1              1              1              1              1            
    1  2              2              1              1              1              1            
    2  2              1              1              2              2              2