Search code examples
pythonpandasgroup-byconcatenation

How to groupby based on multiple columns in pandas?


I want to group by a dataframe based on multiple columns. For example to make this:

Country Type_1 Type_2  Type_3  Type_4  Type_5
China    A       B       C        D      E
Spain    A       A       R        B      C
Italy    B       A       B        R      R

Into this:

Country Type   Count
China   A       1
        B       1
        C       1 
        D       1
        E       1
Spain   A       2
        R       1
        B       1
        C       1
Italy   B       2  
        A       1   
        R       2

I tried to concat vertically the columns from Type_1 to Type_5, apply reset_index() and then trying to count. However i don't how to group vertically by country. Any ideas?

Thx


Solution

  • Do melt then groupby with size

    s = df.melt('Country').groupby(['Country','value']).size()
    Out[326]: 
    Country  value
    China    A        1
             B        1
             C        1
             D        1
             E        1
    Italy    A        1
             B        2
             R        2
    Spain    A        2
             B        1
             C        1
             R        1
    dtype: int64