python, pandas, numpy, statistics, crosstab

Crosstab with the same variables in rows and columns


I have the following dataframe:

       A      B     C
0   True  False  True
1  False   True  True
2   True   True  True
3   True  False  True

I want to count, for each pair of columns among 'A', 'B' and 'C', the rows in which both are True. For example, 'A' and 'C' are both True in the first, third and fourth rows, so the count for that pair is 3.

Expected output:

   A  B  C
A  3  1  3
B  1  2  2
C  3  2  4

I have no idea how to achieve this with pandas. Can you also tell me whether this kind of crosstab has a special name?


Solution

  • Adding to @Andy L.'s answer, you don't have to convert the dataframe to a NumPy array:

    df = df.astype(int)  # True/False -> 1/0
    res = df.T @ df      # entry (i, j) = number of rows where columns i and j are both True
    

    Outputs:

       A  B  C
    A  3  1  3
    B  1  2  2
    C  3  2  4
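
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the dataframe from the question by hand, so the construction step and the name `counts` are my own additions rather than part of the original answer. The product works because each entry of `counts.T @ counts` sums the row-wise products of two 0/1 columns, which is 1 only where both are True. As far as I know, a table like this is usually called a co-occurrence matrix, and the diagonal simply holds the number of True values per column.

```python
import pandas as pd

# Rebuild the example frame from the question (typed in by hand).
df = pd.DataFrame(
    {
        "A": [True, False, True, True],
        "B": [False, True, True, False],
        "C": [True, True, True, True],
    }
)

# Cast booleans to 0/1 so the matrix product sums integers.
counts = df.astype(int)

# Entry (i, j) is the sum over rows of counts[i] * counts[j],
# i.e. the number of rows where columns i and j are both True.
res = counts.T @ counts
print(res)
```

Keeping the integer copy in a separate variable leaves the original boolean `df` untouched, which is handy if you still need it afterwards.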