Search code examples
pythonpandasplotlyheatmappandas-melt

How to transform pandas df containing boolean columns into a heatmap compatible table?


I have a table something like this:

A B C
1 1 1 1
2 0 1 1
3 0 0 1

Where each cell is containing whether the row is connected to the specified column.

I want to transform this into a table like this:

A B C
A 1 1 1
B 1 2 2
C 1 2 3

Where each cell contains the number of rows (from the original table) that has both connected to the specific column and row name in the second table.

For example, the 3 in the second table means that there are 3 rows in the original table that is connected to the B and C columns.

The goal is to plot a heatmap from the second table using plotly.


Solution

  • code:

    import pandas as pd
    import numpy as np 
    import plotly.express as px
    
    
    
    # data for data frame
    data = {'A': {0: 1, 1: 0, 2: 0}, 
            'B': {0: 1, 1: 1, 2: 0}, 
            'C': {0: 1, 1: 1, 2: 1}}
    
    # create dataframe
    df = pd.DataFrame(data)
    
    # list for our combination values
    values = []
    
    # loop over each column combination - 
    # AA, AB, AC, BA, BB, BC, CA, CB, CC
    for row in df.columns:
        # create a list for each row value
        # first iteration [AA, AB, AC]
        # Second iteration [BA, BB, BC]
        # third iteration [CA, CB, CC]
        temp_val = []
        for col in df.columns:
            # get number of rows that are connected 
            val = sum(df[row] & df[col])
            # add to temp list
            temp_val.append(val)
        # add the row to all rows list
        values.append(temp_val)
    
    # create data frame
    heat_df = pd.DataFrame(values, index=df.columns, columns=df.columns)
    
    # plot heatmap
    fig = px.imshow(heat_df)
    fig.show()
    

    Dataframe output:

        A   B   C
    A   1   1   1
    B   1   2   2
    C   1   2   3
    

    Heatmap:

    Output