
How to create HeatMap from Pandas dataframe with annotated values in percentage format and divisions between each cell?


I need to create a heatmap from a pandas DataFrame using plotly. Each cell should be annotated with its value in percentage format, including the '%' symbol, and there should be visible divisions between the cells.

Day         Time      PizzaConsumptionFraction
1st March   00:00:00  0.7
1st March   01:00:00  0.6
...         ...       ...
1st March   23:00:00  0.6
2nd March   00:00:00  1
...         ...       ...
2nd March   23:00:00  0.8

  1. How can I create a heatmap from the above data using plotly? I can create heatmaps from pivot tables, as shown below, but not directly from a long-format dataframe like this one:

    kptable1 = kdf.pivot_table(index='bowling_team', columns='over', values='batsman', aggfunc='count', fill_value=0) / len(delivery)
    line_fig = px.imshow(kptable1, text_auto=".2%", aspect="equal")

  2. I need to show the annotated values with a % symbol. In plotly express that is done with text_auto:

    px.imshow(kptable1, text_auto=".3%", aspect="equal")

But how do I make divisions between the cells in plotly express?

  3. If I use plotly graph_objects and make the divisions between cells by setting xgap=4, ygap=4, how do I annotate the values in percentage format?

I need to satisfy all the conditions and use either plotly express or plotly graph_objects.

Added:

Example code that uses xgap and ygap to show some kind of demarcation between the cells of the heatmap:

import plotly.graph_objects as go

# events, days and hours are assumed to be defined elsewhere
data = [go.Heatmap(
  z = events,
  y = days,
  x = hours,
  xgap = 5,
  ygap = 5,
  colorscale = 'Viridis'
)]

layout = go.Layout(
  title = 'Events per weekday & time of day',
  xaxis = dict(
    tickmode = 'linear'
  )
)

fig = go.Figure(data=data, layout=layout)

Solution

  • I think the following solutions solve your problem.

    Solution #1

    Here we use both plotly.express and plotly.graph_objects.

    import pandas as pd
    import plotly.express as px
    import plotly.graph_objects as go
    
    df = pd.DataFrame([["1st March", "00:00:00", 0.77],
                       ["1st March", "01:00:00", 0.6],
                       ["1st March", "23:00:00", 0.6],
                       ["2nd March", "00:00:00", 1],
                       ["2nd March", "01:00:00", 0.4],
                       ["2nd March", "23:00:00", 0.8]],
                      columns=["Day", "Time", "PCF"])
    
    # pivot_table(sort=False) keeps the days in their original order
    df = pd.pivot_table(df, values='PCF', index='Day', columns=['Time'], sort=False).fillna(0)
    # df = df.pivot(index='Day', columns='Time')['PCF'].fillna(0)  # alternative, but sorts by Day
    
    fig = px.imshow(df, x=df.columns, y=df.index, text_auto=True)
    fig = go.Figure(data=fig.data, layout=fig.layout)
    # "%{z:.0%}" formats each fraction as a percentage (0.77 -> 77%); xgap/ygap separate the cells
    fig = fig.update_traces(texttemplate="%{z:.0%}", hovertemplate=None, xgap=5, ygap=5)
    fig.show()
    

    In the solution above, the commented-out df.pivot() call sorts the result by Day automatically and has no parameter to turn that off. Use pd.pivot_table() with sort=False instead (available from pandas 1.3) to keep the original row order.
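
    The ordering difference is easiest to see on data whose days are not already in alphabetical order. The sketch below uses made-up sample rows (the names records, long_df, sorted_wide and unsorted_wide are only illustrative) and assumes pandas 1.3 or newer for the sort parameter.

```python
import pandas as pd

# Hypothetical sample data: "2nd March" deliberately appears before "1st March".
records = [
    ["2nd March", "00:00:00", 1.0],
    ["2nd March", "01:00:00", 0.4],
    ["1st March", "00:00:00", 0.77],
    ["1st March", "01:00:00", 0.6],
]
long_df = pd.DataFrame(records, columns=["Day", "Time", "PCF"])

# DataFrame.pivot() returns the row labels in sorted order.
sorted_wide = long_df.pivot(index="Day", columns="Time", values="PCF")

# pivot_table(sort=False) keeps the rows in order of first appearance.
unsorted_wide = pd.pivot_table(
    long_df, values="PCF", index="Day", columns="Time", sort=False
)

print(list(sorted_wide.index))    # days in alphabetical order
print(list(unsorted_wide.index))  # days in the order they appear in long_df
```

    Either result can be passed to px.imshow() or go.Heatmap(); only the order of the rows on the y axis changes.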

    Output 1: Heatmap 1

    Solution #2

    An alternative, if you want to use only the graph_objects library.

    import pandas as pd
    import plotly.graph_objects as go
    
    # Some test data
    df = pd.DataFrame([["1st March", "00:00:00", 0.77],
                       ["1st March", "01:00:00", 0.6],
                       ["1st March", "23:00:00", 0.6],
                       ["2nd March", "00:00:00", 1],
                       ["2nd March", "01:00:00", 0.4],
                       ["2nd March", "23:00:00", 0.8]],
                      columns=["Day", "Time", "PCF"])
    
    # Reshape to a Day x Time grid so that z is a 2D array.
    # Use pd.pivot_table() with sort=False if you don't want to sort
    df = df.pivot(index='Day', columns='Time')['PCF'].fillna(0)
    
    data = [go.Heatmap(
      z = df.values,
      y = df.index,
      x = df.columns,
      xgap = 5,
      ygap = 5,
      colorscale = 'Viridis',
      texttemplate = "%{z:.0%}"  # show each fraction as a percentage, e.g. 77%
    )]
    
    layout = go.Layout(
      title = 'Events per weekday & time of day',
      xaxis = dict(
        tickmode = 'linear'
      )
    )
    
    fig = go.Figure(data=data, layout=layout)
    fig.show()
    

    Output 2: Heatmap 2

    Updated: Solution using only plotly.express

    Here the dataframe is not sorted, as you asked. The row order in the output heatmap confirms it.

    import pandas as pd
    import plotly.express as px
    
    df = pd.DataFrame([["1st March", "00:00:00", 0.77],
                       ["1st March", "01:00:00", 0.6],
                       ["10th March", "23:00:00", 0.6],
                       ["11th March", "00:00:00", 1],
                       ["2nd March", "01:00:00", 0.4],
                       ["22nd March", "23:00:00", 0.8]],
                      columns=["Day", "Time", "PCF"])
    
    df = pd.pivot_table(df, values='PCF', index='Day', columns=['Time'], sort=False).fillna(0)
    fig = px.imshow(df, x=df.columns, y=df.index, text_auto=True, title="Events per weekday & time of day")
    # "%{z:.0%}" formats each fraction as a percentage (0.77 -> 77%); xgap/ygap separate the cells
    fig = fig.update_traces(texttemplate="%{z:.0%}", hovertemplate=None, xgap=5, ygap=5)
    fig.show()
    

    Output 3: Heatmap 3