
Rounding Numbers in the Quartile Figures of a Plotly Box Plot


I have been digging around for a while trying to figure out how to round the quartile figures shown in the hover labels of a box plot. There must be a straightforward way to do this, as there is for the x and y coordinates. In this case, rounding to two decimals would be sufficient.

import pandas as pd
import plotly.graph_objects as go

df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")

fig = go.Figure(data=go.Box(y=df['total_bill'],
                            name='total_bill',
                            boxmean=True,
                           )
               )

fig.update_layout(width=800, height=800,
                  hoverlabel=dict(bgcolor="white",
                                  font_size=16,
                                  font_family="Arial",
                                 )
                 )
fig.show()



Solution

  • Unfortunately, it looks like Plotly cannot easily do this. If you modify the hovertemplate, it only applies to the markers you hover over (the outliers); the decimals shown for each of the boxplot statistics remain unchanged. Another issue with plotly-python is that you cannot extract the boxplot statistics, because that would require interacting with the JavaScript under the hood.

    However, you can calculate the boxplot statistics yourself using the same method as Plotly and round each of them to two decimal places. You can then pass the statistics (lowerfence, q1, median, mean, q3, upperfence) to force Plotly to construct the boxplot from those values, and plot the outliers as a separate scatter trace.

    This is a pretty ugly hack because you are essentially redoing all of the calculations Plotly already does and then constructing the boxplot manually, but it does force the boxplot statistics to display with two decimal places.

    from math import floor, ceil
    from numpy import mean
    import pandas as pd
    import plotly.graph_objects as go
    
    df = pd.read_csv("https://raw.githubusercontent.com/plotly/datasets/master/tips.csv")
    
    ## calculate quartiles as outlined in the plotly documentation
    def get_percentile(data, p):
        data = sorted(data)  # sort a copy; data.sort() would reorder the caller's array
        n = len(data)
        x = n*p + 0.5
        # clamp both neighbours to valid 1-based positions
        x1, x2 = min(max(floor(x), 1), n), min(max(ceil(x), 1), n)
        if x1 == x2:  # x falls exactly on a sample, so there is nothing to interpolate
            return round(data[x1-1], 2)
        y1, y2 = data[x1-1], data[x2-1] # account for zero-indexing
        return round(y1 + ((x - x1) / (x2 - x1))*(y2 - y1), 2)
    
    ## calculate all boxplot statistics
    y = df['total_bill'].values
    q1, median, q3 = get_percentile(y, 0.25), get_percentile(y, 0.50), get_percentile(y, 0.75)
    iqr = q3 - q1
    ## fences are the most extreme data points within 1.5*IQR of the quartiles
    lowerfence = min(y0 for y0 in y if y0 >= q1 - 1.5*iqr)
    upperfence = max(y0 for y0 in y if y0 <= q3 + 1.5*iqr)
    
    ## construct the boxplot from the precomputed statistics
    fig = go.Figure(data=go.Box(
        x=["total_bill"],  # one position per precomputed box
        q1=[q1], median=[median], mean=[round(mean(y), 2)],
        q3=[q3], lowerfence=[lowerfence],
        upperfence=[upperfence], orientation='v', showlegend=False,
        )
    )
    
    ## plot the outliers on both sides of the box as a separate scatter trace
    outliers = y[(y < lowerfence) | (y > upperfence)]
    fig.add_trace(go.Scatter(
        x=["total_bill"]*len(outliers), y=outliers,
        showlegend=False, mode='markers', marker={'color': '#1f77b4'},
    ))
    
    fig.update_layout(width=800, height=800,
                      hoverlabel=dict(bgcolor="white",
                                      font_size=16,
                                      font_family="Arial",
                                     )
                     )
    
    fig.show()
    

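To check the rounded statistics before handing them to Plotly, the calculation described above can be pulled out into a small pure-Python helper. This is only a sketch: the names `percentile_linear` and `rounded_box_stats` are hypothetical, not part of Plotly, and it assumes the same interpolation rule as the answer (position `n*p + 0.5`). Unlike the code above, it derives the fences from the unrounded quartiles and rounds only at the end.

```python
from math import ceil, floor


def percentile_linear(values, p):
    """Interpolated percentile at the 1-based position n*p + 0.5."""
    data = sorted(values)              # sorted copy, the input is left untouched
    n = len(data)
    x = n * p + 0.5                    # fractional 1-based position
    lo = min(max(floor(x), 1), n)      # clamp both neighbours to 1..n
    hi = min(max(ceil(x), 1), n)
    if lo == hi:                       # exactly on a sample (or clamped)
        return data[lo - 1]
    return data[lo - 1] + (x - lo) * (data[hi - 1] - data[lo - 1])


def rounded_box_stats(values, ndigits=2):
    """Return (rounded statistics dict, list of outliers)."""
    q1, median, q3 = (percentile_linear(values, p) for p in (0.25, 0.5, 0.75))
    iqr = q3 - q1
    low_limit, high_limit = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    # Fences are the most extreme data points still inside the limits.
    inside = [v for v in values if low_limit <= v <= high_limit]
    outliers = [v for v in values if v < low_limit or v > high_limit]
    stats = {
        "lowerfence": min(inside),
        "q1": q1,
        "median": median,
        "mean": sum(values) / len(values),
        "q3": q3,
        "upperfence": max(inside),
    }
    return {name: round(value, ndigits) for name, value in stats.items()}, outliers


# Made-up sample data, only for illustration.
sample = [10.111, 12.222, 13.333, 15.444, 18.555, 21.666, 24.777, 90.0]
stats, outliers = rounded_box_stats(sample)
print(stats)
print(outliers)
```

Each value in `stats` can then be wrapped in a one-element list and passed to `go.Box` (for example `q1=[stats["q1"]]`), with `outliers` plotted as the separate scatter trace, as in the answer's code.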