python plotly data-visualization plotly-python

Plotly box plot with multiple categories


Consider the following toy data:

import pandas as pd
import numpy as np
from plotly import graph_objects as go
from plotly.subplots import make_subplots

np.random.seed(42)

df = pd.DataFrame(
    {
        "val1": np.random.normal(0, 1, size=100),
        "val2": np.random.normal(5, 2, size=100),
        "cat": np.random.choice(["a", "b"], size=100),
    }
)

which yields (top 5 rows):

       val1     val2 cat
0  0.496714  2.16926   b
1 -0.138264  4.15871   b
2  0.647689  4.31457   a
3  1.52303   3.39545   b
4 -0.234153  4.67743   a

My objective is to get two box plots, each containing two boxes (one per category).

The following code:

fig = make_subplots(rows=2, cols=1, subplot_titles=["Value 1 dist", "Value 2 dist"])

fill_colors = {"a": "rgba(150, 25, 40, 0.5)", "b": "rgba(25, 150, 40, 0.5)"}

for i, val in enumerate(["val1", "val2"]):
    for c in df["cat"].unique():
        dff = df[df["cat"] == c]
        fig.add_trace(
            go.Box(
                y=dff[val],
                x=dff["cat"],
                boxmean="sd",
                name=c,
                showlegend=True if val=="val1" else False,
                fillcolor=fill_colors[c],
                line={"color": fill_colors[c]},
            ),
            row=i + 1,
            col=1,
        )

brings me very close:

[Image: initial result]

Here are the things I would like to adjust:

  1. How do I get, programmatically, the first 2 (or n) colors of Plotly's default color cycle, so that the result is consistent with other plots? Note that I hardcoded the colors.
  2. The legend on the left: is there a more programmatic way to get only a single legend entry per category? Note that I used showlegend=True if val=="val1" else False.
  3. Bonus: how can I control the order of the boxes (i.e. which category comes first)?

I have posted two related questions in the past (here and here), but the answers there didn't help me tune my plot the way I want.


Solution

    1. See the official reference on discrete color sequences for the standard color sets. Each set is available as a list of colors, for example px.colors.qualitative.Plotly for the default sequence.

    2. As for avoiding duplicate legend entries, I have no problem with your method; I use it myself and it is a common approach. To handle it programmatically, I would collect the trace names already seen in a set() and hide the legend entry of any trace whose name is already in it. I learned this tip from this answer.

    3. To control the order of the boxes, set the category order of the x-axis; you can specify ascending or descending order by category.
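For point 1, a minimal sketch of mapping the first n palette colors onto categories while keeping the semi-transparent fill from the question. The helper names `hex_to_rgba` and `assign_colors` are hypothetical (they are not part of Plotly), and `sample_palette` is stand-in input so the snippet runs without Plotly installed.

```python
from itertools import cycle, islice


def hex_to_rgba(hex_color, alpha):
    """Convert a '#RRGGBB' string into an 'rgba(r, g, b, alpha)' string."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[k:k + 2], 16) for k in (0, 2, 4))
    return f"rgba({r}, {g}, {b}, {alpha})"


def assign_colors(categories, palette, alpha=None):
    """Map each category to the next palette color, wrapping around if needed."""
    picked = islice(cycle(palette), len(categories))
    return {
        cat: (color if alpha is None else hex_to_rgba(color, alpha))
        for cat, color in zip(categories, picked)
    }


# Stand-in palette (assumption for illustration): a list of hex strings.
sample_palette = ["#636EFA", "#EF553B", "#00CC96"]

fill_colors = assign_colors(["a", "b"], sample_palette, alpha=0.5)
line_colors = assign_colors(["a", "b"], sample_palette)
print(fill_colors)
print(line_colors)
```

In the plotting code, `palette` would be `px.colors.qualitative.Plotly` rather than the stand-in list. Passing an rgba string as `fillcolor` keeps the box outline fully opaque, whereas the trace-level `opacity=0.5` fades the whole trace.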

    You mention that the earlier answers did not give you what you expected. What was unsatisfactory about my previous answers? I will respond whenever possible.

    import pandas as pd
    import numpy as np
    from plotly import graph_objects as go
    from plotly.subplots import make_subplots
    import plotly.express as px
    
    # Reuses the toy DataFrame `df` defined in the question.
    # Default qualitative color sequence, see
    # https://plotly.com/python/discrete-color/#color-sequences-in-plotly-express
    plotly_default = px.colors.qualitative.Plotly
    print(plotly_default)
    
    fig = make_subplots(rows=2, cols=1, subplot_titles=["Value 1 dist", "Value 2 dist"])
    
    fill_colors = {"a": plotly_default[0], "b": plotly_default[1]}
    
    for i, val in enumerate(["val1", "val2"]):
        for c in df["cat"].unique():
            dff = df[df["cat"] == c]
            fig.add_trace(
                go.Box(
                    y=dff[val],
                    x=dff["cat"],
                    boxmean="sd",
                    name=c,
                    showlegend=True, # if val=="val1" else False,
                    fillcolor=fill_colors[c],
                    line={"color": fill_colors[c]},
                    opacity=0.5
                ),
                row=i + 1,
                col=1,
            )
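    # Hide the legend entry of every trace whose name has already been seen,
    # so that each category appears in the legend only once.
    names = set()
    fig.for_each_trace(
        lambda trace:
            trace.update(showlegend=False)
            if (trace.name in names) else names.add(trace.name))
    
    # Order the boxes alphabetically by category on every x-axis.
    fig.update_xaxes(categoryorder='category ascending')
    # Reverse the order of the legend entries.
    fig.update_layout(legend=dict(traceorder='reversed'))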
    fig.show()
    
