Consider the following toy data:
import pandas as pd
import numpy as np
from plotly import graph_objects as go
from plotly.subplots import make_subplots

np.random.seed(42)
df = pd.DataFrame(
    {
        "val1": np.random.normal(0, 1, size=100),
        "val2": np.random.normal(5, 2, size=100),
        "cat": np.random.choice(["a", "b"], size=100),
    }
)
which yields (top 5 rows):
| | val1 | val2 | cat |
|---|---|---|---|
| 0 | 0.496714 | 2.16926 | b |
| 1 | -0.138264 | 4.15871 | b |
| 2 | 0.647689 | 4.31457 | a |
| 3 | 1.52303 | 3.39545 | b |
| 4 | -0.234153 | 4.67743 | a |
My objective is to get two box plots each containing two boxes (one per category).
The following code:
fig = make_subplots(rows=2, cols=1, subplot_titles=["Value 1 dist", "Value 2 dist"])
fill_colors = {"a": "rgba(150, 25, 40, 0.5)", "b": "rgba(25, 150, 40, 0.5)"}
for i, val in enumerate(["val1", "val2"]):
    for c in df["cat"].unique():
        dff = df[df["cat"] == c]
        fig.add_trace(
            go.Box(
                y=dff[val],
                x=dff["cat"],
                boxmean="sd",
                name=c,
                showlegend=True if val=="val1" else False,
                fillcolor=fill_colors[c],
                line={"color": fill_colors[c]},
            ),
            row=i + 1,
            col=1,
        )
This brings me very close to the desired result.
Here are the things I would like to adjust:
1. How can I use the colors from Plotly's default cycle instead of hardcoded ones, so that the result is compatible with other plots? Note that I hardcoded the colors.
2. How can I control the legend without resorting to showlegend=True if val=="val1" else False?
I posted two related questions in the past (here and here), but the answers there didn't help me tune my plot the way I want.
For the colors, please refer to the official reference on how to get the color names of a standard color set. They are available as a list (px.colors.qualitative.Plotly for the default cycle).
As for controlling duplicate legend entries, I personally have no problem with your method; I use it myself and it is a common approach. If I were to handle it programmatically, though, I would collect the legend names in a set() as the traces are visited and hide the legend entry of any trace whose name is already in the set. I learned this tip from this answer.
Third, to order the boxes by category, you can set the category order of the x-axes to ascending or descending.
This reply comes from someone whose previous answers did not give you what you expected. What was unsatisfactory about them? I will respond whenever possible.
import pandas as pd
import numpy as np
from plotly import graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px

# https://plotly.com/python/discrete-color/#color-sequences-in-plotly-express
plotly_default = px.colors.qualitative.Plotly
print(plotly_default)

# df is the DataFrame defined in the question
fig = make_subplots(rows=2, cols=1, subplot_titles=["Value 1 dist", "Value 2 dist"])
# Take the first two colors of the default cycle instead of hardcoding them
fill_colors = {"a": plotly_default[0], "b": plotly_default[1]}

for i, val in enumerate(["val1", "val2"]):
    for c in df["cat"].unique():
        dff = df[df["cat"] == c]
        fig.add_trace(
            go.Box(
                y=dff[val],
                x=dff["cat"],
                boxmean="sd",
                name=c,
                showlegend=True,  # if val=="val1" else False,
                fillcolor=fill_colors[c],
                line={"color": fill_colors[c]},
                opacity=0.5,
            ),
            row=i + 1,
            col=1,
        )

# Hide the legend entry of every trace whose name has already been seen
names = set()
fig.for_each_trace(
    lambda trace: trace.update(showlegend=False)
    if (trace.name in names)
    else names.add(trace.name)
)

# Sort the categories on all x-axes alphabetically
fig.update_xaxes(categoryorder="category ascending")
fig.update_layout(legend=dict(traceorder="reversed"))
fig.show()