visualization, altair, vega-lite

Consolidating multiple color legends in layered and faceted Altair plots


In Altair, if I layer two plots that use color scales on different attributes and then facet the result, I end up with multiple copies of the color legends. With just one color scale, the legends are correctly resolved into a single one, but once I add a second, orthogonal scale (e.g., plotting one set of points with scale A and a separate set of lines with scale B), the legends are duplicated for every facet.

Here's a reproducible example:

import pandas as pd
import numpy as np
import altair as alt
np.random.seed(0)

## simulate some timeseries data
# date
data = pd.DataFrame({
    "date": pd.date_range("2024-01-01", "2024-04-30"),
})
# true value (just some weekday effects, log-linear growth, and noise)
data["y"] = (
    np.log(np.arange(data.shape[0])+1)
    + np.random.normal(scale=3, size=7)[data["date"].dt.weekday]
    + np.random.normal(size=data.shape[0])
)
# coverage of the data (0 to 1, skewed towards 1)
data["coverage"] = np.random.beta(3, 1, size=data.shape[0])
# observed value (truth x coverage)
data["y_obs"] = data["y"] * data["coverage"]


## fake some simple model fits
# model A: 7-day rolling quantile
model_a = pd.DataFrame({
    "model": "A",
    "date": data["date"],
    "median": data["y"].rolling(7).median(),
    "p90": data["y"].rolling(7).quantile(.9)
})
# model B: 14-day rolling quantile
model_b = pd.DataFrame({
    "model": "B",
    "date": data["date"],
    "median": data["y"].rolling(14).median(),
    "p90": data["y"].rolling(14).quantile(.9)
})
# combine models
models = pd.concat([model_a, model_b]).melt(id_vars=["model", "date"], var_name="estimate")

## create the chart
# combine all data so one source can be used for the layered chart and faceted by month
source = pd.concat([models, data])
source["month"] = source["date"].dt.month
# overall chart
chart = alt.Chart(
    source,
    title = "Model Fits"
)
# predictions
lines = (
    chart
    .transform_filter(alt.datum.model) # filter to model results
    .mark_line(opacity = .7)
    .encode(
        x = alt.X("date:T"),
        y = alt.Y("value:Q"),
        color = alt.Color("model"),
        strokeDash = alt.StrokeDash("estimate"),
    )
)
# data
points = (
    chart
    .transform_filter(alt.datum.y_obs) # filter to data
    .mark_point(filled = True)
    .encode(
        x = alt.X("date:T"),
        y = alt.Y("y_obs:Q").title("value"),
        color = alt.Color("coverage:Q").scale(domain=[0,1], scheme="spectral"),
    )
)
# combine them and facet
alt.layer(lines, points).facet("month", columns = 2).resolve_scale(x="independent")

which results in this: [image: example of duplicated Altair legends]

The estimates (strokeDash) legend shows up just once, but coverage and model are repeated. If I remove either of them from the encodings, the duplication is resolved.

Is there any way to consolidate these legends?


Solution

  • In this specific case, if you replace color = alt.Color("model") with stroke = alt.Stroke("model") in the lines layer, you get a single legend. [image: plot]