Vega-Altair in Jupyter notebook: can't add labels to a scatter chart with a regression line, or vice versa?


I can plot a scatter chart with a regression line, and I can plot a scatter chart with a text label on each point. When I try to layer all three (points, regression line and labels) in one chart, I get:

Javascript Error: Duplicate signal name: "param_1_Horsepower" This usually means there's a typo in your chart specification. See the javascript console for the full traceback.

Does anyone know what I'm doing wrong, or know of a workaround?

# minimal example

import altair as alt
from vega_datasets import data

source = data.cars()

scatter = alt.Chart(source).mark_circle().encode(
    alt.X('Horsepower:Q').scale(zero=False),
    alt.Y('Miles_per_Gallon:Q').scale(zero=False),
    color='Origin:N',
    tooltip=['Name:N']
).interactive().properties(
    width="container")


reg = scatter.transform_regression('Horsepower', 'Miles_per_Gallon').mark_line(
    opacity=0.50,
    shape='mark'
).transform_fold(
    ["best fit line"],
    as_=["Regression", "y"]
).encode(alt.Color("Regression:N"))


labels = scatter.mark_text(align='left', baseline='middle', dx=7).encode(text='Name:N')

# scatter + labels
# scatter + reg
# labels + reg
scatter + reg + labels

Solution

  • If I make all three objects individual alt.Chart(source) instances, rather than deriving the regression line and the labels from the same alt.Chart(source) instance as the points, it works (see below for code). Note that the scatter chart below also no longer calls .interactive().

    import altair as alt
    from vega_datasets import data
    
    source = data.cars()
    
    scatter = alt.Chart(source).mark_circle().encode(
        alt.X('Horsepower:Q').scale(zero=False),
        alt.Y('Miles_per_Gallon:Q').scale(zero=False),
        color='Origin:N',
        tooltip=['Name:N']
    ).properties(
        width="container")
    
    
    # Regression line: built from its own alt.Chart(source), not derived from scatter.
    reg = alt.Chart(source).encode(
        alt.X('Horsepower:Q').scale(zero=False),
        alt.Y('Miles_per_Gallon:Q').scale(zero=False),
        color='Origin:N',
        tooltip=['Name:N']
    ).transform_regression('Horsepower', 'Miles_per_Gallon').mark_line(
        opacity=0.50,
        shape='mark'
    ).transform_fold(
        ["best fit line"],
        as_=["Regression", "y"]
    ).encode(alt.Color("Regression:N"))  # replaces the Origin color encoding set above

    # Text labels: again a separate alt.Chart(source).
    labels = alt.Chart(source).encode(
        alt.X('Horsepower:Q').scale(zero=False),
        alt.Y('Miles_per_Gallon:Q').scale(zero=False),
        color='Origin:N',
        tooltip=['Name:N']
    ).mark_text(align='left', baseline='middle', dx=7).encode(text='Name:N')

    scatter + reg + labels