
Set intercept to 0 on a linear regression plot using Altair in Python


I am trying to plot a linear regression in Python using Altair, and I want to force the intercept to be 0. I can't find how to do this anywhere in the documentation (apologies if I'm missing something).

Can someone please show me how to do this, if it is possible? My work so far is below; it plots the regression line with its own fitted intercept. Thanks!

import altair as alt
import pandas as pd

# create some toy data
df = pd.DataFrame({'x': [ 931000, 772648, 635000, 510572, 509000, 496317, 453133, 441072, 404194, 380000],
                'y': [3000000, 2471414, 2050000, 1183849, 1800000, 1650000, 1480000, 1459866, 1150000, 1700000]})

# create a scatter plot
chart = alt.Chart(df, width=500, height=400).mark_point().encode(
    x=alt.X('x:Q', scale=alt.Scale(zero=False)),
    y='y:Q')

# create regression line
fit = chart.transform_regression('x', 'y').mark_line(color='red')

# obtain the regression parameters
params = alt.Chart(df).transform_regression('x', 'y', params=True).mark_text(align='left').encode(
    x=alt.value(10),  # pixels from left
    y=alt.value(25),  # pixels from top
    text='params:N'
).transform_calculate(
    params='"R² = " + datum.rSquared + " : Beta = " + datum.coef[1] + " : Intercept = " + datum.coef[0]')

# plot
chart + fit + params

Solution

  • This is currently not possible in Altair because it has not been implemented in Vega yet. Altair builds on Vega-Lite, which in turn builds on Vega, so you can follow the corresponding Vega issue to see when it might be implemented, and add a compelling use case there if you think it should be prioritized higher.

    In the meantime, you would need to use a separate package such as statsmodels to compute the parameters of the regression line and then plot it using .mark_line() in Altair. See also: How to set intercept to 0 with statsmodel - for multiple linear regression