python, regression, seaborn, lmplot

Change the style of each regression line in a multiple-regression plot in Python


I am trying to plot two regression lines for my data, split by a categorical attribute (the indicator, which is either the freedom score or the happiness score). The problem is that I need color to encode a separate categorical attribute in the graph (GNI/capita brackets). Mixing colors seemed confusing, so I decided to distinguish the data points with different markers instead. However, I am having trouble changing just one of the regression lines to a dashed line; at the moment both lines are drawn in the same style. I don't even want to think about how I am going to create a legend for all of this. If you think this is an ugly graph, I agree, but circumstances require that I encode four attributes in a single graph. I am also open to any suggestions for a better way to do this, if there is one. My current code and graph are below, and I would appreciate any help!

import matplotlib.pyplot as plt
import seaborn as sns

sns.lmplot(data=combined_indicators, x='x', y='y', hue='Indicator', palette=["#000620"], markers=['x', '.'], ci=None)
plt.axvspan(0, 1025, alpha=0.5, color='#de425b', zorder=-1)
plt.axvspan(1025, 4035, alpha=0.5, color='#fbb862', zorder=-1)
plt.axvspan(4035, 12475, alpha=0.5, color='#afd17c', zorder=-1)
plt.axvspan(12475, 100000, alpha=0.5, color='#00876c', zorder=-1)
plt.title("HFI & Happiness Regressed on GNI/capita")
plt.xlabel("GNI/Capita by Purchasing Power Parity (2017 International $)")
plt.ylabel("Standard Indicator Score (0-10)")

My current figure rears its ugly head


Solution

  • To my knowledge, there is no easy way to change the style of the regression line in lmplot. You can achieve your goal by using regplot instead of lmplot; the drawback is that you have to implement the hue splitting by hand:

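    import matplotlib.pyplot as plt
    import seaborn as sns

    x_col = 'total_bill'
    y_col = 'tip'
    hue_col = 'smoker'
    df = sns.load_dataset('tips')
    
    # One marker, color and line style per hue level
    markers = ['x','.']
    colors = ["#000620", "#000620"]
    linestyles = ['-','--']
    
    fig, ax = plt.subplots()
    # Draw one regplot per hue group on the same axes so each line gets its own style
    for (hue,gr),m,c,ls in zip(df.groupby(hue_col),markers,colors,linestyles):
        sns.regplot(data=gr, x=x_col, y=y_col, marker=m, color=c, line_kws={'ls':ls}, ci=None, label=f'{hue_col}={hue}', ax=ax)
    ax.legend()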
    

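Since the question also asks about a legend, here is a rough sketch of how the same regplot loop might be combined with the shaded GNI/capita bands and two separate legends. It is only one possible layout, not a definitive solution. The data below is synthetic (the real combined_indicators frame is not shown in the question), the band edges and colors are copied from the question, and the bracket labels and legend positions are my own assumptions.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from matplotlib.lines import Line2D
from matplotlib.patches import Patch

# Synthetic stand-in for combined_indicators (made-up values, for illustration only)
rng = np.random.default_rng(0)
n = 60
gni = rng.uniform(500, 60000, size=n)
df = pd.DataFrame({
    "x": np.tile(gni, 2),
    "y": np.concatenate([
        4 + 0.00005 * gni + rng.normal(0, 0.5, n),
        6 + 0.00003 * gni + rng.normal(0, 0.5, n),
    ]),
    "Indicator": ["Happiness"] * n + ["Freedom"] * n,
})

# Band edges and colors from the question; the text labels are assumed
bands = [
    (0, 1025, "#de425b", "0 to 1,025"),
    (1025, 4035, "#fbb862", "1,025 to 4,035"),
    (4035, 12475, "#afd17c", "4,035 to 12,475"),
    (12475, 100000, "#00876c", "above 12,475"),
]

fig, ax = plt.subplots()

# Shaded background bands, plus one legend patch per band
band_handles = []
for lo, hi, color, label in bands:
    ax.axvspan(lo, hi, alpha=0.5, color=color, zorder=-1)
    band_handles.append(Patch(facecolor=color, alpha=0.5, label=label))

# One regplot per indicator, each with its own marker and line style
line_color = "#000620"
line_handles = []
for (name, group), marker, ls in zip(df.groupby("Indicator"), ["x", "."], ["-", "--"]):
    sns.regplot(data=group, x="x", y="y", ax=ax, marker=marker,
                color=line_color, ci=None, line_kws={"linestyle": ls})
    # Hand-built legend entry showing both the marker and the line style
    line_handles.append(Line2D([], [], color=line_color, marker=marker,
                               linestyle=ls, label=name))

# First legend: indicators. Re-add it so the second legend does not replace it.
indicator_legend = ax.legend(handles=line_handles, title="Indicator", loc="upper left")
ax.add_artist(indicator_legend)
# Second legend: GNI/capita brackets
ax.legend(handles=band_handles, title="GNI/capita bracket", loc="lower right")

ax.set_title("HFI & Happiness Regressed on GNI/capita")
ax.set_xlabel("GNI/Capita by Purchasing Power Parity (2017 International $)")
ax.set_ylabel("Standard Indicator Score (0-10)")
```

Call plt.show() (or fig.savefig) afterwards to see the result. Building the legend entries by hand with Line2D and Patch keeps the legend independent of how regplot labels its artists, and ax.add_artist is what lets two legends coexist on one axes.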