python, matplotlib, seaborn, swarmplot

Overplot seaborn regplot and swarmplot


I would like to overplot a swarmplot and regplot in seaborn, so that I can have a y=x line through my swarmplot.

Here is my code:

import matplotlib.pyplot as plt
import seaborn as sns
    
sns.regplot(x=x_data, y=y_data, marker=' ', color='k')
sns.swarmplot(x=x_data, y=y_data)

I don't get any errors when I plot, but the regplot never shows on the plot. How can I fix this?

EDIT: My regplot and swarmplot don't overlap. They are drawn in the same frame but separated by some unspecified amount in y. If I swap the calls so that regplot comes before swarmplot, the regplot doesn't show up at all.

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

df = pd.DataFrame({"x":x_data,"y":y_data} )

sns.regplot(y="y", x="x", data= df, color='k', scatter_kws={"alpha" : 0.0})
sns.swarmplot(y="y", x="x", data= df)

SECOND EDIT: The double-axis solution below works beautifully!


Solution

  • In principle, the approach of plotting a swarmplot and a regplot simultaneously works fine.

    The problem here is that you set an empty marker (marker=" "). This breaks the regplot, so that it is not drawn. Apparently this is only an issue when plotting several things on the same axes; a single regplot with an empty marker works fine.

    The solution is to leave out the marker argument and instead make the markers invisible through the scatter_kws argument: scatter_kws={"alpha": 0.0}.

    Here is a complete example:

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    import numpy as np
    
    # generate some data: m groups of n noisy points each
    n = 19
    m = 9
    y_data = []
    for i in range(m):
        a = np.random.poisson(lam=0.99 - float(i) / m, size=n) + i * 0.9 + np.random.rand(1) * 2
        a += (np.random.rand(n) - 0.5) * 2
        y_data.append(a * m)
    y_data = np.array(y_data).flatten()
    # sorted x values in 0..m-1, one per point
    x_data = np.floor(np.sort(np.random.rand(n * m)) * m)
    # put them into a DataFrame
    df = pd.DataFrame({"x": x_data, "y": y_data})

    # plotting: regplot with invisible scatter points, then the swarmplot on the same axes
    sns.regplot(y="y", x="x", data=df, color='k', scatter_kws={"alpha": 0.0})
    sns.swarmplot(x="x", y="y", data=df)
    
    plt.show()
    


    Concerning the edited part of the question:
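    Since swarmplot is a categorical plot, its x axis actually runs from -0.5 to 8.5 (one integer position per category), not from 10 to 18 as the tick labels suggest. The regplot is drawn at the real x values, so it ends up offset from the swarms. A possible workaround is to use two axes via twiny.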

    fig, ax = plt.subplots()
    ax2 = ax.twiny()  # second x axis that shares the same y axis
    sns.swarmplot(x="x", y="y", data=df, ax=ax)
    sns.regplot(y="y", x="x", data=df, color='k', scatter_kws={"alpha": 0.0}, ax=ax2)
    ax2.grid(False)  # remove grid as it overlays the other plot
    plt.show()
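
    As an alternative to the second axis, the line can also be drawn directly in the swarmplot's own coordinates. The following is only a minimal sketch and not part of the original answer: it swaps regplot for a plain np.polyfit straight-line fit (so there is no confidence band), and the helper names category_positions and draw_fit_on_categorical_axis are made up for illustration. It assumes the categories are the sorted unique x values, drawn at positions 0, 1, 2, ...; if the x values are not evenly spaced, a line fitted against positions will differ from a fit against the real values.

```python
import numpy as np


def category_positions(x):
    """Return (categories, positions).

    categories: the sorted unique x values.
    positions:  for each point, the 0-based slot of its category, which is
                assumed to be where a categorical plot draws it.
    """
    categories, positions = np.unique(np.asarray(x).ravel(), return_inverse=True)
    return categories, positions


def draw_fit_on_categorical_axis(ax, x, y, **line_kws):
    """Fit y against category positions with a degree-1 polynomial and draw
    the resulting straight line on ax. Returns (slope, intercept)."""
    categories, positions = category_positions(x)
    y = np.asarray(y, dtype=float).ravel()
    # least-squares straight line in position coordinates (not real x values)
    slope, intercept = np.polyfit(positions, y, deg=1)
    slots = np.arange(len(categories))
    line_kws.setdefault("color", "k")  # black line unless the caller overrides it
    ax.plot(slots, slope * slots + intercept, **line_kws)
    return slope, intercept


# Usage sketch (assumes plt, sns and the df from the example above):
# ax = sns.swarmplot(x="x", y="y", data=df)
# draw_fit_on_categorical_axis(ax, df["x"], df["y"])
# plt.show()
```

    This keeps everything on a single axis, so the tick labels stay those of the swarmplot. If the category order might differ from sorted order, passing order= to swarmplot explicitly is the safer choice.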