Tags: python, pandas, seaborn, logarithm, swarmplot

Python, Seaborn: Logarithmic Swarmplot has unexpected gaps in the swarm


Let's look at a swarm plot made with Python 3.5 and Seaborn. The data is stored in a pandas DataFrame df, and the column labels are stored in another class; that does not matter here, just look at the plot:

ax = sns.swarmplot(x=self.dte.label_temperature, y=self.dte.label_current, hue=self.dte.label_voltage, data = df)

[Figure: swarm plot with a linear y axis]

The data is more readable on a logarithmic y-axis because it spans several decades. So let's change the scaling to logarithmic:

ax.set_yscale("log")
ax.set_ylim(bottom = 5*10**-10)

[Figure: the same swarm plot with a log y axis]

My problem is the gaps in the swarms. I guess they are there because the dot positions were calculated when the plot was created with a linear axis, where the dots must not overlap. On the log axis they look strange, and there is enough space to form four equal-looking swarms. My question is: how can I force seaborn to recalculate the positions of the dots to create better-looking swarms?
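
The guess about stale positions can be checked with a little arithmetic. The sketch below is not seaborn's layout algorithm; it only maps two current values to a vertical pixel position under each scale. The 400-pixel axis height, the upper limit of 1e-5, and the helper names linear_px and log_px are assumptions made for illustration.

```python
import math


def linear_px(value, lo, hi, height):
    """Vertical pixel position of value on a linear axis spanning lo..hi."""
    return (value - lo) / (hi - lo) * height


def log_px(value, lo, hi, height):
    """Vertical pixel position of value on a log10 axis spanning lo..hi."""
    span = math.log10(hi) - math.log10(lo)
    return (math.log10(value) - math.log10(lo)) / span * height


# Hypothetical axis: limits 5e-10..1e-5, drawn 400 pixels tall.
lo, hi, height = 5e-10, 1e-5, 400.0
a, b = 1e-9, 2e-9  # two neighbouring measurements

lin_gap = abs(linear_px(b, lo, hi, height) - linear_px(a, lo, hi, height))
log_gap = abs(log_px(b, lo, hi, height) - log_px(a, lo, hi, height))

print("vertical distance on linear axis (px):", lin_gap)
print("vertical distance on log axis (px):", log_gap)
```

On the linear axis the two dots sit less than a pixel apart, so a layout computed there has to push one of them sideways. On the log axis they are far enough apart to share the centre line, which is why sideways offsets carried over from the linear layout show up as gaps.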


Solution

  • mwaskom hinted in the comments at how to solve this. It is even stated in the swarmplot documentation:

    Note that arranging the points properly requires an accurate transformation between data and point coordinates. This means that non-default axis limits should be set before drawing the swarm plot.

    Set the axis to log scale first, then pass it to the plot:

        import matplotlib.pyplot as plt
        import seaborn as sns

        fig = plt.figure()  # create the figure
        rect = 0, 0, 1, 1  # rectangle (left, bottom, width, height) for the new axes
        log_ax = fig.add_axes(rect)  # create new axes (or use existing ones)
        log_ax.set_yscale("log")  # set the log scale before plotting
        # df and self.dte are the objects from the question
        sns.swarmplot(x=self.dte.label_temperature, y=self.dte.label_current, hue=self.dte.label_voltage, data=df, ax=log_ax)
    

    This yields the correct, desired plotting behaviour.
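
    For readers without the original data, here is a minimal self-contained sketch of the same ordering: scale and limits first, swarm plot second. The DataFrame contents, the column names temperature, voltage and current, and the axis limits are made up for illustration and are not the data from the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data (hypothetical): currents spanning several decades.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "temperature": np.repeat([20, 40, 60, 80], 30),
    "voltage": np.tile(np.repeat([1, 2, 3], 10), 4),
    "current": 10 ** rng.uniform(-9, -5, size=120),
})

fig, log_ax = plt.subplots()
log_ax.set_yscale("log")      # set the scale before drawing the swarm
log_ax.set_ylim(5e-10, 2e-5)  # non-default limits also go before the plot
sns.swarmplot(x="temperature", y="current", hue="voltage", data=df, ax=log_ax)

fig.canvas.draw()  # force a render so the point positions are computed
```

    To look at the result, replace the last line with fig.savefig("swarm_log.png"), or drop the Agg line and call plt.show().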