python, matplotlib, seaborn

Seaborn writes to incorrect subplot


I'm trying to make multiple plots in one figure and have identified a bug where axS[i] = sns.boxplot(plot) causes Seaborn to select the last subplot, as if it forgets the value of i. Notice the placement of the print statements and the pointers to the subplot--they should match, but the sns.boxplot line messes them up.

df = pd.read_csv(fold+file,index_col=1)
print(df.columns)
types = ['all cells','1 endothelial','2 immune','3 tumor','4 active fibroblast','5 stromal']
fig,axS = plt.subplots(6,1,sharex=True,figsize=(20,8))
for i,ty in enumerate(types):
    
    if i == 0:
        plot = df.loc[:,bioms]
    else:
        key = df.loc[:,'SVM_primary'] == ty
        sdf = df.loc[key,bioms]
        plot = sdf
    print('\n',i,ty)
    print(axS[i])               #First important print statement
    axS[i] = sns.boxplot(plot)
    print(axS[i])               #Second: should be identical, but it's not!
    axS[i].title.set_text(ty)                  
plt.xticks(rotation = 90)
fig.tight_layout()
plt.show()

The output is:

0 all cells

AxesSubplot(0.125,0.77;0.775x0.11)

AxesSubplot(0.125,0.11;0.775x0.11)


 1 1 endothelial

AxesSubplot(0.125,0.638;0.775x0.11)    # 

AxesSubplot(0.125,0.11;0.775x0.11)     # this pointer should be the same as the one above it


 2 2 immune

AxesSubplot(0.125,0.506;0.775x0.11)

AxesSubplot(0.125,0.11;0.775x0.11)


 3 3 tumor

AxesSubplot(0.125,0.374;0.775x0.11)

AxesSubplot(0.125,0.11;0.775x0.11)


 4 4 active fibroblast

AxesSubplot(0.125,0.242;0.775x0.11)

AxesSubplot(0.125,0.11;0.775x0.11)


 5 5 stromal

AxesSubplot(0.125,0.11;0.775x0.11)

AxesSubplot(0.125,0.11;0.775x0.11)   

The script correctly identifies i but in the two print(axS[i]) statements, the first correctly identifies a unique pointer to a new subplot, then after calling the SNS plot, it thinks it's the last subplot (as if i becomes 5 every time sns.boxplot is called). I hope I'm missing something.


Solution

  • This is not a bug in seaborn; it is a misunderstanding of how seaborn works (and, to a lesser extent, of Python in general).

    Start with the general misunderstanding. Say you have a list.

    x = [1, 2, 3]
    

    If you assign to an index, you replace that element of the list.

    x[1] = 5
    print(x) # [1, 5, 3]
    

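    In the same way, sns.boxplot returns an Axes object, and assigning that return value to axS[i] overwrites the Axes that was stored there.

    Because the call does not pass an ax argument, seaborn is never told which Axes to use, so it falls back to matplotlib's current Axes (plt.gca()). Right after plt.subplots, the current Axes is the last subplot created in the most recent figure.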

    import matplotlib.pyplot as plt
    import seaborn as sns
    import numpy as np
    
    x = np.arange(0, 10, 0.01)
    y1 = x**2
    y2 = np.sin(x)
    
    fig, axes = plt.subplots(1, 2)
    a = sns.lineplot(x=x, y=y1)
    

    The line is drawn on the last Axes of the figure. I saved the return value to a, which I can confirm is that last Axes.

    print(a == axes[-1])  # True
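
When no ax is passed, seaborn's axes-level functions draw on pyplot's current Axes, so the behavior can be checked with matplotlib alone. The sketch below is only an illustration: it assumes the non-interactive Agg backend (so it runs without a display), and the variable names are made up for the example. plt.gca() returns the current Axes and plt.sca() changes it.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, 2)

# Right after plt.subplots, the current Axes is the last subplot created.
current_after_subplots = plt.gca()
print(current_after_subplots is axes[-1])  # True

# plt.sca() makes another Axes current; a later call without ax= would use it.
plt.sca(axes[0])
current_after_sca = plt.gca()
print(current_after_sca is axes[0])  # True
```

Calling plt.sca() before each plot would also work, but passing ax= explicitly is usually easier to read and does not depend on hidden pyplot state.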
    

    To plot in the correct axis, you have to pass it as a keyword argument to seaborn.

    fig, axes = plt.subplots(1, 2)
    a = sns.lineplot(x=x, y=y1, ax=axes[0])
    b = sns.lineplot(x=x, y=y2, ax=axes[1])
    

    And as a check:

    print(a == axes[0])  # True
    print(b == axes[1])  # True
    

    So, in conclusion, pass the Axes to the boxplot call and do not overwrite your axes list with the return value (although once you pass ax, the return value is that same Axes, so the assignment would be harmless).

    sns.boxplot(data=plot, ax=axS[i])