Tags: python, seaborn, bar-chart, line-plot

How to Combine a Hue-Separated Bar Plot with a Single Line on the Secondary Axis


In Python, I have a dataframe df with columns x, class, ratio and cnt.

I obtained it by aggregating some data, so I know there is exactly one row for each (x, class) pair. I want to see ratio and cnt for each x, split by class.

To display ratio I want to use a barplot, and to display cnt I want to use a lineplot. Both should share the same x-axis, with cnt on a secondary y-axis.

Based on the many answers to similar questions that I read, I tried the following:

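import matplotlib.pyplot as plt
import seaborn as sns

# df is the aggregated dataframe described above
plt.figure(figsize=(15, 8))
ax1 = sns.barplot(x="x", y="ratio", hue="class", data=df)
ax2 = ax1.twinx()
sns.pointplot(x="x", y="cnt", data=df, hue="class", color="red", ax=ax2)
ax2.grid(False)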

The problem is that this is not what I need: it draws many lines, one for each class.

What I want is a single line through all values of cnt. I do not really care about splitting the lineplot by class. I only passed hue to make the markers appear in the correct place, on top of each bar, but they do not.

EDIT:

As my question was not clear, the image below shows better what I need. The plot was made by @JohanC using dodge. What I am looking for is a way to construct the black line. I do not really care about splitting the lineplot (or pointplot) by hue. In the worst case, I would accept having several lines, one per x value running across its hue values (i.e. the same black curve without the segments that link one set of bars to the next).

[Image: bar plot split by class, with the desired black line drawn over the bars]


Solution

  • In general

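    By default, sns.pointplot() does not dodge the points of the different hue levels, and dodge=True only applies a small default offset. Passing dodge=d with a suitable d aligns the points with the bars. Here d = 0.8 * (h - 1) / h, where 0.8 is the default total width over which the bars of one x-value are spread and h is the number of hue categories. The factor (h - 1) / h is needed because each bar has a width of 0.8 / h, so the bar centers span only 0.8 - 0.8 / h, whereas the points have no width and are spread over the full dodge distance.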

    import matplotlib.pyplot as plt
    import seaborn as sns
    
    tips = sns.load_dataset('tips')
    plt.figure(figsize=(15, 8))
    ax1 = sns.barplot(x="day", y="total_bill", hue="sex", errorbar=None, data=tips, palette='spring')
    ax2 = ax1.twinx()
    h = len(tips['sex'].unique())
    sns.pointplot(x="day", y='tip', data=tips, hue="sex", dodge=0.8*(h-1)/h, palette='winter', legend=False, ax=ax2)
    
    plt.show()
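
    The dodge arithmetic can be checked without drawing anything. The sketch below is a simplified model rather than Seaborn's own code: it assumes the default total bar width of 0.8 and assumes that the points are spread evenly over the dodge distance, centered on the tick. The helper names bar_centers, point_offsets and aligned_dodge are made up for this illustration.

```python
import numpy as np

def bar_centers(h, total_width=0.8):
    """Offsets of the h bar centers relative to the tick (assumed model)."""
    bar_width = total_width / h
    return -total_width / 2 + bar_width * (np.arange(h) + 0.5)

def point_offsets(h, dodge):
    """Offsets of h dodged points, spread evenly over `dodge` (assumed model)."""
    return np.linspace(-dodge / 2, dodge / 2, h)

def aligned_dodge(h, total_width=0.8):
    """Dodge value that places the points on the bar centers."""
    return total_width * (h - 1) / h

# the points coincide with the bar centers for any number of hue levels
for h in (1, 2, 3, 4):
    assert np.allclose(bar_centers(h), point_offsets(h, aligned_dodge(h)))
```

    With four hue levels each bar is 0.2 wide, so the centers sit 0.1 inside the outer edges of the group and the required dodge is 0.8 - 0.2 = 0.6.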
    

    [Image: aligning sns.pointplot with hue]

  • Your case: connecting the hues of the same x-value

    If I understand correctly, instead of having lines connecting the same hue value for different x's, you want the lines to connect the different hue values for each x.

    You can first use sns.pointplot to draw the regular lines, which connect the same hue across the different x's. Then extract the positions and values of the points, remove the old lines, and plot the points again in left-to-right order to get the desired line.

    import matplotlib.pyplot as plt
    import seaborn as sns
    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame({'x': [*'ABC'] * 4,
                       'class': np.tile([*'WXYZ'], 3),
                       'ratio': np.random.rand(12),
                       'cnt': np.random.randint(10, 30, 12)})
    
    plt.figure(figsize=(15, 8))
    ax1 = sns.barplot(x="x", y="ratio", hue="class", alpha=0.6, data=df)
    ax2 = ax1.twinx()
    # draw a regular pointplot aligned with the bars, use errorbar=None because those would be extra lines
    num_hues = len(df["class"].unique())
    sns.pointplot(x="x", y='cnt', data=df, hue="class", dodge=0.8 * (num_hues - 1) / num_hues,
                  errorbar=None, legend=False, ax=ax2)
    
    # extract the x and y positions, converting them to 1d arrays
    xs = np.array([line.get_xdata() for line in ax2.lines]).ravel()
    ys = np.array([line.get_ydata() for line in ax2.lines]).ravel()
    # get the left to right order of the x-values
    x_order = np.argsort(xs)
    # remove the lines of the point plot
    for line in ax2.lines[::-1]:
        line.remove()
    # plot the line connecting the points in left to right order
    ax2.plot(xs[x_order], ys[x_order], ls='-', marker='o', color='crimson', label='counts')
    # add the line to the legend of ax1
    handles1, labels1 = ax1.get_legend_handles_labels()
    handles2, labels2 = ax2.get_legend_handles_labels()
    ax1.legend(handles1 + handles2, labels1 + labels2)
    
    plt.show()
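
    For the fallback mentioned in the question (one line per x value, without the segments that join neighbouring groups of bars), a possible variation is to insert NaN after each group, because matplotlib breaks a line at NaN values. The sketch below only prepares the coordinates. The arrays stand in for the values read back from the point plot, the counts are made up, and break_between_groups is a hypothetical helper.

```python
import numpy as np

# Stand-in for the data extracted from the point plot (assumed layout):
# 4 hue levels, 3 x categories, points dodged to sit on the bar centers.
n_x, h = 3, 4
dodge = 0.8 * (h - 1) / h
offsets = np.linspace(-dodge / 2, dodge / 2, h)
xs = (np.arange(n_x)[None, :] + offsets[:, None]).ravel()  # hue-major order
ys = np.arange(10, 10 + h * n_x, dtype=float)              # made-up counts

# left-to-right order, as in the answer's code
order = np.argsort(xs)
xs_sorted, ys_sorted = xs[order], ys[order]

def break_between_groups(values, group_size):
    """Append a NaN after every group of `group_size` consecutive values."""
    groups = values.reshape(-1, group_size)
    gap = np.full((groups.shape[0], 1), np.nan)
    return np.hstack([groups, gap]).ravel()

xs_broken = break_between_groups(xs_sorted, h)
ys_broken = break_between_groups(ys_sorted, h)
```

    Passing xs_broken and ys_broken to ax2.plot in place of xs[x_order] and ys[x_order] should then draw a separate segment for each x value. This relies on every x value having exactly h points, which holds here because there is one row per (x, class) pair.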
    

    [Image: sns.pointplot connecting hues belonging to the same x]

    Note that this kind of plot isn't supported in standard Seaborn, as it looks quite confusing.