python pandas matplotlib stacked-bar-chart

How to add trendlines to stacked bar charts


I want to make a stacked bar chart from a pandas DataFrame and connect equivalent categories between bars with trendlines. However, I want the trendlines to connect the upper and lower edges of each category (as can easily be done in Excel), not its mean value (as in most answers to similar questions on Stack Overflow).

Here is an example image (generated with Excel) of what I want to achieve: [image]

How can this be best achieved?


Edit: GitHub Copilot gave me the following suggestion, which almost does what I want:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

# Assuming df is your DataFrame and it's indexed properly
df = pd.DataFrame({
    'Category1': [1, 2, 3, 4],
    'Category2': [2, 3, 4, 5],
    'Category3': [3, 4, 5, 6]
}, index=['Bar1', 'Bar2', 'Bar3', 'Bar4'])

ax = df.plot(kind='bar', stacked=True)

# Calculate the cumulative sum for each category
df_cumsum = df.cumsum(axis=1)

# Iterate over each category
for i, category in enumerate(df.columns):
    # Get the y coordinates for the upper boundary of each "sub-bar"
    y_upper = df_cumsum[category].values
    # Get the y coordinates for the lower boundary of each "sub-bar"
    y_lower = y_upper - df[category].values
    # Get the x coordinates for each point
    x_coords = np.arange(len(df))
    # Plot the line connecting the upper boundaries
    plt.plot(x_coords, y_upper, marker='o')
    # Plot the line connecting the lower boundaries
    plt.plot(x_coords, y_lower, marker='o')

plt.show()

However, as you can see in the resulting figure, it connects the midpoints of the upper edges of each category's sub-bars. How can I connect the left and right corners of each category's sub-bars instead?

[image: plot produced by the code above]

Additional info: in my specific case, no NaN or negative values occur.

Edit 2: I do have zero values, though.


Solution

  • One approach is to iterate over ax.patches and take the top-right corner of each bar and the top-left corner of the next bar in the same category. These two points become the left and right endpoints of the line segment connecting the bars:

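    # ax.patches holds the bars category by category, len(df) bars per category
    for i in range(len(ax.patches)-1):
        # Skip the last bar of each category: it has no neighbour to its right
        if (i + 1) % len(df) != 0:
            # get_corners() runs anti-clockwise from the bottom-left corner,
            # so [2] is the top-right corner and [3] is the top-left corner
            left = ax.patches[i].get_corners()[2]
            right = ax.patches[i+1].get_corners()[3]
            ax.plot([left[0], right[0]], [left[1], right[1]], c='gray', lw=0.5)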
    

    Output:

    [image: output plot]

    Importantly, this assumes no NaNs (or zero values) in your data. You also haven't specified what should happen for negative values.
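
As an alternative sketch (not part of the answer above), the connector endpoints can be computed directly from the data instead of from ax.patches, so the result does not depend on how many patches were drawn or in what order. It assumes non-negative values with no NaNs, bars centred at x = 0, 1, 2, ... (the pandas default for a categorical index), and a bar width that is passed explicitly to the plot call. The helper name connector_segments is hypothetical. Because the lower edge of each category is the upper edge of the category beneath it, joining the upper corners of every category also joins the lower ones, apart from the flat baseline at zero.

```python
import matplotlib.pyplot as plt
import pandas as pd


def connector_segments(df, width=0.5):
    """Return ((x_left, x_right), (y_left, y_right)) for every line that joins
    the top-right corner of one sub-bar to the top-left corner of the next
    sub-bar in the same category (hypothetical helper, illustration only)."""
    tops = df.cumsum(axis=1).to_numpy()  # upper edge of each sub-bar
    half = width / 2
    segments = []
    for col in range(tops.shape[1]):          # one pass per category
        for row in range(tops.shape[0] - 1):  # each pair of adjacent bars
            xs = (row + half, row + 1 - half)
            ys = (float(tops[row, col]), float(tops[row + 1, col]))
            segments.append((xs, ys))
    return segments


df = pd.DataFrame({
    'Category1': [1, 2, 3, 4],
    'Category2': [2, 3, 4, 5],
    'Category3': [3, 4, 5, 6]
}, index=['Bar1', 'Bar2', 'Bar3', 'Bar4'])

width = 0.5  # must match the width used to place the connectors
ax = df.plot(kind='bar', stacked=True, width=width)

for xs, ys in connector_segments(df, width):
    ax.plot(xs, ys, c='gray', lw=0.5)

plt.show()
```

A zero value simply yields a sub-bar whose upper and lower edges coincide, so its connectors overlap those of the category beneath it. Negative values would need separate handling, since pandas stacks them below the baseline.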