Tags: python, python-3.x, matplotlib, legend, confidence-interval

Choosing elements for a legend in Python


I am using matplotlib.pyplot to display the upper and lower bounds of confidence intervals, with the area between them shaded. I would like to label each pair of bounds with the item it measures. However, the legend only seems to let me label the individual upper and lower boundary lines. Is there a way to customise the legend so that each entry labels an item by colour (e.g. purple = item1, red = item2, etc.)?

[Figure: the plot produced by the code below]

Code below:

import pandas as pd
from matplotlib.pyplot import cm
import matplotlib.pyplot as plt
import numpy as np

item1 = pd.DataFrame([10,15,12], index = ['2018-01-01', '2018-01-02', '2018-01-03'])
item2 = pd.DataFrame([16,18,20], index = ['2018-01-01', '2018-01-02', '2018-01-03'])
item3 = pd.DataFrame([14,18,17], index = ['2018-01-01', '2018-01-02', '2018-01-03'])
dfs = [item1, item2, item3]
def compute(df):
    # Build bounds one unit below and one unit above each value
    lower = []
    upper = []
    for i in range(len(df)):
        # .iloc makes the positional lookup explicit (the index holds date strings)
        lower.append(df[0].iloc[i] - 1)
        upper.append(df[0].iloc[i] + 1)
    result = pd.DataFrame({'lower': lower, 'upper': upper}, index=df.index)
    return result

def run_function():
    color = iter(cm.rainbow(np.linspace(0, 1, len(dfs))))
    for i in range(len(dfs)):
        thing = compute(dfs[i])
        if i== 0:
            ax = thing.plot(color = next(color))
        if i != 0:
            thing.plot(ax=ax, color = next(color))
        plt.fill_between(thing.index, thing["upper"], thing["lower"], color="pink")
    plt.ylim(0,25)
    plt.show()

run_function()

When I pass label='item1', matplotlib seems to ignore it altogether.


Solution

  • When you call thing.plot, you are calling pd.DataFrame.plot, which delegates to matplotlib's plot and adds some extras of its own, including labelling each line with the name of its column. That is likely why your own label seems to be ignored and the legend shows "lower" and "upper" instead.

    By default, matplotlib's legend function builds its entries from the plot elements (artists) that have a label set. Labels that start with an underscore are skipped.

    You can solve your issue by adding a label to fill_between and, optionally, switching to the lower-level plot:

    plt.fill_between(thing.index, thing["upper"], thing["lower"],
                     color="pink", label=f"Between {i}")
    

    To use the lower-level plot, create the figure and axes once, outside the loop:

    fig, ax = plt.subplots()
    

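    Then replace the two conditional plotting branches with a single pair of calls. Fetch the color once per item so both bounds share it, and use thing.index for the x values, since the dates are the index rather than a column:

    c = next(color)
    ax.plot(thing.index, thing['lower'], color=c, label=f"lower {i}")
    ax.plot(thing.index, thing['upper'], color=c, label=f"upper {i}")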
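
Putting the pieces of this answer together, here is a minimal sketch of one way to get a single legend entry per item, which is what the question asks for. It leaves the boundary lines unlabelled and puts the item's name on the shaded band only. The `items` dict, the `plot_bands` helper and `alpha=0.3` are illustrative choices, not part of the original code, and the data is reshaped into Series for brevity.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

dates = ["2018-01-01", "2018-01-02", "2018-01-03"]

# Illustrative container: maps a legend name to each series from the question
items = {
    "item1": pd.Series([10, 15, 12], index=dates),
    "item2": pd.Series([16, 18, 20], index=dates),
    "item3": pd.Series([14, 18, 17], index=dates),
}


def compute(series):
    """Return lower/upper bounds one unit either side of the values."""
    return pd.DataFrame({"lower": series - 1, "upper": series + 1})


def plot_bands(items):
    """Plot one shaded band per item, with one legend entry per item."""
    fig, ax = plt.subplots()
    colors = plt.get_cmap("rainbow")(np.linspace(0, 1, len(items)))
    for (name, series), c in zip(items.items(), colors):
        bounds = compute(series)
        # No label on the boundary lines, so legend() skips them
        ax.plot(bounds.index, bounds["lower"], color=c)
        ax.plot(bounds.index, bounds["upper"], color=c)
        # Only the shaded band carries the item's name
        ax.fill_between(bounds.index, bounds["lower"], bounds["upper"],
                        color=c, alpha=0.3, label=name)
    ax.set_ylim(0, 25)
    ax.legend()
    return fig, ax


fig, ax = plot_bands(items)
```

Call `plt.show()` afterwards to display the figure. Because each band reuses its lines' color, the legend patch for an item matches the color of its bounds.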