Tags: python, pandas, matplotlib, histogram, annotate

Labelling a histogram bin with an arrow


I have a histogram that can be reproduced with the MWE below:

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

pd.Series(np.random.normal(0, 100, 1000)).plot(kind='hist', bins=50)

This creates a plot like the following:

[image: example histogram]

How can I label the bin that contains a given integer with an arrow?

For example, see below, where an arrow labels the bin containing the integer 300.

[image: example histogram with the bin containing the integer 300 labelled by an arrow]

EDIT: I should add that, ideally, the y coordinate of the arrow should be set automatically from the height of the bar it labels, if possible.


Solution

  • You can use matplotlib's annotate to add an arrow:

    import pandas as pd
    import matplotlib.pyplot as plt
    #import seaborn as sns
    import numpy as np
    
    fig, ax = plt.subplots()
    series = pd.Series(np.random.normal(0, 100, 1000))
    series.plot(kind='hist', bins=50, ax=ax)
    ax.annotate("",
                xy=(300, 5), xycoords='data',
                xytext=(300, 20), textcoords='data',
                arrowprops=dict(arrowstyle="->",
                                connectionstyle="arc3"),
                )
    

    In this example, the arrow runs from the data coordinates (300, 20), given by xytext, to (300, 5), given by xy, which is where the arrowhead is drawn.

    To set the height of the arrow automatically from the bar it labels, plot the histogram with matplotlib's hist, which returns the bin counts and bin edges, and then use numpy's where to find the bin that contains the desired position.

    import pandas as pd
    import matplotlib.pyplot as plt
    #import seaborn as sns
    import numpy as np
    
    nbins = 50
    labeled_bin = 200
    
    fig, ax = plt.subplots()
    
    series = pd.Series(np.random.normal(0, 100, 1000))
    
    ## plot the histogram; hist returns the bin counts and the bin edges
    ybins, xbins, _ = ax.hist(series, bins=nbins)
    
    ## find the bin that contains the position to label
    ind_bin = np.where(xbins >= labeled_bin)[0]
    if len(ind_bin) > 0 and ind_bin[0] > 0:
        ## get the centre (x) and the count (y) of that bin
        x_bin = xbins[ind_bin[0]-1]/2. + xbins[ind_bin[0]]/2.
        y_bin = ybins[ind_bin[0]-1]
        ## add the arrow
        ax.annotate("",
                    xy=(x_bin, y_bin + 5), xycoords='data',
                    xytext=(x_bin, y_bin + 20), textcoords='data',
                    arrowprops=dict(arrowstyle="->",
                                    connectionstyle="arc3"),
                    )
    else:
        print("Labeled bin is outside range")
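
The bin lookup above can also be wrapped in a small reusable helper. The following is a minimal sketch, not part of the original answer: the names bin_index, annotate_bin and demo, and the length parameter, are hypothetical. It assumes the edges come from ax.hist or np.histogram, follows NumPy's convention that every bin is half-open except the last one, and places the arrow tail with textcoords="offset points" so that the arrow length does not depend on the y-axis scale.

```python
from bisect import bisect_right


def bin_index(edges, value):
    """Return the index of the histogram bin containing value, or None.

    edges is the increasing sequence of bin edges as returned by ax.hist or
    np.histogram. Bins are half-open [left, right), except the last one,
    which also includes its right edge (NumPy's convention).
    """
    edges = list(edges)
    if value < edges[0] or value > edges[-1]:
        return None
    # bisect_right counts the edges <= value; subtract 1 to get the bin
    # index and clamp so that value == edges[-1] falls in the last bin.
    return min(bisect_right(edges, value) - 1, len(edges) - 2)


def annotate_bin(ax, counts, edges, value, text="", length=20):
    """Draw an arrow pointing down at the bar of the bin containing value.

    length (hypothetical parameter) is the arrow length in points.
    Returns the bin index, or None if value is outside the histogram range.
    """
    i = bin_index(edges, value)
    if i is None:
        return None
    x = (edges[i] + edges[i + 1]) / 2.0  # centre of the bar
    y = counts[i]                        # top of the bar
    ax.annotate(text,
                xy=(x, y), xycoords="data",
                xytext=(0, length), textcoords="offset points",
                ha="center", va="bottom",
                arrowprops=dict(arrowstyle="->", connectionstyle="arc3"))
    return i


def demo():
    """Reproduce the histogram from the question and label the bin at 200."""
    import numpy as np
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    data = np.random.normal(0, 100, 1000)
    counts, edges, _ = ax.hist(data, bins=50)
    if annotate_bin(ax, counts, edges, 200) is None:
        print("Labeled bin is outside range")
    plt.show()
```

Calling demo() draws the histogram with an arrow above the bin containing 200. Unlike the np.where version, a value that falls exactly on a bin edge is assigned to the bin that NumPy counts it in.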