
Matplotlib - label each bin


I'm currently using Matplotlib to create a histogram:


import matplotlib
matplotlib.use('Agg')
import matplotlib.pyplot as pyplot
...
fig = pyplot.figure()
ax = fig.add_subplot(1, 1, 1)
n, bins, patches = ax.hist(measurements, bins=50, range=(graph_minimum, graph_maximum), histtype='bar')

#ax.set_xticklabels([n], rotation='vertical')

for patch in patches:
    patch.set_facecolor('r')

pyplot.title('Spam and Ham')
pyplot.xlabel('Time (in seconds)')
pyplot.ylabel('Bits of Ham')
pyplot.savefig(output_filename)

I'd like to make the x-axis labels a bit more meaningful.

Firstly, the x-axis ticks here seem to be limited to five ticks. No matter what I do, I can't seem to change this - even if I add more xticklabels, it only uses the first five. I'm not sure how Matplotlib calculates this, but I assume it's auto-calculated from the range/data?

Is there some way I can increase the resolution of x-tick labels - even to the point of one for each bar/bin?

(Ideally, I'd also like the seconds to be reformatted as microseconds/milliseconds, but that's a question for another day.)

Secondly, I'd like each individual bar labeled - with the actual number in that bin, as well as the percentage of the total of all bins.

The final output might look something like this:

[image: mock-up of the desired output]

Is something like that possible with Matplotlib?

Cheers, Victor


Solution

  • Sure! To set the ticks, just, well... set the ticks (see matplotlib.pyplot.xticks or ax.set_xticks). ax.hist returns the bin edges, so you can pass those straight in as tick positions. (Also, you don't need to manually set the facecolor of the patches. You can just pass facecolor as a keyword argument to hist.)

    For the rest, you'll need to do some slightly fancier things with the labeling, but matplotlib makes it fairly easy.

    As an example:

    import matplotlib.pyplot as plt
    import numpy as np
    from matplotlib.ticker import FormatStrFormatter
    
    data = np.random.randn(82)
    fig, ax = plt.subplots()
    counts, bins, patches = ax.hist(data, facecolor='yellow', edgecolor='gray')
    
    # Set the ticks to be at the edges of the bins.
    ax.set_xticks(bins)
    # Set the xaxis's tick labels to be formatted with 1 decimal place...
    ax.xaxis.set_major_formatter(FormatStrFormatter('%0.1f'))
    
    # Change the colors of bars at the edges...
    twentyfifth, seventyfifth = np.percentile(data, [25, 75])
    for patch, rightside, leftside in zip(patches, bins[1:], bins[:-1]):
        if rightside < twentyfifth:
            patch.set_facecolor('green')
        elif leftside > seventyfifth:
            patch.set_facecolor('red')
    
    # Label the raw counts and the percentages below the x-axis...
    bin_centers = 0.5 * np.diff(bins) + bins[:-1]
    for count, x in zip(counts, bin_centers):
        # Label the raw counts (hist returns them as floats, so convert to int
        # to get "5" rather than "5.0")
        ax.annotate(str(int(count)), xy=(x, 0), xycoords=('data', 'axes fraction'),
            xytext=(0, -18), textcoords='offset points', va='top', ha='center')
    
        # Label the percentages
        percent = '%0.0f%%' % (100 * float(count) / counts.sum())
        ax.annotate(percent, xy=(x, 0), xycoords=('data', 'axes fraction'),
            xytext=(0, -32), textcoords='offset points', va='top', ha='center')
    
    
    # Give ourselves some more room at the bottom of the plot
    plt.subplots_adjust(bottom=0.15)
    plt.show()
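
Applied to the 50-bin histogram from the question, the same idea might look like the sketch below. As far as I can tell, set_xticklabels only relabels ticks that already exist, which would explain why extra labels had no effect: the tick positions have to be set first. The data here is a made-up stand-in for `measurements`, and the range (0 to 0.05 seconds) is an assumption. The millisecond formatting uses FuncFormatter, which is one of several ways to do it.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, as in the question
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.ticker import FuncFormatter

# Hypothetical stand-in for the question's `measurements` (in seconds).
rng = np.random.default_rng(0)
measurements = rng.uniform(0.0, 0.05, size=500)

fig, ax = plt.subplots(figsize=(12, 5))
counts, bins, patches = ax.hist(measurements, bins=50, range=(0.0, 0.05),
                                facecolor='r')

# One tick per bin edge: 50 bins give 51 edges.
ax.set_xticks(bins)

# Show the tick values in milliseconds instead of seconds.
ax.xaxis.set_major_formatter(FuncFormatter(lambda x, pos: '%.0f' % (x * 1e3)))

# Rotate the labels so that 51 of them do not overlap.
ax.tick_params(axis='x', labelrotation=90)
ax.set_xlabel('Time (in milliseconds)')
fig.tight_layout()
```

With this many ticks a wider figure helps, hence the figsize. Call fig.savefig(output_filename) afterwards as in the original snippet.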
    

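
If the labels can sit on top of the bars rather than below the axis, newer Matplotlib versions (3.4 and later, as far as I know) offer ax.bar_label, which takes the bar container returned by hist. This is an alternative sketch, not the approach used above. It assumes a single dataset with the default histtype='bar', so that the third return value is a bar container.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal(82)

fig, ax = plt.subplots()
counts, bins, bars = ax.hist(data, facecolor='yellow', edgecolor='gray')

# Build one two-line label per bin: raw count, then percentage of the total.
total = counts.sum()
labels = ['%d\n%.0f%%' % (c, 100 * c / total) for c in counts]

# Attach the labels just above each bar.
texts = ax.bar_label(bars, labels=labels, padding=3)

ax.set_xticks(bins)
# Leave some headroom so the tallest bar's label is not clipped.
ax.margins(y=0.15)
```

This avoids computing bin centers and annotation offsets by hand, at the cost of requiring a recent Matplotlib.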