
Custom Range in numpy histogram


I'm trying to output histogram data using numpy:

NUMBER_OF_PRICE_BRACKETS = 8
HISTOGRAM_EDGE_RANGE = (0, 1_000_000)

hist, bin_edges = numpy.histogram(price_list, bins=NUMBER_OF_PRICE_BRACKETS, range=HISTOGRAM_EDGE_RANGE)

I get the following output with the code above:

hist: [0, 6, 6, 0, 0, 0, 0, 0],
bin_edges: [0.0, 125000.0, 250000.0, 375000.0, 500000.0, 625000.0, 750000.0, 875000.0, 1000000.0]

The edges are calculated automatically. Is there an option to force specific edges, as in the example output below?

hist: [0, 6, 6, 0, 0, 0, 0, 0]
bin_edges: [0.0, 100000.0, 150000.0, 300000.0, 450000.0, 600000.0, 750000.0, 900000.0, 1000000.0]

Maybe using the range option, like this:

range=(0, 1_000_000, 150)

Solution

  • When bins is given as an integer, numpy.histogram always splits the range into equally spaced bins, so you have two options. The automatic edges are computed as if with

    numpy.linspace(*HISTOGRAM_EDGE_RANGE, NUMBER_OF_PRICE_BRACKETS + 1)
    

    Option 1: Supply the uneven bin edges manually:

    # Note: 8 edges define 7 bins; the last bin (900_000 to 1_000_000) is narrower.
    HISTOGRAM_EDGES = numpy.array([
        0, 150_000, 300_000, 450_000, 600_000,
        750_000, 900_000, 1_000_000])
    hist, bin_edges = numpy.histogram(price_list, bins=HISTOGRAM_EDGES)
    

    Option 2: Adjust your range and bin count so that the range splits evenly into bins of the width you want:

    # 7 bins x 150_000 = 1_050_000, so every bin is exactly 150_000 wide.
    NUMBER_OF_PRICE_BRACKETS = 7
    HISTOGRAM_EDGE_RANGE = (0, 1_050_000)
    
    hist, bin_edges = numpy.histogram(price_list, bins=NUMBER_OF_PRICE_BRACKETS, range=HISTOGRAM_EDGE_RANGE)
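
As a rough sketch of both behaviours, the following checks that integer bins match numpy.linspace, then builds fixed-step edges with numpy.arange, which is one way to get the effect the question's range=(0, 1_000_000, 150) seems to be aiming for. The price_list values are invented for illustration and are not from the question. The names step and upper are also assumptions.

```python
import numpy

# Hypothetical sample prices, made up for this example.
price_list = [130_000, 140_000, 160_000, 299_999, 300_000, 950_000]

# Integer bins: edges are equally spaced across the given range.
hist_even, edges_even = numpy.histogram(price_list, bins=8, range=(0, 1_000_000))
assert numpy.allclose(edges_even, numpy.linspace(0, 1_000_000, 9))

# Explicit edges: a fixed step of 150_000, closed off with the upper limit.
# numpy.arange excludes the stop value, so the upper limit is appended by hand.
step = 150_000      # assumed bin width
upper = 1_000_000   # assumed upper limit
edges = numpy.append(numpy.arange(0, upper, step), upper)
hist_custom, edges_custom = numpy.histogram(price_list, bins=edges)

print(hist_even.tolist())    # [0, 3, 2, 0, 0, 0, 0, 1]
print(edges.tolist())        # [0, 150000, 300000, 450000, 600000, 750000, 900000, 1000000]
print(hist_custom.tolist())  # [2, 2, 1, 0, 0, 0, 1]
```

Every bin is half-open, [left, right), except the last, which also includes its right edge. That is why 300_000 is counted in the 300_000 to 450_000 bin rather than the one before it.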