python numpy normal-distribution

Creating a 'normal distribution' like range in numpy


I am trying to 'bin' an array into bins (similar to a histogram). I have an input array input_array and a range bins = np.linspace(-100, 100, 101). The overall function looks something like this:

import numpy as np

def bin(arr):
    # 101 evenly spaced edges give 100 bins of width 2; values outside [-100, 100] are not counted
    bins = np.linspace(-100, 100, 101)
    return np.histogram(arr, bins=bins)[0]

So,

bin([64, 19, 120, 55, 56, 108, 16, 84, 120, 44, 104, 79, 116, 31, 44, 12, 35, 68])

would return:

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0])

However, I want my bins to be more 'detailed' as I get close to 0, something similar to an ideal normal distribution. That is, I would have more bins (i.e. narrower ranges) close to 0, and the bins would get wider as I move out towards the ends of the range. Is this possible?

More specifically, rather than having equally wide bins in a range, can I have an array of range where the bins towards the centre are smaller than towards the extremes?

I have already looked at answers like this and numpy.random.normal, but something is just not clicking right.


Solution

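  • Use the inverse error function to generate the bins. You'll need to scale the bins to get the exact range you want.

    This transform works because the inverse error function is flatter around zero than near +/- one, so evenly spaced inputs map to closely spaced outputs near zero and to widely spaced outputs towards the ends.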

    [Figure: inverse error function]

    import numpy as np
    from scipy.special import erfinv
    # 20 evenly spaced inputs; the endpoints -1 and 1 map to -inf and inf
    erfinv(np.linspace(-1, 1, 20))
    # returns:
    array([       -inf, -1.14541135, -0.8853822 , -0.70933273, -0.56893556,
           -0.44805114, -0.3390617 , -0.23761485, -0.14085661, -0.0466774 ,
            0.0466774 ,  0.14085661,  0.23761485,  0.3390617 ,  0.44805114,
            0.56893556,  0.70933273,  0.8853822 ,  1.14541135,         inf])
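As a rough sketch of the scaling step (not part of the original answer): one way to avoid the infinite end points is to evaluate erfinv slightly inside (-1, 1) and then rescale the result to the range you want. The function name erfinv_edges and the parameters half_width, n_edges and clip are made up for illustration, and clip=0.99 is an arbitrary choice; values closer to 1 make the outer bins wider relative to the central ones.

```python
import numpy as np
from scipy.special import erfinv


def erfinv_edges(half_width=100.0, n_edges=101, clip=0.99):
    """Bin edges on [-half_width, half_width] that are densest near zero.

    All names and defaults here are illustrative assumptions.
    `clip` keeps the inputs to erfinv strictly inside (-1, 1),
    so none of the edges is infinite.
    """
    u = np.linspace(-clip, clip, n_edges)
    raw = erfinv(u)  # finite and strictly increasing
    # Rescale so that the outermost edges land on +/- half_width.
    return half_width * raw / raw[-1]


edges = erfinv_edges()
widths = np.diff(edges)  # narrow in the centre, wide at the ends

data = [64, 19, 120, 55, 56, 108, 16, 84, 120, 44, 104, 79, 116, 31, 44, 12, 35, 68]
# np.histogram accepts an explicit, monotonically increasing array of edges.
counts, _ = np.histogram(data, bins=edges)
```

The resulting edges can be passed to np.histogram in place of the np.linspace range from the question. As before, values outside [-half_width, half_width] are not counted.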