
Transforming data so that the range around the median is more accurate


Suppose I have some floats that are normally distributed around 0. I need to serialize each of them into a uint8, but I would like to "give more" of the uint8 range to the center of the distribution and lose resolution around the edges.

For example: 127 would correspond to 0.0 and 255 to 1.0. But 191 would not be 0.5 — instead, it would be something like 0.3 because we're stretching it so that most of the numbers correspond to values near 0.

In practice, I'm actually going to have a random uint32 being generated and converted to a float. But when testing a linear mapping, the extremes (near -1.0 and 1.0) came up too frequently, and I'd like to concentrate the values around 0.0.

I'm aware that I can use the Box–Muller transform, but that's actually not suitable here because:

  1. We can cap out at -1.0 and 1.0, no need to have an unbounded output.

  2. We only have one number to sample from, not two.

Thanks


Solution

  • The quantile function (also known as the inverse CDF) maps uniform random numbers in [0, 1] to numbers that follow a distribution (such as the normal distribution).

    However, in the case of the normal distribution there are certain things to know (call the quantile function Q(u) from now on):

    • The quantile function takes inputs from 0 through 1 (probabilities), not from -1 to 1 or from 0 through 255.
    • The normal distribution can take on any real number. In fact, for this distribution, Q(0) equals negative infinity and Q(1) equals positive infinity.
    • The normal distribution's quantile involves the inverse error function. The quantile may or may not be easy to implement depending on whether your programming environment already has an inverse error function available.
    • For the reasons above, whenever the input to the quantile function could be 0 or 1, you will have to restrict the input to a range that avoids infinity and then scale the output to fit your desired range. For instance, map the uint8 values [0, 255] linearly onto inputs in [0.001, 0.999]. An example in pseudocode is below.
     for k in 0..255
        // divide by 255 (not 256) so that k=255 reaches 0.999 and the inputs are symmetric around 0.5
        c=0.001+(0.999-0.001)*(k*1.0/255)
        print([k, Q(c)]) // print the uint8 value followed by the quantile
     end
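
For a concrete version of the pseudocode, here is a minimal Python sketch. It assumes Python 3.8 or later, where `statistics.NormalDist().inv_cdf` provides the standard normal quantile function Q(u). The clipping probabilities 0.001 and 0.999 are carried over from the pseudocode. The names `build_table`, `decode` and `encode` are hypothetical helpers for illustration, and dividing by Q(0.999) to land in [-1, 1] is one possible way to do the scaling, not the only one.

```python
from bisect import bisect_left
from statistics import NormalDist

LO, HI = 0.001, 0.999  # clipping probabilities (assumed, as in the pseudocode)


def build_table(lo=LO, hi=HI):
    """Return 256 increasing floats in about [-1, 1]; entry k decodes uint8 k."""
    q = NormalDist().inv_cdf   # standard normal quantile function Q(u)
    scale = q(hi)              # Q(hi) > 0; dividing by it maps the ends to -1 and 1
    return [q(lo + (hi - lo) * k / 255) / scale for k in range(256)]


TABLE = build_table()


def decode(k):
    """uint8 -> float: a plain table lookup."""
    return TABLE[k]


def encode(x):
    """float -> uint8: index of the nearest table entry (TABLE is sorted)."""
    i = bisect_left(TABLE, x)
    if i == 0:
        return 0
    if i == len(TABLE):
        return len(TABLE) - 1
    # pick whichever neighbour is closer to x
    return i if TABLE[i] - x < x - TABLE[i - 1] else i - 1


if __name__ == "__main__":
    for k in (0, 64, 127, 128, 191, 255):
        print(k, round(decode(k), 4))
```

Because the slope of Q is smallest near u = 0.5, neighbouring codes near the middle of the table are closer together than neighbouring codes near the ends, which is the "more resolution near 0" behaviour the question asks for. With 256 codes and a symmetric input range, no code decodes to exactly 0.0: codes 127 and 128 sit just below and just above it.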