python, numpy, probability

How to sample a random variable from a probability distribution of a different size


I'm trying to do this:

np.random.choice(a, 1, p=p)

where len(a) != len(p)

Could you point me in the right direction for resizing the probability distribution "p"? The idea is to keep the same distribution, but spread over a different number of variables.

EDIT: Basically this (https://en.wikipedia.org/wiki/Scale_parameter) but for a discrete variable.

I think that interpolation is the way to go, as suggested by Ryan Sander. I am using a neural network to output the policy distribution over an environment's action space, and I'm trying to train the network on multiple environments with different action space sizes. For example, the network outputs a distribution over an action space of size 6 (actions [0, 1, 2, ..., 5], 6 numbers summing to 1) and I'm trying to sample from this distribution over an action space of size 9. Or the other way around.

The problem with interpolation is that the values I get do not sum to 1. If I apply softmax to those values, the resulting distribution does not have the same(ish) shape as the original.


Solution

  • A couple of suggestions:

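    1. If you want to "interpolate" your probability distribution across the new values, you can do so using np.interp, as in the example below. The interpolated values will generally not sum to 1, so divide them by their sum before sampling:
    import numpy as np
    
    # Set parameters for interpolation
    xp = <VALS THAT P IS CURRENTLY OVER>
    x = <VALS YOU WANT P TO BE OVER>
    
    # Now interpolate
    p_interp = np.interp(x, xp, p)
    
    # Renormalize so the probabilities sum to 1
    p_interp = p_interp / p_interp.sum()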
    
    2. If you simply want to sample from the same variables as before (i.e. use the exact same probability distribution), you can use np.pad to zero-pad p to the length of a. You will probably want to specify different pad widths for the left and right sides, depending on where the values covered by p sit within a.
    # Number of zeros to add on each side
    pad_width_left = 5   # Padding on left-hand side
    pad_width_right = 3  # Padding on right-hand side
    
    # Pad both sides with zeros in a single call
    p_padded = np.pad(p, (pad_width_left, pad_width_right), mode="constant")
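
For the 6-action to 9-action case described in the question, the interpolation approach might look roughly like the sketch below. It assumes that both action spaces can be mapped onto the same [0, 1] interval, which may not hold for every environment. The helper name resize_distribution and the example values in p are made up for illustration. Dividing by the sum (instead of applying softmax) keeps the relative shape of the distribution, since softmax exponentiates the values and distorts their ratios.

```python
import numpy as np

def resize_distribution(p, new_size):
    """Resample a discrete distribution p onto new_size points (hypothetical helper)."""
    p = np.asarray(p, dtype=float)
    # Assumption: old and new supports are both spread evenly over [0, 1]
    xp = np.linspace(0.0, 1.0, len(p))
    x = np.linspace(0.0, 1.0, new_size)
    p_interp = np.interp(x, xp, p)
    # Renormalize by the sum (not softmax) so the values sum to 1
    return p_interp / p_interp.sum()

rng = np.random.default_rng(0)

# Illustrative policy output over 6 actions
p = np.array([0.05, 0.10, 0.40, 0.30, 0.10, 0.05])

# Stretch it over 9 actions and sample one action
a = np.arange(9)
p9 = resize_distribution(p, len(a))
action = rng.choice(a, 1, p=p9)
```

The same helper works in the other direction (e.g. resize_distribution(p9, 6)), although going down and back up will not generally recover the original values exactly.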