Mapping elements of a list to Intervals to create a cumulative probability distribution

I have a list of elements, for example L = [A, B, C]. Each element has an associated score for example S = [5, 1, 4].

I want to select an element from L according to its score in S, simply by generating a sort of cumulative probability distribution, where each element L[i] corresponds to an interval in (0,1] proportional to S[i]. Then, a random number drawn in (0,1] maps to the selected element.

For the example given before, we can have the scores of S represented as probabilities by normalizing on 5+1+4 so we get SS = [0.5, 0.1, 0.4], and we map the elements of L to intervals, such that:

B is mapped to (0, 0.1]
C is mapped to (0.1, 0.1+0.4]
A is mapped to (0.1+0.4, 0.5+0.5]

Now if I generate a random number r in (0,1] (e.g. r = random.random()), it will map to the corresponding element. For example if r = 0.03 we know that the element is B. And for instance, if r = 0.73 we know that the element is A ...

Is there a simple way in python to do this mapping between an element and an interval ?

I know that I can use numpy.cumsum to generate the cumulative sum of SS, but how to map elements to intervals obtained from this cumulative sum ?

Solution

Assuming I understand you correctly, I would recommend a dict:

def branchFinder(L, S, r):
    S, L = map(list, zip(*sorted(zip(S, L))))
    SS = [0.] + map(lambda x: x/10., S)

    probs = {}
    for i in range(len(L)):
        probs[L[i]] = (sum(SS[:i+1]), SS[i+1] + sum(SS[:i+1]))        

    for key, value in probs.iteritems():
        if value[0] < r <= value[1]:
             return key