python, python-3.x, probability

How to select items from a list based on probability


I have two lists, a and b:

a = [0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1]

b = ['apple', 'gun', 'pizza', 'sword', 'pasta', 'chicken', 'elephant']

Now I want to create a new list c of 3 items.

The 3 items are chosen from list b based on the probabilities in list a.

The items should not repeat in list c.

For example, the output I am looking for is

c = ['gun', 'sword', 'pizza']

or

c = ['apple', 'pizza', 'pasta']

Note: the values in list a sum to 1, and lists a and b have the same number of items. In practice I have a thousand items in each list and want to select a hundred of them based on the probabilities assigned to them. I am using Python 3.


Solution

  • Use random.choices. Note that it samples with replacement, so the result can contain the same item more than once:

    >>> import random
    >>> print(random.choices(
    ...     ['apple', 'gun', 'pizza', 'sword', 'pasta', 'chicken', 'elephant'], 
    ...     [0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1],
    ...     k=3
    ... ))
    ['gun', 'pasta', 'sword']
    

    Edit: To sample without replacement, remove each selected item (and its weight) from the population before the next draw:

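    import random

    def choices_no_replacement(population, weights, k=1):
        # Work on copies so the caller's lists are not modified.
        population = list(population)
        weights = list(weights)
        result = []
        for _ in range(k):
            # Draw one index according to the remaining weights.
            pos = random.choices(
                range(len(population)),
                weights,
                k=1
            )[0]
            result.append(population[pos])
            # Remove the chosen item and its weight so it cannot be drawn again.
            del population[pos], weights[pos]
        return result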
    

    Testing:

    >>> print(choices_no_replacement(
    ...     ['apple', 'gun', 'pizza', 'sword', 'pasta', 'chicken', 'elephant'],
    ...     [0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1],
    ...     k=3
    ... ))
    ['gun', 'pizza', 'sword']
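
For the larger case mentioned in the question (a hundred items out of a thousand), the loop above deletes from a list on every draw. One possible alternative, which is not part of the original answer, is the weighted-key method described by Efraimidis and Spirakis: give every item a random key u ** (1 / w) and keep the k items with the largest keys. As far as I know this gives the same distribution as drawing one item at a time without replacement, but treat the following as a minimal sketch rather than a definitive implementation. The function name `weighted_sample_no_replacement` and the `rng` parameter are illustrative assumptions.

```python
import heapq
import random


def weighted_sample_no_replacement(population, weights, k, rng=random):
    """Pick k distinct items; items with larger weights are more likely."""
    population = list(population)
    weights = list(weights)
    if len(population) != len(weights):
        raise ValueError("population and weights must have the same length")
    # Items with a non-positive weight can never be selected, so drop them.
    candidates = [(item, w) for item, w in zip(population, weights) if w > 0]
    if k > len(candidates):
        raise ValueError("k exceeds the number of items with positive weight")
    # One random key per item: u ** (1 / w), with u uniform in [0, 1).
    keyed = [
        (rng.random() ** (1.0 / w), index)
        for index, (_, w) in enumerate(candidates)
    ]
    # The k largest keys form the sample (hypothetical helper, stdlib only).
    top = heapq.nlargest(k, keyed)
    return [candidates[index][0] for _, index in top]


if __name__ == "__main__":
    b = ['apple', 'gun', 'pizza', 'sword', 'pasta', 'chicken', 'elephant']
    a = [0.1, 0.3, 0.1, 0.2, 0.1, 0.1, 0.1]
    # The result is random, so the printed list differs from run to run.
    print(weighted_sample_no_replacement(b, a, 3))
```

The weights do not need to sum to 1 here, since only their relative sizes affect the keys. Each call makes a single pass over the items plus a heap selection, and it leaves the input lists untouched.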