I have a project where I need to find an algorithm that can solve the following problem:
Having three list of items :
A = [1,2,3,4,5]
B = [1,2,3,4,5]
C = [1,2,3,4,5]
With python I can find all unique combinations via this line of code:
allCombinations = list(set(product(A,B,C)))
But now i need to get from all of those combinations, the combinations that follow a pretty linear repartition.
for example, there are 125 unique combinations, and now I want 50 combinations where A1 B1 C1 appear less than A2 B2 C2 ... (if it can be almost linear, it will be perfect)
I have no idea how to solve this kind of problem, how can I select the best combinations that correspond to my thinking.
I can do it handly with 125 combinations, but for more it's too difficult.
Thanks
#Edit
I'll remake the example here.
A=[1,2]
B=[1,2]
C=[1,2]
the combinations from this list are
(1,1,1) (1,2,1) (1,2,2) (1,1,2) (2,1,1) (2,1,2) (2,2,1) (2,2,2)
If i need to select 3 combinations, i will choose (2,2,2) (1,2,2) (2,2,1) because i want to make 1 for A,B,C list fewer than 2 from A,B,C.
The goal is to produce rarity, A,B,C represents three list of items. Make the first item of the three list more rare than the second.
And i want to do it for a lot of items.
I think your problem is a little under-specified, so you have a choice to make as to how exactly you want to weight your combinations.
One possibility is to choose random combinations, but with a weight of i*j*k
attributed to combination [A[i],B[j],C[k]]
. So for instance, combination [A2,B2,C2]
will be 8 times more likely to be chosen as combination [A1,B1,C1]
.
We can use random.sample
to sample with weights: https://docs.python.org/3/library/random.html#random.sample
Python 3.9:
import itertools # product
import random # sample
def sampleCombinations(A, B, C, k):
allCombinations = list(itertools.product(enumerate(A), enumerate(B), enumerate(C)))
weights = [(i+1) * (j+1) * (k+1) for (i,_), (j,_), (k,_) in allCombinations]
sampled = random.sample(allCombinations, k, counts=weights)
sampled_clean = [(x,y,z) for (_,x), (_,y), (_,z) in sampled]
return sampled_clean
print(sampleCombinations(['A1','A2','A3','A4','A5'], ['B1','B2','B3','B4','B5'], ['C1','C2','C3','C4','C5'], 50))
print(sampleCombinations([1, 2], [1, 2], [1, 2], 3))
Note the use of enumerate
to get the indices i,j,k
that are needed to compute the weights. Then we don't forget to remove the indices in sampled_clean
before returning the combinations. Also note the weights are computed as (i+1)*(j+1)*(k+1)
rather than i*j*k
, because everything is 0-indexed, not 1-indexed.
Note: the "counts" keyword argument of random.sample is new in python 3.9. Prior to version 3.9, it was necessary to manually duplicate elements in the population to simulate the weights.
Python < 3.9:
import itertools # product
import random # sample
def sampleCombinations(A, B, C, k):
allCombinations = list(itertools.product(enumerate(A), enumerate(B), enumerate(C)))
weights = [(i+1) * (j+1) * (k+1) for (i,_), (j,_), (k,_) in allCombinations]
weightedCombinations = [c for c,w in zip(allCombinations, weights) for _ in range(w)]
sampled = random.sample(weightedCombinations, k)
sampled_clean = [(x,y,z) for (_,x), (_,y), (_,z) in sampled]
return sampled_clean
print(sampleCombinations(['A1','A2','A3','A4','A5'], ['B1','B2','B3','B4','B5'], ['C1','C2','C3','C4','C5'], 50))
# [('A3', 'B4', 'C2'), ('A4', 'B4', 'C5'), ('A2', 'B5', 'C5'), ('A4', 'B4', 'C4'), ('A3', 'B1', 'C4'), ('A4', 'B3', 'C3'), ('A4', 'B4', 'C2'), ('A5', 'B3', 'C4'), ('A2', 'B5', 'C3'), ('A5', 'B2', 'C2'), ('A5', 'B4', 'C3'), ('A4', 'B3', 'C1'), ('A3', 'B2', 'C5'), ('A2', 'B5', 'C5'), ('A4', 'B5', 'C5'), ('A5', 'B5', 'C5'), ('A3', 'B4', 'C5'), ('A3', 'B4', 'C5'), ('A5', 'B4', 'C2'), ('A2', 'B3', 'C1'), ('A2', 'B5', 'C2'), ('A3', 'B4', 'C4'), ('A4', 'B5', 'C1'), ('A3', 'B2', 'C2'), ('A4', 'B3', 'C5'), ('A2', 'B3', 'C3'), ('A3', 'B4', 'C1'), ('A5', 'B5', 'C4'), ('A3', 'B5', 'C5'), ('A3', 'B2', 'C5'), ('A5', 'B5', 'C3'), ('A5', 'B5', 'C3'), ('A3', 'B4', 'C4'), ('A4', 'B1', 'C1'), ('A3', 'B3', 'C4'), ('A4', 'B2', 'C5'), ('A5', 'B5', 'C5'), ('A4', 'B4', 'C3'), ('A1', 'B5', 'C3'), ('A4', 'B5', 'C3'), ('A4', 'B4', 'C2'), ('A5', 'B2', 'C2'), ('A5', 'B2', 'C5'), ('A4', 'B3', 'C5'), ('A4', 'B5', 'C1'), ('A4', 'B3', 'C5'), ('A5', 'B5', 'C5'), ('A3', 'B5', 'C3'), ('A5', 'B4', 'C5'), ('A3', 'B1', 'C4')]
print(sampleCombinations([1, 2], [1, 2], [1, 2], 3))
# [(2, 2, 2), (2, 2, 2), (1, 1, 1)]