Search code examples
pythonalgorithmgeneratorcombinationsdynamically-generated

Combinations with constraints find a repartitions that goes from low to high


I have a project where I need to find an algorithm that can solve the following problem:

Having three list of items :

A = [1,2,3,4,5]
B = [1,2,3,4,5]
C = [1,2,3,4,5]

With python I can find all unique combinations via this line of code:

allCombinations = list(set(product(A,B,C)))

But now i need to get from all of those combinations, the combinations that follow a pretty linear repartition.

for example, there are 125 unique combinations, and now I want 50 combinations where A1 B1 C1 appear less than A2 B2 C2 ... (if it can be almost linear, it will be perfect)

I have no idea how to solve this kind of problem, how can I select the best combinations that correspond to my thinking.

I can do it handly with 125 combinations, but for more it's too difficult.

Thanks

#Edit

I'll remake the example here.

A=[1,2] 
B=[1,2] 
C=[1,2] 

the combinations from this list are

(1,1,1) (1,2,1) (1,2,2) (1,1,2) (2,1,1) (2,1,2) (2,2,1) (2,2,2)

If i need to select 3 combinations, i will choose (2,2,2) (1,2,2) (2,2,1) because i want to make 1 for A,B,C list fewer than 2 from A,B,C.

The goal is to produce rarity, A,B,C represents three list of items. Make the first item of the three list more rare than the second.

And i want to do it for a lot of items.


Solution

  • I think your problem is a little under-specified, so you have a choice to make as to how exactly you want to weight your combinations.

    One possibility is to choose random combinations, but with a weight of i*j*k attributed to combination [A[i],B[j],C[k]]. So for instance, combination [A2,B2,C2] will be 8 times more likely to be chosen as combination [A1,B1,C1].

    We can use random.sample to sample with weights: https://docs.python.org/3/library/random.html#random.sample

    Python 3.9:

    import itertools  # product
    import random     # sample
    
    def sampleCombinations(A, B, C, k):
      allCombinations = list(itertools.product(enumerate(A), enumerate(B), enumerate(C)))
      weights = [(i+1) * (j+1) * (k+1) for (i,_), (j,_), (k,_) in allCombinations]
      sampled = random.sample(allCombinations, k, counts=weights)
      sampled_clean = [(x,y,z) for (_,x), (_,y), (_,z) in sampled]
      return sampled_clean
    
    print(sampleCombinations(['A1','A2','A3','A4','A5'], ['B1','B2','B3','B4','B5'], ['C1','C2','C3','C4','C5'], 50))
    
    print(sampleCombinations([1, 2], [1, 2], [1, 2], 3))
    

    Note the use of enumerate to get the indices i,j,k that are needed to compute the weights. Then we don't forget to remove the indices in sampled_clean before returning the combinations. Also note the weights are computed as (i+1)*(j+1)*(k+1) rather than i*j*k, because everything is 0-indexed, not 1-indexed.

    Note: the "counts" keyword argument of random.sample is new in python 3.9. Prior to version 3.9, it was necessary to manually duplicate elements in the population to simulate the weights.

    Python < 3.9:

    import itertools  # product
    import random     # sample
    
    def sampleCombinations(A, B, C, k):
      allCombinations = list(itertools.product(enumerate(A), enumerate(B), enumerate(C)))
      weights = [(i+1) * (j+1) * (k+1) for (i,_), (j,_), (k,_) in allCombinations]
      weightedCombinations = [c for c,w in zip(allCombinations, weights) for _ in range(w)]
      sampled = random.sample(weightedCombinations, k)
      sampled_clean = [(x,y,z) for (_,x), (_,y), (_,z) in sampled]
      return sampled_clean
    
    print(sampleCombinations(['A1','A2','A3','A4','A5'], ['B1','B2','B3','B4','B5'], ['C1','C2','C3','C4','C5'], 50))
    # [('A3', 'B4', 'C2'), ('A4', 'B4', 'C5'), ('A2', 'B5', 'C5'), ('A4', 'B4', 'C4'), ('A3', 'B1', 'C4'), ('A4', 'B3', 'C3'), ('A4', 'B4', 'C2'), ('A5', 'B3', 'C4'), ('A2', 'B5', 'C3'), ('A5', 'B2', 'C2'), ('A5', 'B4', 'C3'), ('A4', 'B3', 'C1'), ('A3', 'B2', 'C5'), ('A2', 'B5', 'C5'), ('A4', 'B5', 'C5'), ('A5', 'B5', 'C5'), ('A3', 'B4', 'C5'), ('A3', 'B4', 'C5'), ('A5', 'B4', 'C2'), ('A2', 'B3', 'C1'), ('A2', 'B5', 'C2'), ('A3', 'B4', 'C4'), ('A4', 'B5', 'C1'), ('A3', 'B2', 'C2'), ('A4', 'B3', 'C5'), ('A2', 'B3', 'C3'), ('A3', 'B4', 'C1'), ('A5', 'B5', 'C4'), ('A3', 'B5', 'C5'), ('A3', 'B2', 'C5'), ('A5', 'B5', 'C3'), ('A5', 'B5', 'C3'), ('A3', 'B4', 'C4'), ('A4', 'B1', 'C1'), ('A3', 'B3', 'C4'), ('A4', 'B2', 'C5'), ('A5', 'B5', 'C5'), ('A4', 'B4', 'C3'), ('A1', 'B5', 'C3'), ('A4', 'B5', 'C3'), ('A4', 'B4', 'C2'), ('A5', 'B2', 'C2'), ('A5', 'B2', 'C5'), ('A4', 'B3', 'C5'), ('A4', 'B5', 'C1'), ('A4', 'B3', 'C5'), ('A5', 'B5', 'C5'), ('A3', 'B5', 'C3'), ('A5', 'B4', 'C5'), ('A3', 'B1', 'C4')]
    
    print(sampleCombinations([1, 2], [1, 2], [1, 2], 3))
    # [(2, 2, 2), (2, 2, 2), (1, 1, 1)]