python, pandas, combinations, analytics, analysis

Lottery analysis for learning


I'm trying to learn how to use the pandas library.

As the data source, I use the lottery combinations drawn so far.

One of the many tasks I'm trying to solve is counting how often pairs of numbers appear together in the drawn combinations.

I create a data frame from the list like this:

import pandas as pd

# One list of numbers per draw (middle rows elided)
draws = [
    [13, 14, 28, 30, 31, 37, 39],
    [7, 10, 12, 16, 21, 22, 33],
    # ...
    [1, 2, 7, 15, 25, 31, 33],
    [3, 6, 18, 21, 31, 34, 39]
]

df = pd.DataFrame(draws)
print(df.head())

Output:

    0   1   2   3   4   5   6
0   9  11  12  18  20  26  35
1  10  13  15  20  21  25  35
2   1   8  17  21  22  27  34
3  10  13  17  18  21  29  37
4   5   8  12  17  19  21  37

For example, as a result I want to get the number of times each pair or triple of numbers appears together across all combinations:

Pair  : Found n times in all combinations
9,23  : 33
11,32 : 26

Can you give me some directions, or an example of how to solve this task, please?


Solution

  • Here is a simple solution using just modules from the standard library:

    from itertools import combinations
    from collections import Counter
    
    draws = [
        [13, 14, 28, 30, 31, 37, 39],
        [7, 10, 12, 16, 21, 22, 33],
        [1, 2, 7, 15, 25, 31, 33],
        [3, 6, 18, 21, 31, 34, 39]
    ]
    
    duos = Counter()
    trios = Counter()
    
    for draw in draws:
        duos.update(combinations(draw, 2))
        trios.update(combinations(draw, 3))
    
    print('Top 5 duos')
    for x in duos.most_common(5):
        print(f'{x[0]}: {x[1]}')
    
    print()
    
    print('Top 5 trios')
    for x in trios.most_common(5):
        print(f'{x[0]}: {x[1]}')
    

    The code snippet above will result in the following output:

    Top 5 duos
    (31, 39): 2
    (7, 33): 2
    (13, 14): 1
    (13, 28): 1
    (13, 30): 1
    
    Top 5 trios
    (13, 14, 28): 1
    (13, 14, 30): 1
    (13, 14, 31): 1
    (13, 14, 37): 1
    (13, 14, 39): 1
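
One caveat with the snippet above: `combinations` emits tuples in the order the numbers appear in each draw, so `(9, 23)` and `(23, 9)` would be counted as different pairs if a draw were not sorted. The draws in this example are already in ascending order, so it makes no difference here. A minimal sketch of the issue, assuming some draws might arrive unsorted (the `unsorted_draws` data is made up purely for illustration):

```python
from itertools import combinations
from collections import Counter

# Hypothetical data: the same two numbers appear in a different order.
unsorted_draws = [
    [23, 9, 31],
    [9, 23, 40],
]

naive = Counter()
normalized = Counter()

for draw in unsorted_draws:
    # Without sorting, (23, 9) and (9, 23) end up as separate keys.
    naive.update(combinations(draw, 2))
    # Sorting first gives every pair one canonical (ascending) form.
    normalized.update(combinations(sorted(draw), 2))

print(naive[(9, 23)], naive[(23, 9)])  # each order counted separately
print(normalized[(9, 23)])             # both draws counted under one key
```

If the real data is not guaranteed to be sorted, wrapping each draw in `sorted()` before calling `combinations` is probably the simplest safeguard.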
    
    

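    And here is a slightly more elegant version that loops over the combination sizes (2, 3, and 4) instead of keeping a separate counter variable for each: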

    from itertools import combinations
    from collections import Counter
    
    draws = [
        [13, 14, 28, 30, 31, 37, 39],
        [7, 10, 12, 16, 21, 22, 33],
        [1, 2, 7, 15, 25, 31, 33],
        [3, 6, 18, 21, 31, 34, 39]
    ]
    
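    # One counter each for combos of 2, 3, and 4 numbers
    counters = [Counter() for _ in range(3)]
    
    # enumerate(..., 2) pairs each counter with its combo size n = 2, 3, 4
    for n, counter in enumerate(counters, 2):
        for draw in draws:
            counter.update(combinations(draw, n))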
    
        print(f'Top 10 combos of {n} numbers')
    
        for combo, count in counter.most_common(10):
            print(' '.join(f'{number:2d}' for number in combo), count, sep=': ')
    
        print()
    

    Which will give us the following output:

    Top 10 combos of 2 numbers
    31 39: 2
     7 33: 2
    13 14: 1
    13 28: 1
    13 30: 1
    13 31: 1
    13 37: 1
    13 39: 1
    14 28: 1
    14 30: 1
    
    Top 10 combos of 3 numbers
    13 14 28: 1
    13 14 30: 1
    13 14 31: 1
    13 14 37: 1
    13 14 39: 1
    13 28 30: 1
    13 28 31: 1
    13 28 37: 1
    13 28 39: 1
    13 30 31: 1
    
    Top 10 combos of 4 numbers
    13 14 28 30: 1
    13 14 28 31: 1
    13 14 28 37: 1
    13 14 28 39: 1
    13 14 30 31: 1
    13 14 30 37: 1
    13 14 30 39: 1
    13 14 31 37: 1
    13 14 31 39: 1
    13 14 37 39: 1
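
Since the question starts from a pandas DataFrame, the same counting can also be fed from `df` and the result put back into a DataFrame. This is only a rough sketch, assuming one draw per row and no missing values, as in the question; the helper name `count_combos` and the column names `combo` and `count` are made up for illustration and are not part of pandas:

```python
from collections import Counter
from itertools import combinations

import pandas as pd

draws = [
    [13, 14, 28, 30, 31, 37, 39],
    [7, 10, 12, 16, 21, 22, 33],
    [1, 2, 7, 15, 25, 31, 33],
    [3, 6, 18, 21, 31, 34, 39],
]
df = pd.DataFrame(draws)


def count_combos(frame, n):
    """Count how often each n-number combination occurs across the rows."""
    counter = Counter()
    # to_numpy().tolist() yields one plain list of Python ints per row.
    for row in frame.to_numpy().tolist():
        # Sort so each combination has a single canonical form.
        counter.update(combinations(sorted(row), n))
    # most_common() returns (combo, count) pairs, most frequent first.
    return pd.DataFrame(counter.most_common(), columns=["combo", "count"])


pairs = count_combos(df, 2)
print(pairs.head())
```

Calling `count_combos(df, 3)` would give the same kind of table for triples. The counting itself still happens in `Counter`; pandas is used here only to hold the input and present the result.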