python memory memory-management bigdata kaggle

How to reduce memory usage in Kaggle for Python code


import itertools

deck = ['AD', '2D', '3D', '4D', '5D', '6D', '7D', '8D', '9D', '10D', 'JD', 'QD', 'KD', 
        'AC', '2C', '3C', '4C', '5C', '6C', '7C', '8C', '9C', '10C', 'JC', 'QC', 'KC',    
        'AH', '2H', '3H', '4H', '5H', '6H', '7H', '8H', '9H', '10H', 'JH', 'QH', 'KH',      
        'AS', '2S', '3S', '4S', '5S', '6S', '7S', '8S', '9S', '10S', 'JS', 'QS', 'KS']

combinations = list(itertools.combinations(deck, 9))

I am trying to generate all of these combinations and then write them to a CSV file, but Kaggle gives me this error message:

Your notebook tried to allocate more memory than is available. It has restarted.


Solution

  • Don't build a list; write each combination to the file as it is generated.

    itertools.combinations creates an iterator, so you can loop over the combinations one at a time without building a list that holds all of them in memory.

    You can use the csv module to write each combination as a row. If you don't want an empty line between rows, remember to pass newline='' to open: https://stackoverflow.com/a/3348664/6251742.

    import csv
    import itertools
    
    deck = ['AD', '2D', '3D', '4D', '5D', '6D', '7D', '8D', '9D', '10D', 'JD', 'QD', 'KD',
            'AC', '2C', '3C', '4C', '5C', '6C', '7C', '8C', '9C', '10C', 'JC', 'QC', 'KC',
            'AH', '2H', '3H', '4H', '5H', '6H', '7H', '8H', '9H', '10H', 'JH', 'QH', 'KH',
            'AS', '2S', '3S', '4S', '5S', '6S', '7S', '8S', '9S', '10S', 'JS', 'QS', 'KS']
    
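    # combinations() returns a lazy iterator; nothing is stored up front.
    combinations = itertools.combinations(deck, 9)
    
    # newline='' stops the csv module's line endings from being translated again.
    with open('combinations.csv', 'w', newline='') as file:
        writer = csv.writer(file, delimiter=',')
        # Write one row per combination as it is generated.
        for combination in combinations:
            writer.writerow(combination)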
    

    Result after some time:

    AD,2D,3D,4D,5D,6D,7D,8D,9D
    AD,2D,3D,4D,5D,6D,7D,8D,10D
    AD,2D,3D,4D,5D,6D,7D,8D,JD
    AD,2D,3D,4D,5D,6D,7D,8D,QD
    AD,2D,3D,4D,5D,6D,7D,8D,KD
    ...  # 3679075395 more lines, 98.3 GB
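
Before starting a run this long, it can help to estimate the output size up front. The sketch below is not part of the original answer. It assumes the card labels from the deck above (48 two-character labels and four three-character "10" labels) and csv.writer's default "\r\n" line terminator. The variable names are chosen for illustration only.

```python
import math

deck_size = 52
hand_size = 9

# Number of 9-card combinations that can be drawn from a 52-card deck.
total_rows = math.comb(deck_size, hand_size)

# Average label length: 48 cards use 2 characters, the four 10s use 3.
# Every card appears in the same number of combinations, so this average
# applies to the whole file.
avg_label_len = (48 * 2 + 4 * 3) / deck_size

# Per row: 9 labels, 8 commas, and the 2-byte "\r\n" terminator
# (assumed: the csv.writer default, with ASCII-only content).
avg_row_bytes = hand_size * avg_label_len + (hand_size - 1) + 2

estimated_bytes = total_rows * avg_row_bytes

print(f"{total_rows:,} rows")
print(f"about {estimated_bytes / 2**30:.1f} GiB")
```

Under these assumptions the estimate agrees with the row count and file size quoted above, so check that the available disk space is large enough before running the full loop.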