python, optimization, time-complexity, data-science, chess

What's the most efficient way to get frequency of occurrence of strings in a LARGE list?


I am analyzing chess games using Python. Currently I have a list of strings, containing ~400,000 elements. Each element is one of 64 possible strings. This is because each element denotes a square on the chess board, of which there are 64 ('a1', 'a2', ... , 'h7', 'h8').

What is the most efficient way of finding how many times each of the 64 elements occurs in the entire list? I know sorting the list would make such a task quicker, but since I am dealing with strings and not integers, I am not sure I can sort them. I do not mind using external modules, but I am looking for the most primitive and Pythonic way here.

Any help is greatly appreciated!


Solution

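  • Import collections.Counter from the standard library and pass the list to it. Counter tallies the elements in a single pass over the list, so no sorting is needed, and it works with strings because they are hashable.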

    >>> from collections import Counter
    >>> li = ['a1', 'b1', 'a1', 'c3', 'a1', 'b1', 'd5']
    >>> Counter(li)
    Counter({'a1': 3, 'b1': 2, 'c3': 1, 'd5': 1})
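
Since the question asks for a count for each of the 64 squares, here is a minimal sketch that also reports squares that never appear (with a count of 0), together with a plain-dict version for comparison. The names ALL_SQUARES, count_squares and count_squares_plain are hypothetical helpers made up for this illustration, and the sketch assumes every element of the list is a valid square name.

```python
from collections import Counter
from itertools import product

# All 64 square names in the order 'a1', 'a2', ..., 'h7', 'h8'.
ALL_SQUARES = [f + r for f, r in product("abcdefgh", "12345678")]


def count_squares(squares):
    """Map every board square to its number of occurrences (0 if absent)."""
    counts = Counter(squares)  # one pass over the list
    # Indexing a Counter with a missing key returns 0 instead of raising.
    return {sq: counts[sq] for sq in ALL_SQUARES}


def count_squares_plain(squares):
    """Same result using only a plain dict and a loop.

    Assumes every element is a valid square; anything else raises KeyError.
    """
    counts = dict.fromkeys(ALL_SQUARES, 0)
    for sq in squares:
        counts[sq] += 1
    return counts


li = ['a1', 'b1', 'a1', 'c3', 'a1', 'b1', 'd5']
result = count_squares(li)
print(result['a1'], result['h8'])
print(Counter(li).most_common(2))  # the two most frequent squares
```

Both helpers do one pass over the input plus a fixed 64-entry dictionary, so the ~400,000-element list is only read once. Counter.most_common(n) is handy when only the busiest squares matter.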