Search code examples
pythonalgorithmperformancedictionaryhashtable

Why items method for dictionary is so much faster than a straightforward iteration?


I'm practicing on LeetCode and came across this problem. I'm just learning algorithms and data structures in python, so don't know much about how some built-in methods are implemented and so my initial attempt was (python code)

return [num for num in Counter(nums) if Counter(nums)[num] == 1]

but it would take the entire 12.82 seconds to run on this testcase. On the other hand, there is a very similar one liner in discussions

return [c[0] for c in Counter(nums).items() if c[1] == 1]

which utilizes the same idea but runs much faster, the aforementioned testcase took only 0.0022 seconds. Why is there such a difference in runtime?


Solution

  • The issue with this code:

    return [num for num in Counter(nums) if Counter(nums)[num] == 1]
    

    is that for each num in Counter(nums), you have to create a new Counter(nums) to determine if the if condition is true or false. By using Counter(nums).items() in the second version:

    return [c[0] for c in Counter(nums).items() if c[1] == 1]
    

    You have access to the number and its count already so there is no need to re-compute the count for each number.

    Note I would write the comprehension as

    return [num for num, count in Counter(nums).items() if count == 1]
    

    to make it a bit more obvious how it works.