Search code examples
pythonlisthashsumcompare

Can sum() be used like a hash


I want to check whether the sum of two list is same or not.

lst1 = [1, 2, 3, 4]
lst2 = [0, 3, 3, 4]
if sum(lst1) == sum(lst2):
    return true

Here the sum returns true. If I hash the list I get different values but hashing is computationally expensive. I only want to check whether the elements in the two lists are same (only with the exception of 1 element). I am using Binary search technique of dividing the list and then checking the hash. if hashes are different I am checking if its more than once. But as I said hashing is computationally expensive. Also the order does matter here. Thank you


Solution

  • Can sum be used like a hash?

    Yes, the sum() function can be used as a hash for a list or tuple of integers.

    Is hashing is computationally expensive?

    Interestingly, hash() performs better than sum() for a tuple of integers:

    $ python2.7 -m timeit -s "data = (1, 2, 3, 4)" "sum(data)"
    10000000 loops, best of 3: 0.113 usec per loop
    
    $ python2.7 -m timeit -s "data = (1, 2, 3, 4)" "hash(data)"
    10000000 loops, best of 3: 0.0569 usec per loop
    

    How to compare unordered lists

    The stated problem is "how to compare elements in the two lists are same (only with the exception of 1 element)"

    A built-in tool for doing this is collections.Counter(). Internally, it uses fast hashing to accumulate counts. In Python 3, it is accelerated by code written in C. It uses a single O(n) pass rather than an O(log n) binary search.

    >>> from collections import Counter
    >>> list1 = [1, 2, 3, 4]
    >>> list2 = [0, 3, 3, 4]
    
    >>> c1 = Counter(list1)
    >>> c2 = Counter(list2)
    >>> c1 == c2
    False
    >>> c1 - c2
    Counter({1: 1, 2: 1})
    >>> c2 - c1
    Counter({0: 1, 3: 1})
    

    The difference in the various approaches will become more pronounced as the list size grows larger.