python list tuples similarity

Check similarity between tuple elements in Python


I have the following list:

list= [(12.947999999999979,5804),(100000.0,1516),(12.948000000000008,844),(12.948000000000036,172),(18.252000000000066,92)]

The first element of each tuple is a value, and the second element is the number of times that value appears in a document. How can I cluster the similar elements of the list (such as the 1st and the 3rd elements) and combine their frequencies?


Solution

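  • Use a Counter and round() each value to the number of decimal places you want, so that near-equal values share one key. Also, avoid naming a variable list: it is not a reserved word, but the name shadows the built-in list type: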

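    from collections import Counter
    
    l = [(12.947999999999979, 5804), (100000.0, 1516), (12.948000000000008, 844), (12.948000000000036, 172), (18.252000000000066, 92)]
    precision = 3  # number of decimal places to keep when grouping
    
    c = Counter()
    
    # Add each frequency to the bucket of its rounded value.
    for value, times in l:
        c[round(value, precision)] += times
    
    # c == Counter({12.948: 6820, 100000.0: 1516, 18.252: 92})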
    

    If the data is already in a Counter, you can merge its keys the same way:

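    from collections import Counter
    
    # Stand-in for the Counter that already holds your data; use your own here.
    data = Counter({12.947999999999979: 5804, 12.948000000000008: 844})
    precision = 3
    
    joined = Counter()
    
    # Merge the counts of keys that round to the same value.
    for value, times in data.items():
        joined[round(value, precision)] += times
    
    data = joined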
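
One caveat with rounding: two values that sit on either side of a rounding boundary (for example 0.0004999 and 0.0005001 at 3 decimals) end up in different buckets even though they are very close. If that matters for your data, a possible alternative is to sort the pairs and merge neighbours that lie within a tolerance. The sketch below is only an illustration of that idea, not part of the original answer: the function name merge_close_values and the default tolerance of 1e-6 are assumptions, and each cluster is simply labelled with its smallest value.

```python
def merge_close_values(pairs, tolerance=1e-6):
    """Merge (value, frequency) pairs whose values are within `tolerance`
    of the smallest value in the current cluster (hypothetical helper)."""
    merged = []  # each item is [cluster_start_value, total_frequency]
    for value, freq in sorted(pairs):
        # Values are sorted, so value >= merged[-1][0]; no abs() needed.
        if merged and value - merged[-1][0] <= tolerance:
            merged[-1][1] += freq          # same cluster: add the frequency
        else:
            merged.append([value, freq])   # gap too large: start a new cluster
    return [tuple(item) for item in merged]


pairs = [(12.947999999999979, 5804), (100000.0, 1516),
         (12.948000000000008, 844), (12.948000000000036, 172),
         (18.252000000000066, 92)]

result = merge_close_values(pairs)
print(result)
```

For the list in the question this gives three clusters, with the three values near 12.948 combined into a frequency of 6820. Comparing against the first value of the cluster (rather than the previous value) keeps each cluster no wider than the tolerance, so long chains of slightly different values cannot drift together.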