Tags: python, list, dictionary, counter

Check if dictionary items occur "similarly"


I am trying to implement a function that checks whether a Counter contains a "similar" percentage of each item. For example:

from collections import Counter

c = Counter(["Dog", "Cat", "Dog", "Horse", "Dog"])
size = 5
lst = list(c.values())
percentages = [x / size * 100 for x in lst]  # [60.0, 20.0, 20.0]

How can I check whether those percentages are all "similar"? I would like to apply the math.isclose function with abs_tol=2, but it compares two values, not an entire list.

In the example above, the items do not occur similarly.

This method will be used for checking whether a training set of labels is balanced or not.


Solution

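  • One way is to take the minimum and maximum of the percentages list and pass those two values to isclose(). If the two extremes are within the tolerance of each other, every other pair of percentages is as well.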

    from math import isclose
    from collections import Counter
    
    
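    def is_balanced(lst, abs_tol):
        c = Counter(lst)
        total = c.total()  # Counter.total() requires Python 3.10+
        percentages = [(v / total) * 100 for v in c.values()]
        # Note: an empty lst leaves percentages empty, so min() raises ValueError
        return isclose(min(percentages), max(percentages), abs_tol=abs_tol)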
    
    
    lst1 = ["Dog", "Cat", "Dog", "Horse", "Dog"]
    lst2 = ["Dog", "Cat", "Horse"]
    
    print(is_balanced(lst1, 2))  # False
    print(is_balanced(lst2, 2))  # True
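
As a cross-check on the min/max idea, here is a minimal sketch that compares it with a brute-force test of every pair of percentages. The helper names label_percentages, is_balanced_pairwise and is_balanced_spread are hypothetical, not part of the answer above. The sketch also assumes that an empty list of labels should count as balanced, and it uses sum(c.values()) so that it runs on Python versions older than 3.10. For tolerances well above floating-point error (such as abs_tol=2), both checks should agree.

```python
from collections import Counter
from itertools import combinations
from math import isclose


def label_percentages(labels):
    """Return each distinct label's share of the list, in percent."""
    counts = Counter(labels)
    total = sum(counts.values())  # also works before Python 3.10
    return {label: n / total * 100 for label, n in counts.items()}


def is_balanced_pairwise(labels, abs_tol):
    """Hypothetical helper: True if every pair of shares is within abs_tol."""
    shares = label_percentages(labels).values()
    return all(isclose(a, b, abs_tol=abs_tol) for a, b in combinations(shares, 2))


def is_balanced_spread(labels, abs_tol):
    """Hypothetical helper: True if max - min of the shares is within abs_tol."""
    shares = list(label_percentages(labels).values())
    if not shares:  # assumption: an empty label list counts as balanced
        return True
    return max(shares) - min(shares) <= abs_tol


lst1 = ["Dog", "Cat", "Dog", "Horse", "Dog"]  # shares: 60%, 20%, 20%
lst2 = ["Dog", "Cat", "Horse"]                # shares: one third each

print(is_balanced_pairwise(lst1, 2), is_balanced_spread(lst1, 2))  # False False
print(is_balanced_pairwise(lst2, 2), is_balanced_spread(lst2, 2))  # True True
```

The pairwise version does O(k²) comparisons for k distinct labels, while the spread version needs only one pass for the minimum and one for the maximum, which is why comparing just the extremes is usually preferable.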