How to calculate the frequency of items sold per client?


I am trying to calculate the frequency of items sold for each client in my dataset. I don't want the frequency relative to the length of the whole dataset, but relative to the total number of items purchased by each client.

My dataframe would look like this:

data = {'ClientId': ['1','2','3','4','2','2','1','4'],
        'QuantitySold': ['5','10','6','7','5','10','8','7']
       }

Expected output:

ClientId      QuantitySold     FrequencySold
1             5                0.385
2             10               0.4
3             6                1
4             7                0.5
2             5                0.2
2             10               0.4
1             8                0.615
4             7                0.5

Calculation explained: for the first row (client 1), 5/(5+8) = 0.385, i.e. the row's quantity divided by the total quantity sold to that client.

How can I do that using Python?


Solution

  • First, build a dictionary with the total quantity for each client, then divide each row's quantity by its client's total:

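    import collections

    # Sum the quantities per client; the values are strings, so convert to int.
    totals = collections.defaultdict(int)
    for c, q in zip(data["ClientId"], data["QuantitySold"]):
        totals[c] += int(q)
    # totals now holds {'1': 13, '2': 25, '3': 6, '4': 14}

    # Divide each row's quantity by the total for that row's client.
    for c, q in zip(data["ClientId"], data["QuantitySold"]):
        print(c, q, int(q)/totals[c])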
    

    Output:

    1 5 0.38461538461538464
    2 10 0.4
    3 6 1.0
    4 7 0.5
    2 5 0.2
    2 10 0.4
    1 8 0.6153846153846154
    4 7 0.5
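
Since the question mentions a dataframe, the same idea can also be sketched with pandas. This is only one possible approach, and it assumes pandas is installed and that the data is loaded from the `data` dict shown in the question; the names `client_totals` and `FrequencySold` are chosen here for illustration. `groupby(...).transform('sum')` returns the per-client total aligned to every original row, so the division works row by row:

```python
import pandas as pd

# Data as given in the question (quantities are stored as strings).
data = {'ClientId': ['1', '2', '3', '4', '2', '2', '1', '4'],
        'QuantitySold': ['5', '10', '6', '7', '5', '10', '8', '7']}

df = pd.DataFrame(data)

# Convert the string quantities to integers so they can be summed.
df['QuantitySold'] = df['QuantitySold'].astype(int)

# Per-client total, broadcast back onto each row of that client.
client_totals = df.groupby('ClientId')['QuantitySold'].transform('sum')

# Share of the client's total for each row, rounded to 3 decimals
# to match the precision used in the expected output.
df['FrequencySold'] = (df['QuantitySold'] / client_totals).round(3)

print(df)
```

The row order of the original data is preserved, so the resulting `FrequencySold` column lines up with the expected output in the question. Drop the `.round(3)` call if full precision is preferred.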