I have a data set with around 60,000 rows. Each row is a purchase order, and there is no unique ID. Sample data below.
36 40 41 42 43 45 46
38 39 48 50 51 57
41 59 62
63 66 67 68
74 75 76 77
In the above list each number is an item purchased. I need the following:
This should do it:
from collections import Counter

items = Counter()
with open('data_file.txt', 'r') as f:
    for line in f:
        items.update(line.split())

print("Total Unique Items: {0}".format(len(items)))
for item, count in items.most_common(5):
    print("Item {0} was purchased {1} times".format(item, count))
Yes, it's that short :).
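If you want to try it without a file first, here is a rough sketch of the same Counter approach wrapped in a function and run on the five sample rows from the question. The function name `summarize_purchases` and the `top_n` parameter are made up for illustration; the logic is the same as above.

```python
from collections import Counter


def summarize_purchases(lines, top_n=5):
    """Return (number of distinct items, the top_n most common (item, count) pairs).

    `lines` can be any iterable of strings, e.g. an open file or a list.
    """
    items = Counter()
    for line in lines:
        # split() with no arguments splits on any whitespace and drops the newline
        items.update(line.split())
    return len(items), items.most_common(top_n)


# The sample rows from the question, one purchase order per string
sample = [
    "36 40 41 42 43 45 46",
    "38 39 48 50 51 57",
    "41 59 62",
    "63 66 67 68",
    "74 75 76 77",
]

unique_count, top_items = summarize_purchases(sample)
print("Total Unique Items: {0}".format(unique_count))
for item, count in top_items:
    print("Item {0} was purchased {1} times".format(item, count))
```

On the sample this reports 23 unique items, with item 41 the only one purchased twice. Note that the items are counted as strings, since nothing here converts them to integers. For the real data you can pass the open file object straight in, e.g. `summarize_purchases(f)`.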