Leanest way to compute term frequency on a bag ADT without using the Counter class

I have some code that works well for computing term frequency on a given list using the Counter class from collections.

from collections import Counter

terms=['the', 'fox', 'the', 'quick', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']

tf = Counter(terms)

print(tf)

The existing code works well, but I am wondering what the leanest way would be to achieve the same result strictly using a bag/multiset ADT, without the help of Python's Counter class.

I have spent several days experimenting with code and looking on other forums without much success.

Solution

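You can use a single dictionary comprehension. Note that terms.count(term) rescans the whole list for every element, so this is fine for short lists but takes quadratic time on long ones:

terms = ['the', 'fox', 'the', 'quick', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
new_terms = {term: terms.count(term) for term in terms}
print(new_terms)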

Output (key order may differ depending on the Python version):

{'lazy': 1, 'over': 1, 'fox': 2, 'dog': 1, 'quick': 1, 'the': 3, 'jumps': 1}
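For longer lists, a single pass that avoids the repeated terms.count calls can be sketched as follows. The function name term_frequency is only illustrative and is not part of the original answer:

```python
def term_frequency(terms):
    """Count occurrences of each term in one pass using a plain dict."""
    counts = {}
    for term in terms:
        # dict.get returns 0 the first time a term is seen
        counts[term] = counts.get(term, 0) + 1
    return counts


terms = ['the', 'fox', 'the', 'quick', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
tf = term_frequency(terms)
print(tf)
```

Each element is visited once, so the work grows linearly with the length of the list.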

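Using the third-party multiset package, with terms as defined above. This groups the sorted terms and builds one Multiset per distinct term:

import itertools
import multiset
final_data = [multiset.Multiset(list(group)) for _, group in itertools.groupby(sorted(terms))]
print(final_data)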

Output:

[Multiset({'dog': 1}), Multiset({'fox': 2}), Multiset({'jumps': 1}), Multiset({'lazy': 1}), Multiset({'over': 1}), Multiset({'quick': 1}), Multiset({'the': 3})]
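If the goal is a bag ADT with no imports at all, a small hand-rolled class may be closer to what the question asks for. The sketch below is one possible design, not a standard API: the class name Bag and its methods add, count and as_dict are assumptions made for illustration.

```python
class Bag:
    """Minimal bag (multiset) ADT backed by a dict mapping element -> count."""

    def __init__(self, items=()):
        self._counts = {}
        self._size = 0
        for item in items:
            self.add(item)

    def add(self, item):
        """Insert one occurrence of item."""
        self._counts[item] = self._counts.get(item, 0) + 1
        self._size += 1

    def count(self, item):
        """Return how many times item occurs (0 if absent)."""
        return self._counts.get(item, 0)

    def __len__(self):
        """Total number of elements, counting duplicates."""
        return self._size

    def as_dict(self):
        """Return a copy of the element -> count mapping."""
        return dict(self._counts)


terms = ['the', 'fox', 'the', 'quick', 'fox', 'jumps', 'over', 'the', 'lazy', 'dog']
bag = Bag(terms)
print(bag.as_dict())
print(bag.count('the'))
```

Here bag.count(term) gives the term frequency directly, and as_dict() returns the full mapping in the same shape as the dictionary comprehension above.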