python pandas numpy counter countvectorizer

Counting word frequencies across a series and mapping them back to each row

I'm trying to use a modified version of CountVectorizer: I fit it on a Series and then, for each cell, I want the sum of the overall counts of the words in that cell. For example, this is the Series I'm fitting the CountVectorizer on:

["dog cat mouse", " cat mouse", "mouse mouse cat"]

The end result should look something like this (across the whole Series, dog occurs 1 time, cat 3 times and mouse 4 times):

[1+3+4, 3+4, 4+4+3]

I've tried using Counter, but it doesn't really work in this case. So far I've only managed to get a sparse matrix, but that gives me the total number of elements in each cell. What I want instead is to map the overall counts back onto the entire Series.

Solution

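Count every word across the whole list first, then replace each word in each string with its total count. To keep the individual terms visible (as in 1+3+4), each result has to be stored as a string; that string can be evaluated to a number later with eval().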

Code:

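lst = ["dog cat mouse", " cat mouse", "mouse mouse cat"]
res = {}   # word -> total count across all strings
res2 = []  # one 'a+b+c' string per input string

# First pass: count every word. split() with no argument ignores
# leading/repeated spaces, so " cat mouse" does not yield an empty word.
for i in lst:
    for j in i.split():
        if j not in res:
            res[j] = 1
        else:
            res[j] += 1

# Second pass: replace each word by its total count and join with '+'.
for i in lst:
    res2.append('+'.join([str(res[j]) for j in i.split()]))

print(res2)

The result (res2) is ['1+3+4', '3+4', '4+4+3'].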
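
If you need the numeric totals rather than the '1+3+4' strings, the counts can be summed directly, which avoids eval() altogether. A minimal sketch using collections.Counter from the standard library; the helper name count_sums is only for illustration and is not part of any library:

```python
from collections import Counter


def count_sums(texts):
    """For each string, sum the overall counts of the words it contains."""
    # Count every whitespace-separated word across all strings.
    counts = Counter(word for text in texts for word in text.split())
    # Replace each word by its total count and add the counts up per string.
    return [sum(counts[word] for word in text.split()) for text in texts]


lst = ["dog cat mouse", " cat mouse", "mouse mouse cat"]
print(count_sums(lst))  # [8, 7, 11]
```

Assuming your data is a pandas Series s, something like pd.Series(count_sums(s), index=s.index) should give the totals back aligned with the original rows.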

I think this is what you want...