I feel like I have a dumb question, but here goes anyway. I'm trying to go from data that looks something like this:
a word form lemma POS count of occurrence
same word form lemma Not the same POS another count
same word form lemma Yet another POS another count
to a result that looks like this:
the word form total count all possible POS and their individual counts
So for example I could have:
ring total count = 100 noun = 40, verb = 60
I have my data in a CSV file. I want to do something like this:
for row in all_rows:
    if row[0] is the same as row[0] in the next row, add the values from row[3] together to get the total count
buuut I can't seem to figure out how to do that. Help?
If I understood correctly, the simplest way to achieve what you need would be:
# Mocked CSV data
data = [
    ['a', 'lemma', 'pos', 1],
    ['a', 'lemma', 'pos1', 2],
    ['a', 'lemma', 'pos2', 3],
    ['b', 'lemma', 'pos', 5],
]

result = {}
for row in data:
    key = row[0]    # word form
    count = row[3]  # count of occurrence
    if key in result:
        result[key] += count
    else:
        result[key] = count

print(result)
Result:
{
'a': 6,
'b': 5
}
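The question also asks for the individual count of each POS, not only the total. One possible way to get both is a dictionary of Counters keyed by word form. This is only a sketch: the function name summarize, the sample rows, and the column order (word form, lemma, POS, count, with no header row) are assumptions based on the description in the question, and the in-memory string stands in for the real CSV file.

```python
import csv
import io
from collections import Counter, defaultdict


def summarize(rows):
    """Map each word form to a Counter of {POS: count}."""
    per_pos = defaultdict(Counter)
    for form, _lemma, pos, count in rows:
        # csv.reader yields strings, so convert the count before adding
        per_pos[form][pos] += int(count)
    return per_pos


# Stand-in for the CSV file (assumed layout: form, lemma, POS, count)
sample = io.StringIO("ring,ring,noun,40\nring,ring,verb,60\nbell,bell,noun,5\n")
summary = summarize(csv.reader(sample))

for form, pos_counts in summary.items():
    total = sum(pos_counts.values())
    breakdown = ", ".join(f"{pos} = {n}" for pos, n in pos_counts.items())
    print(f"{form} total count = {total} {breakdown}")
```

With a real file you would pass csv.reader(f) from open(your_path, newline="") instead of the StringIO object, and call next(reader) first if the file has a header row. Unlike comparing each row with the next one, this does not require rows for the same word form to be adjacent.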