I have a CSV file containing survey data, and I want to perform a sentiment analysis on it.
So far this is what I have tried (thanks to Rupin from a previous question!):
import csv
from collections import Counter

with open('myfile.csv', 'r') as f:
    reader = csv.reader(f, delimiter='\t')
    alist = []
    iterreader = iter(reader)
    next(iterreader, None)
    for row in iterreader:
        clean_rows = row[0].replace(",", " ").rsplit()
        alist.append(clean_rows)
        word_count = Counter(clean_rows)
        mostWcommon = word_count.most_common(3)
        print(mostWcommon)
The output is nearly right. The only problem is that Python counts the words of each row separately, so I get a separate list of the three most common words for every row rather than one result for the whole file.
I want to count everything together so that I get the real word frequencies. Any suggestions?
Thanks!
You are creating a new Counter for each row and printing only that row's result. If you want a total count, create the counter outside the loop over the rows and update it with the data from each row:
import csv
from collections import Counter

with open('myfile.csv', 'r') as f:
    reader = csv.reader(f, delimiter='\t')
    alist = []
    iterreader = iter(reader)
    next(iterreader, None)  # skip the header row
    c = Counter()  # one counter shared by all rows
    for row in iterreader:
        clean_rows = row[0].replace(",", " ").rsplit()
        alist.append(clean_rows)
        c.update(clean_rows)  # add this row's words to the running total

# Query the counter once, after every row has been processed
mostWcommon = c.most_common(3)
print(mostWcommon)
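If you want to try the idea without your real file, here is a minimal self-contained sketch. The `SAMPLE` text and the `count_words` helper are made up for illustration and are not part of your code; I am assuming the file has a header row and that the text to count sits in the first tab-separated column, as in your snippet.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for myfile.csv: a header row, then one
# comma-separated response per row (an assumption about the file layout).
SAMPLE = "response\ngood,service,good\nslow,service\ngood,price\n"


def count_words(f):
    """Return one Counter covering the first column of every data row."""
    reader = csv.reader(f, delimiter='\t')
    next(reader, None)  # skip the header row
    total = Counter()   # created once, before the loop
    for row in reader:
        if not row:     # ignore blank lines
            continue
        # Commas become spaces, then the text is split on whitespace
        total.update(row[0].replace(",", " ").split())
    return total


totals = count_words(io.StringIO(SAMPLE))
print(totals.most_common(2))  # [('good', 3), ('service', 2)]
```

With the real data you would pass the open file object instead of the `io.StringIO` wrapper. `Counter.update` adds to the existing counts rather than replacing them, which is why a single counter ends up holding the totals for the whole file.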