I am trying to remove stopwords from a CSV file that has 3 columns and write the result to a new CSV file. The stopwords are removed successfully; however, the data in the new file appears across the top row rather than in the columns of the original file.
import io
import codecs
import csv
from nltk.corpus import stopwords
from nltk.tokenize import word_tokenize
stop_words = set(stopwords.words('english'))
file1 = codecs.open('soccer.csv','r','utf-8')
line = file1.read()
words = line.split()
for r in words:
    if not r in stop_words:
        appendFile = open('stopwords_soccer.csv','a', encoding='utf-8')
        appendFile.write(" "+r)
        appendFile.close()
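The output ends up on a single row because no newline is ever written: line.split() throws away the original line breaks, and each word is then written with only a leading space. You need to insert a newline character after writing each word: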
for r in words:
    if not r in stop_words:
        appendFile = open('stopwords_soccer.csv','a', encoding='utf-8')
        appendFile.write(r)
        appendFile.write("\n")
        appendFile.close()
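This puts each remaining word on its own line, so the output is no longer a single row. Note that it does not restore the three original columns; for that, the file has to be read row by row (for example with csv.reader) rather than split as one string.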
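If the goal is to keep the three-column layout, one possible approach is to process the file row by row with the csv module and filter each cell separately. The following is only a sketch under a few assumptions: the helper names strip_stopwords and remove_stopwords_from_csv are made up for illustration, every cell is treated as free text, and words are matched case-insensitively (the original code compares case-sensitively).

```python
import csv


def strip_stopwords(text, stop_words):
    """Return text with stop words removed (hypothetical helper).

    Assumption: matching is case-insensitive, since NLTK's English
    stop word list is lowercase.
    """
    return " ".join(w for w in text.split() if w.lower() not in stop_words)


def remove_stopwords_from_csv(src_path, dst_path, stop_words):
    """Copy src_path to dst_path, filtering stop words cell by cell.

    Rows and columns are preserved because csv.writer emits one
    output row per input row.
    """
    with open(src_path, newline="", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            writer.writerow([strip_stopwords(cell, stop_words) for cell in row])


# Possible usage with the files from the question (requires NLTK data):
# from nltk.corpus import stopwords
# remove_stopwords_from_csv('soccer.csv', 'stopwords_soccer.csv',
#                           set(stopwords.words('english')))
```

Opening the output file once in "w" mode, instead of re-opening it in append mode for every word, also means that running the script twice does not keep adding to the old output.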