python pandas nltk sentiment-analysis stop-words

Unable to remove English stopwords from a DataFrame

I am trying to run sentiment analysis on a movie reviews dataset, but I am stuck: I cannot remove the English stopwords from the data. What am I doing wrong?

from nltk.corpus import stopwords
stop = stopwords.words("English")
list_ = []
for file_ in dataset:
    dataset['Content'] = dataset['Content'].apply(lambda x: [item for item in x.split(',') if item not in stop])
    list_.append(dataset)
dataset = pd.concat(list_, ignore_index=True)

Solution

I think the code should work, given the information so far. My assumption is that the data has an extra space after each comma, so every item has to be stripped before it is compared against the stopword list. Below is the test I ran (hope it helps!):

import pandas as pd
from nltk.corpus import stopwords

# Requires the corpus once: nltk.download('stopwords')
# The corpus name is lowercase; a set makes membership tests fast.
stop = set(stopwords.words('english'))

# DataFrame.append was removed in pandas 2.0, so build both rows at once.
dataset = pd.DataFrame({'Content': ['i, am, the, computer, machine',
                                    'i, play, game']})
print(dataset)

# Split on commas, strip the surrounding spaces, then drop stopwords.
# No loop or pd.concat is needed: apply already visits every row.
dataset['Content'] = dataset['Content'].apply(
    lambda x: [item.strip() for item in x.split(',') if item.strip() not in stop])

print(dataset)

Input with stopwords:

                          Content
0   i, am, the, computer, machine
1                   i, play, game

Output:

               Content
0  [computer, machine]
1         [play, game]
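
One more thing that may matter for real review text: the lookup above is case-sensitive, so "The" would not match "the". Below is a minimal sketch of a helper that strips and lowercases each token before the comparison. The name remove_stopwords and the tiny STOP set are my own illustrative assumptions; STOP only stands in for set(stopwords.words('english')) so the snippet runs without the NLTK corpus.

```python
# Hypothetical stand-in for set(stopwords.words('english')), for illustration only.
STOP = {"i", "am", "the", "a", "an", "is"}

def remove_stopwords(text, stop=STOP, sep=","):
    """Split text on sep, strip whitespace, and drop stopwords (case-insensitive)."""
    tokens = (tok.strip() for tok in text.split(sep))
    # Skip empty tokens (e.g. from a trailing separator) and stopwords.
    return [tok for tok in tokens if tok and tok.lower() not in stop]

print(remove_stopwords("I, am, The, computer, machine"))
print(remove_stopwords("i, play, game"))
```

With a DataFrame this could be applied as dataset['Content'].apply(remove_stopwords). Passing sep=None splits on whitespace instead, which is probably closer to what free-text reviews need, although punctuation would still have to be handled separately.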