First of all, I'm new to Python. I'm trying to lemmatize data from a CSV file, which I read with pandas. When I run the code below, I get an error on the line lemmatized.append(temp): NameError: name 'temp' is not defined. I can't figure out what is causing it. I am using Python 2.7. I would be grateful if any Python expert could help me with this simple problem and thus help me learn.
import csv

import nltk
import pandas as pd
from nltk.stem import WordNetLemmatizer

data = pd.read_csv('TrainingSETNEGATIVE.csv')
list = data['text'].values

def get_pos_tag(tag):
    # Map a Penn Treebank tag to the single-letter POS code WordNet expects.
    if tag.startswith('V'):
        return 'v'
    elif tag.startswith('N'):
        return 'n'
    elif tag.startswith('J'):
        return 'a'
    elif tag.startswith('R'):
        return 'r'
    else:
        return 'n'

lemmatizer = WordNetLemmatizer()
with open('new_file.csv', 'w+', newline='') as myfile:
    wr = csv.writer(myfile, quoting=csv.QUOTE_ALL)
    for doc in list:
        tok_doc = nltk.word_tokenize(doc)
        pos_tag_doc = nltk.pos_tag(tok_doc)
        lemmatized = []
        for i in range(len(tok_doc)):
            tag = get_pos_tag(pos_tag_doc[i][1])
            if tag == 'r':
                if tok_doc[i].endswith('ly'):
                    temp = tok_doc[i].replace("ly", "")
            else:
                temp = lemmatizer.lemmatize(tok_doc[i], pos=tag)
            lemmatized.append(temp)
        lemmatized = " ".join(lemmatized)
        wr.writerow([lemmatized])
        print(lemmatized)
The exception says it all: "name 'temp' is not defined". So the variable temp is not defined before it is used.
The problem with your code is here:
if tag == 'r':
    if tok_doc[i].endswith('ly'):
        temp = tok_doc[i].replace("ly", "")
    # else: temp = None
else:
    temp = lemmatizer.lemmatize(tok_doc[i], pos=tag)
lemmatized.append(temp)
If tag == 'r' is True and tok_doc[i].endswith('ly') is not True, then temp never gets defined.
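Consider adding an else clause like the one I inserted and commented out. Keep in mind that appending None would make the later " ".join(lemmatized) raise a TypeError, so assigning a string in that branch (for example the unchanged token) is the safer choice.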
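
One way to make sure every branch produces a value is to move the decision into a small helper that always returns a string. The following is only a sketch under some assumptions: lemmatize_token and toy_lemmatize are hypothetical names that are not in your code, toy_lemmatize is a stand-in so the example runs without NLTK (in your script you would pass something that calls lemmatizer.lemmatize(word, pos=tag)), and I am guessing that adverbs without an "ly" suffix should simply go through the lemmatizer.

```python
def lemmatize_token(token, tag, lemmatize):
    """Return a lemma for token; every code path returns a string.

    `lemmatize` is a hypothetical stand-in for a callable such as
    lambda word, pos: lemmatizer.lemmatize(word, pos=pos).
    """
    if tag == 'r' and token.endswith('ly'):
        # Strip only the trailing "ly". Note this differs from
        # token.replace("ly", ""), which removes every "ly" in the word.
        return token[:-2]
    # All remaining cases, including adverbs without "ly", end up here,
    # so the result is never left undefined.
    return lemmatize(token, tag)


def toy_lemmatize(word, pos):
    # Toy stand-in (assumption): just lowercases, so no NLTK data is needed.
    return word.lower()


tagged = [('quickly', 'r'), ('Very', 'r'), ('Dogs', 'n')]
lemmatized = [lemmatize_token(tok, tag, toy_lemmatize) for tok, tag in tagged]
print(" ".join(lemmatized))
```

Because the helper returns on every path, there is no variable that can be left unassigned, and a value from a previous loop iteration cannot leak into the current one.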