I am trying to build a sarcasm detection model using the sarcasm dataset from Kaggle in a Jupyter notebook. I have downloaded the dataset to my PC and converted it into a list of dictionaries. Each dictionary has three keys: article_link, is_sarcastic, and headline.
My code below gives the following error:

TypeError                                 Traceback (most recent call last)
      7 tokenizer.fit_on_texts(sentences)
      8
----> 9 my_word_index=tokenizer.word_index()
     10
     11 print(len(word_index))

TypeError: 'dict' object is not callable
import os
import json
import pandas

os.getcwd()
os.chdir('C:/Users/IMALSHA/Desktop/AI content writing/Cousera Deep Neural Networks course/NLP lectures')

#loading data
with open('Sarcasm_Headlines_Dataset.json','r') as json_file:
    data_set=json.load(json_file)
#defining lists
sentences=[]
labels=[]
urls=[]

for item in data_set:
    sentences.append(item['headline'])
    labels.append(item['is_sarcastic'])
    urls.append(item['article_link'])
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
tokenizer=Tokenizer(oov_token="<oov>")
tokenizer.fit_on_texts(sentences)
word_index=tokenizer.word_index()
print(len(word_index))
print(word_index)
sequences=tokenizer.texts_to_sequences(sentences)
paded=pad_sequences(sequences)
print(paded[2])
The problem is this line:

word_index=tokenizer.word_index()

You probably want to store the tokenizer's word_index in the word_index variable. Instead, you are calling tokenizer.word_index as if it were a method, but it is a dictionary attribute, not a function, so calling it raises TypeError: 'dict' object is not callable.
So the fix is simply to drop the parentheses:

word_index=tokenizer.word_index
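To illustrate the difference without needing TensorFlow installed, here is a minimal sketch using a hypothetical stand-in class (TokenizerLike is not a real Keras class, just an illustration of a dict-valued attribute like Tokenizer.word_index):

```python
class TokenizerLike:
    """Hypothetical stand-in: word_index is a plain dict attribute,
    just as it is on Keras' Tokenizer after fit_on_texts()."""
    def __init__(self):
        self.word_index = {"<oov>": 1, "the": 2}

t = TokenizerLike()

# Wrong: trailing () tries to call the dict itself
try:
    t.word_index()
except TypeError as e:
    print(e)  # 'dict' object is not callable

# Right: access the attribute, no parentheses
word_index = t.word_index
print(len(word_index))  # 2
```

The same rule applies to your code: tokenizer.word_index is data you read, not a function you call.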