I have a problem with this code; maybe someone can help. The data in train['hadis'] comes from text in an Excel file and displays successfully, but tokenizing it raises an error:
train['hadis'] = train['hadis'].apply(lambda x: " ".join([nltk.tokenize.word_tokenize(x) for x in x.split()]))
train['hadis'].head()
TypeError: sequence item 0: expected str instance, list found
The expected result is that every row of data is tokenized.
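word_tokenize returns a list of tokens, so the list comprehension builds a list of lists, and " ".join accepts only strings. Instead of: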
lambda x: " ".join([nltk.tokenize.word_tokenize(x) for x in x.split()])
use:
lambda x: " ".join(nltk.tokenize.word_tokenize(x))
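
To see why the first version fails, here is a minimal sketch that reproduces the error on a single string. So that it runs without NLTK or its data files, it uses simple_tokenize, a hypothetical regex-based stand-in for nltk.tokenize.word_tokenize; the only property it shares for this purpose is that it returns a list of tokens. The sample text is also made up.

```python
import re

def simple_tokenize(text):
    # Hypothetical stand-in for nltk.tokenize.word_tokenize:
    # like the real function, it returns a list of string tokens.
    return re.findall(r"\w+|[^\w\s]", text)

row = "Hello, world!"  # made-up sample text, standing in for one row of train['hadis']

# Original pattern: tokenizing each whitespace-separated word separately
# yields one list per word, i.e. a list of lists.
nested = [simple_tokenize(word) for word in row.split()]

try:
    " ".join(nested)  # join needs strings, but each item is a list
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Fixed pattern: tokenize the whole string once, then join the flat token list.
fixed = " ".join(simple_tokenize(row))

print(nested)
print(error_message)
print(fixed)
```

With the real word_tokenize the tokens may differ, but the shape of the problem is the same: join must receive a flat list of strings. If you want to keep the tokens as a list in each row rather than a re-joined string, dropping the " ".join and applying the tokenizer directly should also work.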