I would like to apply the function "tokenization" (defined below) to every row of the column "Review Gast" of the dataset "reviews_english". How can I do that? Currently I can only apply it to one row. Thanks! :)
def tokenization(text):
    # Normalize
    text = normalize(text)
    # Remove punctuation
    text = remove_punctuation(text)
    # Tokenize
    tokens = text.split()
    # Remove stopwords
    tokens = remove_stopwords(tokens)
    # Apply bag-of-words (set of tokens)
    bow = set(tokens)
    return bow
clean_reviews_english = tokenization(reviews_english["Review Gast"][0])
print(clean_reviews_english)
Use a list comprehension (note the call to tokenization must wrap each individual review, not the whole generator):

clean_reviews_english = [tokenization(review) for review in reviews_english["Review Gast"]]

or map (which returns a lazy iterator in Python 3, so wrap it in list to materialize the results):

clean_reviews_english = list(map(tokenization, reviews_english["Review Gast"]))
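Here is a minimal runnable sketch of the list-comprehension approach. The helpers normalize, remove_punctuation, and remove_stopwords are not shown in the question, so the versions below are placeholder assumptions (lowercasing, stripping string.punctuation, and a tiny hand-picked stopword set) just to make the example self-contained:

```python
import string

# Placeholder stopword list; in practice you would use your own
# (e.g. from NLTK) as in the original remove_stopwords.
STOPWORDS = {"the", "a", "an", "is", "was"}

def normalize(text):
    # Assumed implementation: simple lowercasing
    return text.lower()

def remove_punctuation(text):
    # Assumed implementation: strip ASCII punctuation
    return text.translate(str.maketrans("", "", string.punctuation))

def remove_stopwords(tokens):
    return [t for t in tokens if t not in STOPWORDS]

def tokenization(text):
    text = normalize(text)
    text = remove_punctuation(text)
    tokens = text.split()
    tokens = remove_stopwords(tokens)
    return set(tokens)

# Stand-in for reviews_english["Review Gast"]
reviews = ["The room was great!", "An awful stay."]

clean_reviews = [tokenization(review) for review in reviews]
print(clean_reviews)  # [{'room', 'great'}, {'awful', 'stay'}]
```

If reviews_english is a pandas DataFrame, the idiomatic alternative is reviews_english["Review Gast"].apply(tokenization), which returns a Series of token sets aligned with the original index.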