I am trying to find a better text-cleaning method for a Dutch NLP problem. I have used a Dutch version of the POS tagger and NLTK for stop-word removal, but I am not getting the results I want.
Have you tried this approach for Dutch?
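from nltk.util import ngrams
from nltk.corpus import alpino  # needs the corpus: nltk.download('alpino')

# Dutch tokens from the Alpino treebank
print(alpino.words())

# Build 4-grams over the token stream and print each one
quadgrams = ngrams(alpino.words(), 4)
for gram in quadgrams:
    print(gram)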
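Since the question mentions stop-word removal, here is a minimal sketch of how cleaning and n-gram building could be combined on your own text rather than on the Alpino corpus. It uses only the standard library so it runs without any downloads. The names `DUTCH_STOP_WORDS`, `clean_dutch`, and `make_ngrams` are made up for illustration, and the stop-word set is a small hand-picked sample, not a complete list. If you have the NLTK stopwords corpus installed, `nltk.corpus.stopwords.words("dutch")` should be a fuller replacement.

```python
import re

# Hypothetical, hand-picked sample of Dutch stop words (assumption: not exhaustive).
DUTCH_STOP_WORDS = {"de", "het", "een", "en", "van", "is", "in", "op", "dat", "niet"}


def clean_dutch(text, stop_words=DUTCH_STOP_WORDS):
    """Lowercase the text, keep runs of letters as tokens, drop stop words."""
    # [^\W\d_]+ matches Unicode letters only, so accented Dutch characters survive.
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if t not in stop_words]


def make_ngrams(tokens, n):
    """Return all n-grams of the token list as tuples."""
    return list(zip(*(tokens[i:] for i in range(n))))


sample = "De kat zit op de mat en het regent niet."
tokens = clean_dutch(sample)
bigrams = make_ngrams(tokens, 2)
print(tokens)
print(bigrams)
```

Whether stop words should be removed before building n-grams depends on the task: dropping them changes which words end up adjacent, so it may be worth comparing results with and without that step.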