I've recently found Textacy and as i go through the API reference guide I'm running into an error for the Vectorizer. If i add any options from the API reference I get a TypeError: unexpected keyword argument. I get this error for other options in addition to weighting.
I installed textacy using pip and I'm using Python3 on Ubuntu. Any help is appreciated. Thanks!
vectorizer = textacy.vsm.Vectorizer(weighting='tfidf')
TypeError: __init__() got an unexpected keyword argument 'weighting'
Ran into the same problem. The API documentation does not reflect the current Vectorizer keyword arguments. The Vectorizer now provides different keyword arguments to allow more control over how TF*IDF is applied.
vectorizer = textacy.Vectorizer(tf_type='linear', apply_idf=True, idf_type='smooth')
tf_type
applies standard term frequency (TF), apply_idf=True
applies the inverse document frequency (IDF). From the repo comments, idf_type='smooth'
adds one to each document frequency in order to avoid zero divisions.
To see more information about the options check the comment at line 182 in the repository here: https://github.com/chartbeat-labs/textacy/blob/master/textacy/vsm/vectorizers.py