nlp tfidfvectorizer

How to use Bi-normal Separation for text in Python


I'm looking for a way to implement Bi-normal Separation (BNS) with scikit-learn, but I can't find any available solution. I read Forman's article about the advantages of BNS feature scaling over TF-IDF.


Solution

  • You can use the code written for the article you mentioned, which is available on GitHub. The repository contains the implementation as well as a number of examples of how to use BNS with the sklearn SVM classifier, etc.

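    You must fit BNS before transforming the data, although the author skips this step in the examples. Replace the transform-only call with a fit-and-transform call:

    # X_bns = bns.transform(X)   # as written in the examples: bns is not fitted yet
    X_bns = bns.fit_transform(X)  # fit first, then transform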
    

    The code is written in Python 2. To run it under Python 3, change "iteritems()" to "items()" in bns.py.
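
To make the fit step concrete, here is a minimal sketch of what fitting BNS computes. It is not the repository's code: the function name `bns_scores`, the toy data, and the clipping value `eps=0.0005` are assumptions made for illustration. The score for each term is the absolute difference between the inverse normal CDF of its rate of occurrence in positive documents and in negative documents, which is why labels are needed and why the scores have to be learned before any data can be transformed.

```python
from statistics import NormalDist

import numpy as np


def bns_scores(X, y, eps=0.0005):
    """Return one BNS score per column of a document-term matrix.

    Hypothetical helper for illustration; eps is an assumed clipping value.
    """
    X = (np.asarray(X) > 0).astype(float)   # presence/absence of each term
    y = np.asarray(y).astype(bool)          # True marks the positive class
    pos, neg = y.sum(), (~y).sum()
    # Fraction of positive / negative documents that contain each term.
    tpr = X[y].sum(axis=0) / pos
    fpr = X[~y].sum(axis=0) / neg
    # Clip away 0 and 1, where the inverse normal CDF is infinite.
    tpr = np.clip(tpr, eps, 1 - eps)
    fpr = np.clip(fpr, eps, 1 - eps)
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    return np.abs(inv_cdf(tpr) - inv_cdf(fpr))


# Toy data: 4 documents, 3 terms; the first two documents are positive.
X = np.array([[1, 0, 1],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 0]])
y = np.array([1, 1, 0, 0])

scores = bns_scores(X, y)   # the "fit" step: learned from labeled training data
X_bns = (X > 0) * scores    # the "transform" step: scale binary features
```

In this toy example the first two terms each occur in only one class and receive the same large score, while the third term occurs equally often in both classes and is scaled to zero. On real data, the scores should be computed on the training split only and then reused to scale the test split.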