python machine-learning nlp fasttext

How to change parameters of the fastText API in a Python script


fastText provides commands that can be run from the command prompt.

I have cloned the GitHub repository. For example, to change the network parameters for supervised learning, I use a command like:

 ./fasttext supervised -input FT_Race_data.txt -output race_model  -lr 0.4 -epoch 30 -loss hs

Here I am changing lr, epoch, and loss. I can train the model and get the required output.

To do the same in a Python script, I installed the fasttext library and tried:

classifier = fasttext.supervised('FT_Race_data.txt','race_model') 

The model gets trained, but the results are not good. In this case I didn't define any parameters, so I tried:

classifier = fasttext.supervised('FT_Race_data.txt','race_model', 0.4, 30, 'hs')

The program runs with no error but doesn't give any result. So I tried:

classifier = fasttext.supervised(input = 'FT_Race_data.txt',output ='race_model', lr = 0.4,epoch= 30,loss = 'hs')

This gives an error saying that fasttext takes only two arguments.

How can I change parameters in a Python script, as I can on the command prompt, to fine-tune the supervised learning?


Solution

  • For future reference: judging from the discussions here, it seems that pip install fasttext does not install the full set of features available in the repo.

    Until the latest features are included in https://pypi.python.org/pypi/fasttext, use the following installation procedure, as outlined here, to get Python bindings that can train models and set parameters.

    git clone https://github.com/facebookresearch/fastText.git
    cd fastText
    pip install .
    

    Then use train_supervised, a function that returns a model object. Its parameters can be set as keyword arguments, as in the following example from that repo.

    fastText.train_supervised(input, lr=0.1, dim=100, ws=5, epoch=5, minCount=1, minCountLabel=0, minn=0, maxn=0, neg=5, wordNgrams=1, loss='softmax', bucket=2000000, thread=12, lrUpdateRate=100, t=0.0001, label='__label__', verbose=2, pretrainedVectors='')
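As a rough sketch of how the command line from the question might map onto this function: the keyword names lr, epoch and loss come from the signature above, while the helper name train_and_save and the output file name race_model.bin are made up for illustration. The fallback to a lowercase fasttext module is an assumption about newer builds of the bindings, not something stated in the answer.

```python
import os

# Values taken from the question's command line:
#   ./fasttext supervised -input FT_Race_data.txt -output race_model -lr 0.4 -epoch 30 -loss hs
TRAIN_FILE = "FT_Race_data.txt"
MODEL_PATH = "race_model.bin"  # hypothetical output file name
params = {"lr": 0.4, "epoch": 30, "loss": "hs"}


def train_and_save(train_file, model_path, **train_params):
    """Train a supervised model with keyword parameters and save it to disk."""
    try:
        import fastText as ft  # module name used by the repo build described above
    except ImportError:
        import fasttext as ft  # assumption: newer builds expose the lowercase name
    # Any parameter not passed here keeps the default from the signature above.
    model = ft.train_supervised(input=train_file, **train_params)
    model.save_model(model_path)
    return model


# Only train when the data file from the question is actually present.
if os.path.exists(TRAIN_FILE):
    model = train_and_save(TRAIN_FILE, MODEL_PATH, **params)
```

Unlike the -output flag on the command line, train_supervised takes no output argument in the signature shown above, so the sketch saves the returned model object explicitly.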