How to add a new class to a resumed vowpal wabbit one-against-all logistic classifier?


I have a working logistic regression classifier using the one-against-all (oaa) method. Although I'm currently training the classifier to recognize 15 classes, in the future I would like to feed it examples from N additional classes that I would like my classifier to learn. However, vowpal wabbit commands using the --save_resume option do not allow me to use --oaa to specify a new total number of classes.

I use the oaa option because when I make predictions I want to select the top 3 predicted classes that have the highest probability of being true, which I determine using the --probabilities option.

How can I teach additional classes to my classifier when using --oaa and --save_resume?


I initially train my classifier using:

vw --oaa=15 --loss_function=logistic --save_resume -c --passes 10 -d /tmp/train.vw -f /tmp/model.vw

I resume training using:

vw --loss_function=logistic --save_resume -c --passes 10 -d /tmp/train.vw -i /tmp/model.vw -f /tmp/model.v

I make predictions using:

vw -t --probabilities --loss_function=logistic -d /tmp/test.vw -i /tmp/model.vw -p /tmp/predict.vw

I then examine predict.vw and select the classes with the top 3 highest probabilities of being true.
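The top-3 selection can be scripted. The following is only a sketch: it assumes each line of the predictions file is a space-separated list of `label:probability` pairs (for example `1:0.1 2:0.6 3:0.3`), and the helper name `top3` is made up for illustration. Tokens without a colon, such as a trailing example tag, are skipped.

```shell
# top3 is a hypothetical helper: print the three most probable classes per line.
# Assumes each input line looks like "1:0.1 2:0.6 3:0.3" (label:probability pairs).
top3() {  # usage: top3 PREDICTIONS_FILE
  awk '{
    # Collect the label:probability pairs found on this line.
    n = 0
    for (i = 1; i <= NF; i++) {
      if (split($i, kv, ":") == 2) { n++; lab[n] = kv[1]; prob[n] = kv[2] }
    }
    # Pick the highest remaining probability, up to three times.
    split("", used)  # portable way to clear the array
    out = ""
    for (k = 1; k <= 3 && k <= n; k++) {
      best = 0
      for (j = 1; j <= n; j++)
        if (!(j in used) && (best == 0 || prob[j] + 0 > prob[best] + 0)) best = j
      used[best] = 1
      out = out (k > 1 ? " " : "") lab[best] ":" prob[best]
    }
    print out
  }' "$1"
}
```

Calling `top3 /tmp/predict.vw` would then print one line per test example, with the three best classes in descending order of probability.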


Solution

  • Currently, it is not possible to increase the number N of classes in --oaa N when training in multiple steps with --save_resume. Internally, the model uses N for offsetting the weight vector, so you would need to hack the loading of the model.

    You can try setting N high enough from the beginning, using only classes 1-15 in the first steps and adding classes with higher numbers in the later steps. Because of the nature of online training, the later examples influence the model more.

    Alternatively, with csoaa_ldf you can specify the number of classes on the fly: different classes may be available for each example.
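As a rough sketch of the "set N high enough" workaround, the commands from the question could be wrapped as below. The upper bound of 30 classes, the function names, and the way file paths are passed in are all assumptions for illustration; only the vw options already shown in the question are used.

```shell
# Hypothetical upper bound on the number of classes the model will ever see.
MAX_CLASSES=30

# First step: the data only uses labels 1-15, but MAX_CLASSES slots are reserved.
train_initial() {  # usage: train_initial DATA MODEL_OUT
  vw --oaa="$MAX_CLASSES" --loss_function=logistic --save_resume -c --passes 10 \
     -d "$1" -f "$2"
}

# Later step: resume from the saved model; the data may now contain
# labels above 15, as long as they do not exceed MAX_CLASSES.
train_resume() {  # usage: train_resume DATA MODEL_IN MODEL_OUT
  vw --loss_function=logistic --save_resume -c --passes 10 \
     -d "$1" -i "$2" -f "$3"
}
```

For example, `train_initial /tmp/train.vw /tmp/model.vw` would be run once, and `train_resume` would be run each time a new batch arrives, with the previous model as input. The number of classes is fixed by the first call, so it has to be chosen generously up front.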