Tags: machine-learning, logistic-regression, vowpalwabbit

vowpalwabbit: strange feature count


I have found that during training, vw reports a very large feature count (much larger than my actual number of features) in its log.

I tried to reproduce it with a small example:

simple.test:

-1 | 1 2 3
1  | 3 4 5

Then the "vw simple.test" command says that it has used 8 features. One extra feature is the constant, but what are the others? And in my real example the difference between my feature count and the count vw reports is about 10x.

....

Num weight bits = 18
learning rate = 0.5
initial_t = 0
power_t = 0.5
using no cache
Reading datafile = t
num sources = 1
average    since         example     example  current  current  current
loss       last          counter      weight    label  predict features

finished run
number of examples = 2
weighted example sum = 2
weighted label sum = 3
average loss = 1.9179
best constant = 1.5
total feature number = 8 !!!!

Solution

  • total feature number displays the sum of feature counts over all observed examples, so in your case it is 2 × (3 features + 1 constant) = 8. The number of features in the current example is shown in the current features column. Note that by default only every 2^Nth example (1st, 2nd, 4th, 8th, …) is printed to the screen. In general, examples can have unequal numbers of features.
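The counting rule above can be sketched in a few lines of Python. This is a simplified illustration, not vw's actual implementation: it assumes plain vw text-format lines with a single (default) namespace after the first `|`, and that vw adds one constant feature per example.

```python
# Sketch of how vw's "total feature number" is accumulated:
# sum over all examples of (features in the example + 1 constant).
# Assumes simple "label | f1 f2 ..." lines with one namespace.

def total_feature_number(lines):
    total = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Everything after the first '|' is the feature list.
        _, _, features = line.partition("|")
        total += len(features.split()) + 1  # +1 for the constant feature
    return total

simple_test = [
    "-1 | 1 2 3",
    "1  | 3 4 5",
]
print(total_feature_number(simple_test))  # 2 * (3 + 1) = 8
```

With the two-example file from the question, each example contributes 3 features plus the constant, matching the "total feature number = 8" line in the log.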