feature-engineering vowpalwabbit

Vowpal Wabbit: question on weight of interaction features


In VW, the format for feature namespaces is shown below:

Label [Tag]|Namespace Features |Namespace Features ... |Namespace Features

Where:

Namespace=String[:Value]

and an example is:

1 1.0 |MetricFeatures:3.28 height:1.5 length:2.0 |Says black with white stripes |OtherFeatures NumberOfLegs:4.0 HasStripes

Notice that the |MetricFeatures namespace has a weight greater than 1 (3.28). Based on the above example, if I create some feature interactions, say between the M and the S namespaces with -q MS, does the new set of features that is the cross product of the two original namespaces get an importance weight of 1 by default? Or does it inherit the product of the two namespace values (in this case 1 * 3.28 = 3.28)?

And is there a way to modify the weight of the feature interactions manually? E.g. say MetricFeatures has an importance weight of 1, can I have the features generated by the quadratic interaction of MetricFeaturesXSays have an importance weighting of x?


Solution

  • Currently there is no way to individually weight interactions.


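    The namespace weight is applied at parse time: as the features of that namespace are read in, each feature value is multiplied by the namespace weight. The value of an interaction feature is the product of its two source feature values, so features generated by -q MS carry that scaling too (3.28 * 1 = 3.28 in your example) rather than resetting to 1.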

    This can be verified by using --audit:

    Num weight bits = 18
    learning rate = 0.5
    initial_t = 0
    power_t = 0.5
    using no cache
    Reading datafile = data.txt
    num sources = 1
    average  since         example        example  current  current  current
    loss     last          counter         weight    label  predict features
    0
        MetricFeatures^height:146807:4.92:0@0   MetricFeatures^length:38580:6.56:0@0    Says^black:100768:1:0@0 Says^with:163314:1:0@0  Says^white:106708:1:0@0 Says^stripes:112832:1:0@0   OtherFeatures^NumberOfLegs:146847:4:0@0 OtherFeatures^HasStripes:229154:1:0@0   Constant:116060:1:0@0
    1.000000 1.000000            1            1.0   1.0000   0.0000        9
    
    finished run
    number of examples = 1
    weighted example sum = 1.000000
    weighted label sum = 1.000000
    average loss = 1.000000
    best constant = 1.000000
    best constant's loss = 0.000000
    total feature number = 9
    

    MetricFeatures^height:146807:4.92:0@0 -> 3.28 * 1.5 = 4.92
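
The arithmetic in the audit line can be reproduced with a small Python sketch. This is a simplified model of the scaling only, not VW's actual parser: the function names `parse_vw_features` and `quadratic` are made up for illustration, feature hashing and the Constant feature are ignored, and every section is assumed to start with a namespace name directly after the `|`. It also shows what the scaling would imply for `-q MS`, assuming an interaction value is the product of the two source feature values.

```python
def parse_vw_features(line):
    """Return {(namespace, feature): value} with namespace weights applied.

    Hypothetical helper: assumes each section starts with a namespace name
    (optionally "Name:Weight") directly after the '|'.
    """
    # Everything before the first '|' is the label / importance / tag part.
    _, *sections = line.split("|")
    features = {}
    for section in sections:
        tokens = section.split()
        if not tokens:
            continue
        name, _, ns_weight = tokens[0].partition(":")
        scale = float(ns_weight) if ns_weight else 1.0
        for token in tokens[1:]:
            feat, _, value = token.partition(":")
            # A feature without an explicit value defaults to 1.
            features[(name, feat)] = scale * (float(value) if value else 1.0)
    return features


def quadratic(features, a, b):
    """Cross features of namespaces starting with `a` and `b` (as -q ab selects them)."""
    return {
        (fa, fb): va * vb
        for (na, fa), va in features.items() if na.startswith(a)
        for (nb, fb), vb in features.items() if nb.startswith(b)
    }


line = ("1 1.0 |MetricFeatures:3.28 height:1.5 length:2.0 "
        "|Says black with white stripes "
        "|OtherFeatures NumberOfLegs:4.0 HasStripes")

feats = parse_vw_features(line)
print(round(feats[("MetricFeatures", "height")], 2))  # 4.92, as in the audit line
print(round(feats[("MetricFeatures", "length")], 2))  # 6.56, as in the audit line

pairs = quadratic(feats, "M", "S")
print(round(pairs[("height", "black")], 2))  # 4.92: the 3.28 scaling carries over
```

Under these assumptions, the only lever for changing the scale of the M x S interactions is the namespace weight itself, which also changes the scale of the original features in that namespace.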