How are query-level features (such as the number of terms in the query) useful? They appear to be ignored when the model file is generated.
Training file:
3 qid:1 1:2 2:1 3:0 4:0.2 5:0
2 qid:1 1:2 2:0 3:1 4:0.1 5:1
1 qid:1 1:2 2:1 3:0 4:0.4 5:0
1 qid:1 1:2 2:0 3:1 4:0.3 5:0
1 qid:2 1:3 2:0 3:1 4:0.2 5:0
2 qid:2 1:3 2:0 3:1 4:0.4 5:0
1 qid:2 1:3 2:0 3:1 4:0.1 5:0
1 qid:2 1:3 2:0 3:1 4:0.2 5:0
In this file, the 1st feature is a query-level feature: it has the same value for every item belonging to the same query.
The model was trained with SVM-rank. The generated model file ignores the 1st feature and starts from the 2nd feature.
Generated model file:
1 2:0.50956941 3:-0.50956941 4:0.1913875 5:1.0382775 #
Query-level features might be helpful in a different ranking paradigm, but Joachims states in the SVM-rank documentation:
Note that ranks are comparable only between examples with the same qid. Note also that the target value (first value in each line of the data files) is only used to define the order of the examples. Its absolute value does not matter, as long as the ordering relative to the other examples with the same qid remains the same.
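This means that a feature which is constant within each query will never be used by the model: examples are only compared with other examples that share the same qid, and in any such comparison the constant feature is identical on both sides, so it cannot influence the ordering. For such a feature to be helpful, your model would have to be doing some sort of comparison across qids.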
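To make this concrete, here is a minimal sketch in plain Python that uses the training rows from the question. It assumes the usual pairwise view of a ranking SVM, in which each preference within a query contributes a constraint on the difference of the two feature vectors; the actual SVM-rank solver is organised differently internally, so treat this as an illustration rather than a description of its code. The names `ROWS` and `pairwise_differences` are made up for this example.

```python
from itertools import combinations

# Training rows copied from the question: (target, qid, [feature 1..5]).
ROWS = [
    (3, 1, [2, 1, 0, 0.2, 0]),
    (2, 1, [2, 0, 1, 0.1, 1]),
    (1, 1, [2, 1, 0, 0.4, 0]),
    (1, 1, [2, 0, 1, 0.3, 0]),
    (1, 2, [3, 0, 1, 0.2, 0]),
    (2, 2, [3, 0, 1, 0.4, 0]),
    (1, 2, [3, 0, 1, 0.1, 0]),
    (1, 2, [3, 0, 1, 0.2, 0]),
]


def pairwise_differences(rows):
    """Return (preferred - other) feature differences for same-qid pairs.

    Pairs from different queries, or with equal targets, express no
    preference and are skipped.
    """
    diffs = []
    for (ya, qa, xa), (yb, qb, xb) in combinations(rows, 2):
        if qa != qb or ya == yb:
            continue
        sign = 1 if ya > yb else -1
        diffs.append([sign * (a - b) for a, b in zip(xa, xb)])
    return diffs


diffs = pairwise_differences(ROWS)
# Feature 1 is constant within each qid, so this column is all zeros.
feature_1_column = [d[0] for d in diffs]
print(len(diffs), feature_1_column)
```

Every difference vector has a zero in the first position, so under this view no constraint gives the learner a reason to assign feature 1 a non-zero weight. Even if it did get one, it would add the same amount to the score of every item in a query and leave the within-query ordering unchanged.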