opencv computer-vision face-detection feature-selection viola-jones

How often and where is each feature used in the Viola-Jones detector?


This is a question about the Viola-Jones Algorithm (used for face detection) as described here

http://en.wikipedia.org/wiki/Viola%E2%80%93Jones_object_detection_framework

and in the original paper

http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.110.4868

[Figure: 4 Features]

My questions are:

  1. They describe 3 kinds of features and give 4 examples of them. So how many features are they calculating per 24x24 window: 3 or 4? Or are they using every possible size of these 4 features (which would be quite a lot)?
  2. Obviously one of the features can appear in different positions of that 24x24 window. So how many times and in what exact positions?
  3. They are describing 3 kinds of classifiers, but obviously they can be modified a lot (like A rotated is B). Flipping or inverting classifier D would also make sense. Are they using only these 4 types or are they modifying all of them in many ways?

Solution

  • One way to answer this is to look in opencv/apps/traincascade/haarfeatures.h and opencv/apps/traincascade/haarfeatures.cpp.

    1. In CvHaarEvaluator::generateFeatures(), a feature is generated for every rectangle size that fits in the given window size, not just 3 or 4 per window. So yes, there are a lot.

    2. The features are generated in all positions in the window where they will fit, and therefore as many times as possible.

    3. Flipping or inverting a feature would only change its sign and would provide no additional information, so that is not done. Features rotated by arbitrary angles are not used because they could not be calculated efficiently using integral images; the horizontal and vertical orientations of a shape are simply generated as separate feature types. However, "tilted" (by 45 degrees) features can optionally be generated - see Lienhart and Maydt (2002) for details.
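To make the sign argument concrete, here is a minimal pure-Python sketch (not OpenCV's code) of an integral image and a horizontal two-rectangle feature. The function names `integral_image`, `rect_sum` and `two_rect_feature`, and the `flipped` flag, are hypothetical and chosen only for illustration. It shows that swapping the two halves of the feature only negates its value, and that each rectangle sum needs just four table lookups.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero first row and column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above plus the running sum of this row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the rectangle with top-left (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, dx, dy, flipped=False):
    """Horizontal two-rectangle feature: right half minus left half.

    With flipped=True the halves are swapped (illustrative flag only).
    """
    left = rect_sum(ii, x, y, dx, dy)
    right = rect_sum(ii, x + dx, y, dx, dy)
    return left - right if flipped else right - left


# Tiny 4x4 test image; the feature covers the whole image.
img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
ii = integral_image(img)
value = two_rect_feature(ii, 0, 0, 2, 4)
flipped_value = two_rect_feature(ii, 0, 0, 2, 4, flipped=True)
print(value, flipped_value)
```

Since the flipped feature is always the negative of the original, a weak classifier that is free to choose the direction of its threshold comparison gains nothing from having both.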

    Also, the OpenCV documentation shows all the features; you will see that some of them are not in the Viola-Jones paper. The BASIC option uses just the Viola-Jones features.
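To get a feel for "a lot", the enumeration can be sketched in a few lines of Python. This is not the OpenCV source, only a rough imitation of the idea of trying every position and every cell size that fits. It assumes the BASIC set consists of five upright shapes (two-rectangle horizontal and vertical, three-rectangle horizontal and vertical, and the four-rectangle checkerboard); the labels in `BASIC_SHAPES` and the function `count_features` are made up for this example.

```python
from collections import Counter

# (cells across, cells down) for each shape; assumed layout of the BASIC set.
BASIC_SHAPES = {
    "x2": (2, 1),     # two rectangles side by side
    "y2": (1, 2),     # two rectangles stacked
    "x3": (3, 1),     # three rectangles side by side
    "y3": (1, 3),     # three rectangles stacked
    "x2_y2": (2, 2),  # four rectangles in a checkerboard
}


def count_features(win_w=24, win_h=24, shapes=BASIC_SHAPES):
    """Count every (position, cell size) combination of each shape that fits."""
    counts = Counter()
    for x in range(win_w):                      # top-left corner
        for y in range(win_h):
            for dx in range(1, win_w + 1):      # width of one cell
                for dy in range(1, win_h + 1):  # height of one cell
                    for name, (nx, ny) in shapes.items():
                        if x + nx * dx <= win_w and y + ny * dy <= win_h:
                            counts[name] += 1
    return counts


counts = count_features()
total = sum(counts.values())
for name, n in counts.items():
    print(name, n)
print("total", total)
```

Under these assumptions a 24x24 window yields 43,200 features for each two-rectangle orientation, 27,600 for each three-rectangle orientation and 20,736 four-rectangle features, 162,336 in total. AdaBoost then selects a small subset of these during training.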