In the hope of developing an application that can detect specific hand positions (or hand symbols) in real time, my team and I came across Haar cascade classification a few months ago and thought it would be the ideal tool for the job. However, we are having difficulty training our own classifiers (we are using OpenCV): they fail to capture the object of interest a large percentage of the time (see the second question below).
I have two questions on the subject:
Some sources that we found very helpful were:
and of course the OpenCV cascade training page.
I appreciate any help on the matter. Many thanks!
First, a question: how much in-plane rotation is there in the hand gestures you are trying to detect? A cascade detector is not rotationally invariant, which means that if your hand gestures can be tilted to the left or right by more than about 10 degrees, you will not be able to detect them. The only solution there is to rotate the image and try detecting again.
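The rotate-and-redetect idea can be sketched generically. The detector and rotation routines below are passed in as callables, since the wiring to OpenCV (`cv2.warpAffine` for the rotation, `CascadeClassifier.detectMultiScale` for the detection) depends on your setup; the toy callables at the bottom are stand-ins, not real OpenCV calls.

```python
def detect_with_rotations(image, detect_fn, rotate_fn, angles):
    """Try the detector at several in-plane rotations of the image.

    Returns (angle, detections) for the first angle that yields hits,
    or (None, []) if nothing is found at any angle.
    """
    for angle in angles:
        rotated = rotate_fn(image, angle)
        detections = detect_fn(rotated)
        if detections:
            return angle, detections
    return None, []

# Toy demonstration: the "image" is just a number recording its current
# tilt, and the fake detector only fires when that tilt is within the
# roughly +/-10 degree band a cascade can tolerate.
fake_rotate = lambda img, a: img + a            # rotating shifts the tilt
fake_detect = lambda img: ["hand"] if abs(img) <= 10 else []

angle, hits = detect_with_rotations(25, fake_detect, fake_rotate, [0, -20, 20])
# a gesture tilted 25 degrees is only found after the -20 degree rotation
```

With a real cascade you would rotate the grayscale frame once per angle and stop at the first angle that returns detections, which keeps the per-frame cost low in the common (untilted) case.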
Now some pointers:
Edit: opencv_traincascade, which replaces opencv_haartraining, supports HOG and LBP features. Alternatively, there is a trainCascadeObjectDetector function in the Computer Vision System Toolbox for MATLAB that does the same thing and gives you a nicer interface. LBP is slightly less accurate than Haar on some benchmarks, but it is much faster to train with and takes much less memory.
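For reference, an LBP training run with opencv_traincascade looks roughly like this. Every path and count here is a placeholder for your own data, and the .vec file would come from opencv_createsamples beforehand:

```shell
# Hypothetical opencv_traincascade run using LBP features.
# -data:  output directory (stage files + final cascade.xml)
# -vec:   positive samples file produced by opencv_createsamples
# -bg:    text file listing negative (background) images
# -w/-h:  sample size; must match what opencv_createsamples used
opencv_traincascade -data classifier_out -vec hands.vec -bg negatives.txt \
    -numPos 900 -numNeg 1800 -numStages 15 \
    -featureType LBP -w 24 -h 24
```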
If you have a lot of variation in orientation, you definitely need more data. You also need to understand the range of possible rotations: can your signs be upside down? Can they be rotated by 90 degrees? If your range is 30 degrees, maybe you can try 3 rotations of the image, or train 3 different detectors for each sign.
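The arithmetic behind "30 degrees, 3 rotations" can be made explicit. If one cascade pass tolerates roughly +/-10 degrees of tilt (the rough figure mentioned above, not an OpenCV constant), then covering signs tilted up to +/-30 degrees takes 3 passes spaced 20 degrees apart. A small helper, assuming those numbers:

```python
import math

def rotation_angles(max_tilt_deg, tol_deg=10.0):
    """Angles (in degrees) to rotate by so that passes with +/-tol_deg
    tolerance cover tilts in [-max_tilt_deg, +max_tilt_deg]."""
    n = max(1, math.ceil(max_tilt_deg / tol_deg))  # passes needed
    step = 2 * tol_deg                             # coverage per pass
    start = -step * (n - 1) / 2                    # center passes on zero
    return [start + i * step for i in range(n)]

print(rotation_angles(30))  # [-20.0, 0.0, 20.0] -> 3 passes, as above
```

The same angles tell you how many separate detectors to train if you go the multiple-detector route instead of rotating the image.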
Also, if you use Haar features, you may benefit from enabling the 45-degree (tilted) features. I believe they are off by default.
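The tilted set is controlled by opencv_traincascade's -mode flag, which defaults to BASIC; ALL enables the 45-degree features. Paths and counts below are placeholders:

```shell
# Hypothetical Haar training run with the tilted (45-degree) feature
# set enabled via -mode ALL (the default mode is BASIC).
opencv_traincascade -data classifier_out -vec hands.vec -bg negatives.txt \
    -numPos 900 -numNeg 1800 -numStages 15 \
    -featureType HAAR -mode ALL -w 24 -h 24
```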