python python-2.7 opencv object-detection haar-classifier

Haar classifier parameter tuning

I am trying to train a Haar classifier that detects Lego faces in images, but I am having a really hard time tuning the parameters.

I took pictures of 3 Lego figures (50 pictures each) and, using OpenCV, isolated their heads as 40x40 images.

Sample image is the following:

In addition, I took 500 empty background images to serve as the negative images in my dataset. I created the path lists and produced the samples.vec file as described in the OpenCV documentation.

After that, I tried to train my Haar classifier. I used these parameters, which I found in another similar project:

    opencv_traincascade -data classifier -vec samples.vec -bg negatives.txt \
      -numStages 10 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 1000 \
      -numNeg 600 -w 40 -h 40 -mode ALL -precalcValBufSize 1024 \
      -precalcIdxBufSize 1024

The results are really bad. The classifier detects Lego faces where there are none and, strangely, misses the faces that are actually there.

I am really struggling to tune this classifier: the parameter list is huge, and I have no idea how to choose values that lead to an effective classifier without taking ages to train.

Any help would be appreciated, especially on how to choose the parameters and what training time to expect on an "average" computer. Thank you for your time!

(P.S.: training took 2 hours, which I think is too fast and may be the cause of the bad performance.)

Solution

Have a look at my answer here - Generating good training data for haar cascades

If your training set is really the size of the picture you posted, then 40x40 is likely far too large for the width and height. 2 hours is OK for training, but with a minHitRate of 0.999 I wouldn't expect it to get through 10 stages that quickly.
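To make the effect of the per-stage parameters more concrete, here is a minimal sketch. It relies on the usual approximation that a cascade's overall rates are the per-stage rates multiplied across all stages, which treats the stages as independent. The function name `cascade_targets` is made up for illustration and is not part of OpenCV.

```python
def cascade_targets(num_stages, min_hit_rate, max_false_alarm_rate):
    """Approximate overall (hit rate, false alarm rate) of a cascade.

    Assumes each stage just meets its per-stage target and that the
    stages are independent, so the per-stage rates multiply.
    """
    overall_hit = min_hit_rate ** num_stages
    overall_false_alarm = max_false_alarm_rate ** num_stages
    return overall_hit, overall_false_alarm


# Values taken from the command in the question.
hit, false_alarm = cascade_targets(10, 0.999, 0.5)
print("approx. overall hit rate:         %.4f" % hit)
print("approx. overall false alarm rate: %.6f" % false_alarm)
```

With these settings every stage has to keep nearly all positives while rejecting half of the negatives it sees, which is a demanding target for a small training set.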

This is one of those problems where there isn't really a "right" answer. I would suggest higher resolution images though.

Answers to comment questions

First comment - Train your classifier on the kind of images you are going to be using, so if they are low-res then stick to low-res. It may just be a harder problem.

Second comment - If you are only detecting in one scene, i.e. if you have a Lego photo booth with a distinct background and you are only ever detecting in that, then use everything except the positive features, i.e. the background, as your negatives. It becomes an easier CV problem. When I say the negative set is "everything else", I mean relative to what you are trying to detect. If you are trying to detect Lego men walking around the streets of London, then you will need a much larger negative set than if they are all on the same background. It may even help to make your background a distinct colour or something; I'm not sure.
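Whichever negative set you settle on, the file passed to -bg is a plain text file with one image path per line. Below is a minimal sketch for generating it from a folder of background images. The helper name `write_background_file` and the extension list are assumptions for illustration, not part of OpenCV.

```python
import os


def write_background_file(image_dir, out_path,
                          extensions=(".jpg", ".jpeg", ".png", ".bmp")):
    """Write one image path per line and return how many were written.

    Paths are written as built from image_dir, so either pass an absolute
    directory or run the training from the same working directory.
    """
    paths = []
    for root, _dirs, files in os.walk(image_dir):
        for name in files:
            # Keep only files that look like images (case-insensitive).
            if name.lower().endswith(extensions):
                paths.append(os.path.join(root, name))
    paths.sort()
    with open(out_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return len(paths)
```

For example, `write_background_file("negatives", "negatives.txt")` would list every image under a hypothetical `negatives` folder. The returned count is useful for checking that -numNeg is not larger than the number of backgrounds you actually have.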