Tags: python, opencv, haar-classifier, cascade-classifier

Cascade Classifier HAAR LBP Advice


I am using OpenCV and Python to train Haar and LBP cascade classifiers to detect white blood cells in video frames. Since the problem is essentially 2D, it should be easier than developing other object classifiers, and the video frames are highly consistent.

So far I have been using this tutorial:

http://coding-robin.de/2013/07/22/train-your-own-opencv-haar-classifier.html

This is an example frame from the video, in which I am trying to detect the smaller bright objects: [image]

Positive images: number = 60, filetype = JPG, width = 50, height = 80 [three example images]

Negative images: number = 600, filetype = JPG, width = 50, height = 80 [three example images]

N.B. The negative images were extracted as random boxes from all frames in the video; I then simply deleted any that I considered to contain a cell, i.e. a positive image.

Having set up the images, I trained the classifier following the instructions on Coding Robin:

find ./positive_images -iname "*.jpg" > positives.txt

find ./negative_images -iname "*.jpg" > negatives.txt

perl bin/createsamples.pl positives.txt negatives.txt samples 1500 "opencv_createsamples -bgcolor 0 -bgthresh 0 -maxxangle 0.1 -maxyangle 0.1 -maxzangle 0.1 -maxidev 40 -w 50 -h 80"

find ./samples -name '*.vec' > samples.txt

./mergevec samples.txt samples.vec

opencv_traincascade -data classifier -vec samples.vec -bg negatives.txt \
-numStages 20 -minHitRate 0.999 -maxFalseAlarmRate 0.5 -numPos 60 \
-numNeg 600 -w 50 -h 80 -mode ALL -precalcValBufSize 16384 \
-precalcIdxBufSize 16384

This throws an error:

Train dataset for temp stage can not be filled. Branch training terminated.

If I change minHitRate and maxFalseAlarmRate, however, training completes and 'cascade.xml' is generated, with both Haar and LBP features.

To test the classifier on my image, I have a Python script:

import cv2

imagePath = "./examples/150224_Luc_1_MMImages_1_0001.png"
cascPath = "../classifier/cascade.xml"

leukocyteCascade = cv2.CascadeClassifier(cascPath)
image = cv2.imread(imagePath)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
leukocytes = leukocyteCascade.detectMultiScale(
    gray,
    scaleFactor=1.2,
    minNeighbors=5,
    minSize=(30, 70),
    maxSize=(60, 90),
    flags=cv2.CASCADE_SCALE_IMAGE  # cv2.cv.CV_HAAR_SCALE_IMAGE in OpenCV 2.x
)
print("Found {0} leukocytes!".format(len(leukocytes)))

# Draw a rectangle around the leukocytes
for (x, y, w, h) in leukocytes:
    cv2.rectangle(image, (x, y), (x+w, y+h), (0, 255, 0), 2)

cv2.imwrite('output_frame.png',image)

This is not finding the objects I want. When I have run it with different parameters, it has sometimes found 67 objects and other times 0, but never the ones I am trying to detect. Can anyone help me adjust the code to find the objects correctly? Many thanks.


Solution

  • Sometimes that error is thrown when you have not given OpenCV enough images. Your "train dataset" is too small for training to keep going.

    Can you provide more positive images? I think 60 can work for some objects, but normally people provide way more than that.

    Another thing I noticed in your commands is that you are providing negative images the exact same size as your positive samples. What createsamples does is place your positives on your negatives, and uses the negatives as backgrounds, hence the commonly used bg.txt filename. If you can use negatives larger than your positives, I would definitely try that.
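
To make the point about negative size concrete, here is a rough sketch that counts how many distinct training-window crops fit in a background image. It is only an illustration: the function name window_positions and the 640x480 frame size are hypothetical, and the count ignores any rescaling the trainer may apply, so it is not a description of how opencv_traincascade actually samples negatives.

```python
def window_positions(img_w, img_h, win_w, win_h, step=1):
    """Count distinct win_w x win_h crops in an img_w x img_h image
    when sliding by `step` pixels at the image's native scale."""
    if img_w < win_w or img_h < win_h:
        return 0
    cols = (img_w - win_w) // step + 1
    rows = (img_h - win_h) // step + 1
    return cols * rows


# Negatives the same size as the 50x80 training window: one crop each.
same_size = 600 * window_positions(50, 80, 50, 80)

# Hypothetical 640x480 background frames: many crops per image.
full_frame = 600 * window_positions(640, 480, 50, 80)

print(same_size, full_frame)
```

With window-sized negatives, the pool of background crops is no larger than the number of files, which is one plausible reason a later stage cannot be filled.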

    Your leukocytes are also blurry in the image you posted. If they are blurry in your positives and samples but not in the image you pass to detectMultiScale, then I doubt OpenCV will find them; at least it won't stick to them, so to speak. I see that that is the video you are using. You've probably done this, but make sure you hold the video still enough for OpenCV to work its magic!

    One other thing: I would begin by providing minimal parameters to these functions until you know exactly what you're doing. The defaults are there for a reason: they work for most people. Use the same -w and -h values for createsamples and traincascade, and make them small. Make sure only the object you are trying to detect is in your positives. OpenCV can detect all objects larger than the dimensions you specify. This will help you determine whether you're having some other sort of problem by reducing the number of things that could be wrong with your detection.

    Things to try:

    1. Simplify: remove all of those extra parameters you're providing, even if they're defaults.
    2. Specify the same width and height to your functions, and choose values that you know will be smaller than what you're attempting to detect.
    3. Make sure your positives and samples are not blurry.
    4. Provide more positives (and perhaps negatives too).
    5. One I didn't mention above: reduce minNeighbors (in detectMultiScale) to 2 or 3 just to see whether the cells are detected at all. If not, you more or less need to start over.
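
As a minimal sketch of items 1 and 5, the helper below runs detectMultiScale with only scaleFactor and minNeighbors set and reports how many boxes come back for a few minNeighbors values. The names sweep_min_neighbors and main are hypothetical, the paths passed to main are whatever your own files are, and scaleFactor=1.1 is what I believe the default to be, so treat it as an assumption.

```python
def sweep_min_neighbors(cascade, gray, values=(1, 2, 3, 5)):
    """Return {minNeighbors: number of detections} for each value tried."""
    counts = {}
    for n in values:
        # No minSize/maxSize/flags: keep the call as simple as possible.
        boxes = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=n)
        counts[n] = len(boxes)
    return counts


def main(cascade_path, image_path):
    # Imported here so the helper above can be exercised without OpenCV.
    import cv2

    cascade = cv2.CascadeClassifier(cascade_path)
    if cascade.empty():
        raise IOError("could not load cascade: " + cascade_path)

    image = cv2.imread(image_path)
    if image is None:
        raise IOError("could not read image: " + image_path)

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    for n, count in sorted(sweep_min_neighbors(cascade, gray).items()):
        print("minNeighbors={0}: {1} detections".format(n, count))
```

Calling main("../classifier/cascade.xml", "./examples/150224_Luc_1_MMImages_1_0001.png") would run it on the files from the question. If the count stays at zero even for minNeighbors=1, the problem is more likely in the training data than in the detection parameters.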