Tags: python, c++, opencv, computer-vision, object-detection

Different results from OpenCV in Python and C++ implementations for HOG object detection


I have implemented a HOG face detector in Python and C++ using OpenCV, and I tried to keep the code in both implementations as similar as possible. However, I am getting different results from the two: in Python it works correctly, but in C++ it produces completely incorrect detections. (Example output images, the first from Python and the second from C++, accompanied the original question.)

First, I trained an OpenCV linear SVM classifier for each implementation and saved the models in XML files. Then I extracted the coefficients (which are used to customize a HOG detector during testing) from the models (XML files), using this code for the Python implementation and this code for the C++ implementation. These coefficients become the input to setSVMDetector(const std::vector<float>& input_coefficients) during the testing process.
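
For reference, the C++ extraction linked above is commonly done by subclassing CvSVM (OpenCV 2.x), since the decision function is a protected member. The sketch below is a hedged reconstruction of what that LinearSVM helper presumably looks like, not the exact linked code; it builds the primal weight vector from the support vectors and appends the bias term in the layout setSVMDetector expects:

#include <opencv2/ml/ml.hpp>
#include <vector>

// Presumed LinearSVM helper: expose the primal coefficients of a trained
// linear CvSVM. The weight vector is w = -sum_i(alpha_i * sv_i), with rho
// appended as the bias (sign conventions follow the widely circulated
// OpenCV 2.x snippet).
class LinearSVM : public CvSVM {
public:
    void getSupportVector(std::vector<float>& support_vector) const {
        int sv_count = get_support_vector_count();
        const CvSVMDecisionFunc* df = decision_func; // protected in CvSVM
        const double* alphas = df[0].alpha;
        double rho = df[0].rho;
        int var_count = get_var_count();
        support_vector.assign(var_count, 0.0f);
        for (int r = 0; r < sv_count; ++r) {
            float alpha = (float)alphas[r];
            const float* v = get_support_vector(r);
            for (int j = 0; j < var_count; ++j)
                support_vector[j] += (float)(-alpha) * v[j];
        }
        support_vector.push_back((float)rho); // bias term expected by setSVMDetector
    }
};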

Ideally, these coefficients should be the same, since they are computed from the same dataset using OpenCV in both cases. I manually checked the coefficient values by saving them to text files for both implementations, and they are nearly identical. So I would expect my customized HOG detector to behave nearly the same in both implementations.
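
As an aside, the manual check above amounts to dumping each coefficient vector to a text file, one value per line, and comparing the two files. A minimal sketch of such a helper (the function name is an illustrative assumption):

#include <fstream>
#include <string>
#include <vector>

// Illustrative helper: write one coefficient per line so the Python and
// C++ vectors can be compared with a plain text diff.
void dumpCoeffs(const std::vector<float>& coeffs, const std::string& path) {
    std::ofstream out(path);
    for (size_t i = 0; i < coeffs.size(); ++i)
        out << coeffs[i] << "\n";
}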

Below is the test code for detecting a face in a test image in both implementations.

Python implementation:

import cv2
import pickle
import numpy as np

im = cv2.imread("..\\test_imgs\\1.png", 0)  # test image (grayscale)
hog = cv2.HOGDescriptor((96, 128), (16, 16), (8, 8), (8, 8), 9)

coeffs = pickle.load(open("coeffs_from_model", "rb"))  # load coeffs already computed from the model
hog.setSVMDetector(np.array(coeffs))  # customize the HOG detector

found, w = hog.detectMultiScale(im, winStride=(8, 8), padding=(32, 32), scale=4.05)
draw_detections(im, found)  # helper for drawing BBs on the image

C++ implementation:

cv::Mat im = cv::imread("..\\test_imgs\\1.png", 0); // test image (grayscale)
cv::HOGDescriptor hog(cv::Size(96, 128), cv::Size(16, 16), cv::Size(8, 8), cv::Size(8, 8), 9);

LinearSVM svm; // CvSVM subclass from the linked code (see the sketch above)
svm.load(model.c_str()); // load the trained model from its XML file

std::vector<float> coeffs;
svm.getSupportVector(coeffs); // compute coeffs from the model

hog.setSVMDetector(coeffs); // customize the HOG detector

std::vector<cv::Rect> found; // holds the detected BBs
hog.detectMultiScale(im, found, 0, cv::Size(8, 8), cv::Size(32, 32), 4.05); // hitThreshold=0, winStride=(8,8), padding=(32,32), scale=4.05
drawLocations(im, found, cv::Scalar(0, 255, 0)); // draw BBs on the image
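
The drawLocations() helper is not shown in the question; a minimal sketch of what it presumably does (the body is an assumption):

#include <opencv2/opencv.hpp>
#include <vector>

// Presumed implementation of drawLocations(): draw a rectangle around each
// detected bounding box.
void drawLocations(cv::Mat& img, const std::vector<cv::Rect>& locations,
                   const cv::Scalar& color)
{
    for (size_t i = 0; i < locations.size(); ++i)
        cv::rectangle(img, locations[i], color, 2);
}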

To check whether the coefficients computed in C++ were incorrect, I used them in the Python implementation and, interestingly, they worked just as well there. So now I do not understand why the HOG object in the C++ implementation is not working correctly despite having the correct coefficients.

I used the same initialization of the HOG object in C++ as in Python and kept the code nearly identical in both implementations, since they both use the same OpenCV.


Solution

  • I have the answer now. Both code snippets above are correct; the problem lies in the labels used for training. I was using "0" (face class) and "1" (non-face class) as the labels in both implementations.

    For the Python implementation, this labelling works correctly; however, for the C++ implementation, the labels apparently have to be "+1" (face class) and "-1" (non-face class), as I found here. In that example, at the end of the source code, the labels are provided as:

    // Set up training data
    float labels[4] = {1.0, -1.0, -1.0, -1.0};
    Mat labelsMat(4, 1, CV_32FC1, labels);
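
    Following that convention, the C++ training step should label the samples +1/-1 rather than 0/1. A minimal sketch under the OpenCV 2.x CvSVM API (the function name, filename, and parameter choices are illustrative assumptions, not the question's actual training code):

    #include <opencv2/opencv.hpp>
    #include <opencv2/ml/ml.hpp>

    // trainData: one row per sample, each row a HOG feature vector (CV_32FC1).
    // labels: one row per sample, CV_32FC1. The fix: use +1 for faces and
    // -1 for non-faces instead of 0/1.
    void trainLinearSvm(const cv::Mat& trainData, const cv::Mat& labels)
    {
        CvSVMParams params;
        params.svm_type    = CvSVM::C_SVC;
        params.kernel_type = CvSVM::LINEAR; // linear kernel, required by setSVMDetector
        params.term_crit   = cvTermCriteria(CV_TERMCRIT_ITER, 1000, 1e-6);

        CvSVM svm;
        svm.train(trainData, labels, cv::Mat(), cv::Mat(), params);
        svm.save("model.xml"); // illustrative filename for the saved model
    }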