python opencv svm histogram-of-oriented-gradients

How to use custom SVM detector with cv2.HOGDescriptor()?

I am following this tutorial and trying to use a custom SVM object detector instead of cv2.HOGDescriptor_getDefaultPeopleDetector(). This is how the tutorial sets the SVM detector:

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

Naturally, I tried to pass my own SVM detector here instead. This is the relevant training code:

# This problem still happens if I use the default parameters (not passing any parameters)
hog = cv2.HOGDescriptor(_winSize = (64, 64), 
                        _blockSize = (16, 16), 
                        _blockStride = (2, 2),
                        _cellSize = (8, 8),
                        _nbins = 9)

# All the images in the positive and negative sets are just the cropped bounding boxes
# which are resized to (128, 128)
positive_features = np.array([hog.compute(img) for img in positive_images])
negative_features = np.array([hog.compute(img) for img in negative_images])

feature_matrix = np.concatenate((positive_features, negative_features), axis=0)
# This is counter-intuitive, but the OpenCV SVM seems to expect 0 for the positive class and
# 1 for the negative class (when you wish to use it with cv2.detectMultiScale()).
labels = np.concatenate((np.zeros(len(positive_features)), np.ones(len(negative_features))), axis=0)

train_matrix = np.concatenate((feature_matrix, np.expand_dims(labels, 0).T), axis = 1)

np.random.seed(int(SEED))
np.random.shuffle(train_matrix)

feature_matrix = train_matrix[:, :-1]

labels = train_matrix[:, -1]

# OpenCV
feature_matrix = feature_matrix.astype(np.float32)  # Convert to 32-bit floating-point

labels = labels.astype(np.int32)  # Convert labels to 32-bit signed integer
model = cv2.ml.SVM_create()
model.setType(cv2.ml.SVM_C_SVC)
model.setKernel(cv2.ml.SVM_LINEAR)
model.setTermCriteria((cv2.TERM_CRITERIA_MAX_ITER, 100, 1e-6))

model.train(feature_matrix, cv2.ml.ROW_SAMPLE, labels)
model.save("model.svm")

During prediction:

# This problem still happens if I use the default parameters (not passing any parameters)
hog = cv2.HOGDescriptor(_winSize = (64, 64), 
                        _blockSize = (16, 16), 
                        _blockStride = (2, 2),
                        _cellSize = (8, 8),
                        _nbins = 9)
model = cv2.ml.SVM_load("model.svm")
support_vectors = model.getSupportVectors()
coefficients = -model.getDecisionFunction(0)[0]

coefficients = np.array(coefficients).reshape(1, -1)
svmdetector = np.concatenate((support_vectors, coefficients), axis=1)
hog.setSVMDetector(svmdetector.T.flatten())

But this gives me an exception:

    hog.setSVMDetector(svmdetector.T.flatten())
cv2.error: OpenCV(4.7.0) /io/opencv/modules/objdetect/src/hog.cpp:120: error: (-215:Assertion failed) checkDetectorSize() in function 'setSVMDetector'

I checked the source code and found that the assertion on this line raises the error. This is the definition of the check function:

bool HOGDescriptor::checkDetectorSize() const
{
    size_t detectorSize = svmDetector.size(), descriptorSize = getDescriptorSize();
    return detectorSize == 0 ||
        detectorSize == descriptorSize ||
        detectorSize == descriptorSize + 1;
}

I can't see any obvious reason why the error occurs. I also tried another detector that ships with OpenCV and could reproduce the same error. These are my findings:

hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector()) # Works [shape: (3781,)]
hog.setSVMDetector(cv2.HOGDescriptor_getDaimlerPeopleDetector()) # Doesn't work [shape: (1981,)]

Solution

After a fair amount of struggling, I figured out how to do this. Before getting to the solution, here are some pitfalls you may run into along the way:

The detectorSize must either equal the descriptorSize or exceed it by exactly one (the extra element is the bias term). See source.

bool HOGDescriptor::checkDetectorSize() const
{
    size_t detectorSize = svmDetector.size(), descriptorSize = getDescriptorSize();
    return detectorSize == 0 ||
        detectorSize == descriptorSize ||
        detectorSize == descriptorSize + 1;
}

If you check the implementation of getDescriptorSize(), you will find that the block size must be divisible by the cell size in both directions. The block stride must also be chosen so that (window size - block size) is divisible by it in both directions, so that the sliding block ends exactly at the window border.
The first two points are fairly intuitive, but there is one more requirement: _winSize must match the size of the training images (both positive and negative examples). If the images are larger than _winSize, hog.compute() evaluates more than one window per image and returns a longer feature vector, so the trained SVM ends up with more weights than a single window has features. When you later pass that model to a HOG descriptor with a smaller descriptor size, hog.setSVMDetector(...) fails with this assertion because the detector is too long for the descriptor.
Make sure you are providing the correct path to the SVM model.
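
To make the size rule concrete, here is a small sketch that mirrors the arithmetic of getDescriptorSize() in plain Python. The function name hog_descriptor_size is hypothetical (it is not part of OpenCV). It only reproduces the two divisibility checks and the product described above, so treat it as a sanity check rather than a replacement for hog.getDescriptorSize().

```python
def hog_descriptor_size(win_size, block_size, block_stride, cell_size, nbins):
    """Number of HOG features for ONE detection window (hypothetical helper).

    All sizes are (width, height) tuples, as in cv2.HOGDescriptor.
    """
    for w, b, s, c in zip(win_size, block_size, block_stride, cell_size):
        if b % c != 0:
            raise ValueError("block size must be a multiple of the cell size")
        if (w - b) % s != 0:
            raise ValueError("(window - block) must be a multiple of the block stride")

    # Cells inside one block, e.g. (16 / 8) * (16 / 8) = 4.
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    # Block positions inside one window, per axis: (window - block) / stride + 1.
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    return nbins * cells_per_block * blocks_x * blocks_y


# Default HOGDescriptor() window of 64x128: the default people detector has 3780 + 1 values.
print(hog_descriptor_size((64, 128), (16, 16), (8, 8), (8, 8), 9))  # 3780
# The Daimler detector is trained for 48x96 windows, which gives 1980 + 1 values.
print(hog_descriptor_size((48, 96), (16, 16), (8, 8), (8, 8), 9))   # 1980
```

This also explains the two findings in the question: a default cv2.HOGDescriptor() has a 64x128 window, so the 3781-element default people detector passes the check, while the 1981-element Daimler detector only fits a descriptor created with a 48x96 window.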

With all of that in mind, a valid configuration is:

WIN_SIZE = (96, 96)
hog = cv2.HOGDescriptor(_winSize = WIN_SIZE, 
                        _blockSize = (16, 16), 
                        _blockStride = (8, 8),
                        _cellSize = (8, 8),
                        _nbins = 9)

# Resize all of the training images to the same size as `_winSize`,
# so that hog.compute() returns exactly one window's worth of features per image.
positive_images = [cv2.resize(img, WIN_SIZE) for img in positive_images]
negative_images = [cv2.resize(img, WIN_SIZE) for img in negative_images]

positive_features = np.array([hog.compute(img) for img in positive_images])
negative_features = np.array([hog.compute(img) for img in negative_images])
...
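
For the prediction side, the following is a minimal sketch of how the detector vector could be assembled and checked before it is handed to setSVMDetector(). The helper name build_hog_detector is hypothetical. It assumes a linear kernel, for which OpenCV stores a single compressed support vector, and it keeps the sign convention from the question's code (the bias is appended as -rho), which I have not verified beyond that setup.

```python
import numpy as np


def build_hog_detector(support_vectors, rho, descriptor_size):
    """Flatten a linear SVM into the 1-D float32 vector that setSVMDetector() accepts.

    support_vectors: array of shape (1, descriptor_size), e.g. model.getSupportVectors()
    rho:             first element returned by model.getDecisionFunction(0)
    descriptor_size: hog.getDescriptorSize() of the descriptor that will use the detector
    """
    weights = np.asarray(support_vectors, dtype=np.float32).reshape(-1)
    if weights.size != descriptor_size:
        # This is the situation that triggers the checkDetectorSize() assertion.
        raise ValueError(
            f"SVM has {weights.size} weights but the HOG descriptor has "
            f"{descriptor_size} features; check that _winSize matches the training images"
        )
    # Append the bias term, giving descriptor_size + 1 values in total.
    return np.append(weights, np.float32(-rho)).astype(np.float32)
```

With the objects from the question this would be used as rho, _, _ = model.getDecisionFunction(0) followed by hog.setSVMDetector(build_hog_detector(model.getSupportVectors(), rho, hog.getDescriptorSize())). The explicit size check turns the opaque assertion into an error message that points at the mismatch.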