opencv image-processing computer-vision ransac feature-matching

Why is cv::warpPerspective outputting different values for the same inputs in Python vs. C++?


Context:

I am trying to align an image using another image as reference, using ORB + RANSAC. I use Python to prototype and C++ to deploy to Android via dart:ffi. However, the result is good in Python but not as good in C++, even though I think I wrote equivalent code.


Configs:

C++

  • Opencv 4.8.0 (downloaded from opencv.org and compiled with mingw)
  • Compiler: mingw
  • CMAKE

Python

  • 3.10.11
  • opencv-python == 4.8.0.76 (the closest to 4.8.0 I've found);

Inputs

The two input images are here

Outputs

I was expecting the same output for the two codes, but that's not the case. Notice the black border on the bottom right in the cpp image.

cpp_warped, py_warped


Code

C++

#include <algorithm>  // std::sort
#include <string>
#include <vector>

#include <opencv2/opencv.hpp>

using std::string;
using cv::Mat;

Mat warp(Mat source, Mat reference){
    
    cv::Ptr<cv::ORB> detector = cv::ORB::create(5000);

    cv::Mat d1, d2;
    std::vector<cv::KeyPoint> kp1, kp2;

    detector->detectAndCompute(source, cv::Mat(), kp1, d1);
    detector->detectAndCompute(reference, cv::Mat(), kp2, d2);

    cv::Ptr<cv::BFMatcher> matcher = cv::BFMatcher::create(cv::NORM_HAMMING, true);
    std::vector<cv::DMatch> matches;
    matcher->match(d1, d2, matches);

    // sort by distances (ascending)
    std::sort(matches.begin(), matches.end(), [] (const cv::DMatch& a, const cv::DMatch& b){
        return a.distance < b.distance;
    });

    // take only top 90% matches (truncating, like int() in the Python version)
    int n_matches = static_cast<int>(matches.size() * 0.9);

    std::vector<cv::Point2f> p1, p2;

    for(int i = 0; i < n_matches; i++){
        p1.push_back(kp1[matches[i].queryIdx].pt);
        p2.push_back(kp2[matches[i].trainIdx].pt);
    }

    cv::Mat homography = cv::findHomography(p1, p2, cv::RANSAC);

    cv::Mat warped_mat;
    cv::warpPerspective(source, warped_mat, homography, cv::Size(reference.cols, reference.rows));

    cv::convertScaleAbs(warped_mat, warped_mat);

    return warped_mat;
}

int main(int argc, char const *argv[])
{
    Mat source = cv::imread("../source.jpg", cv::IMREAD_GRAYSCALE);
    Mat reference = cv::imread("../reference.png", cv::IMREAD_GRAYSCALE);

    // cv::imshow("source", source);
    // cv::waitKey(0);
    // cv::destroyAllWindows();

    Mat warped = warp(source, reference);


    cv::imwrite("cpp_warped.png", warped);
    return 0;
}

Python

import cv2 as cv
import numpy as np

def warp(source, reference):
    detector = cv.ORB.create(nfeatures=5000)

    kp1, d1 = detector.detectAndCompute(source, None)
    kp2, d2 = detector.detectAndCompute(reference, None)

    matcher = cv.BFMatcher(cv.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(d1, d2)
    matches = sorted(matches, key=lambda x: x.distance)

    #take top 90%
    matches = matches[:int(len(matches) * 0.9)]
    n_matches = len(matches)

    p1 = np.zeros((n_matches, 2))
    p2 = np.zeros((n_matches, 2))

    for i in range(n_matches):
        p1[i, :] = kp1[matches[i].queryIdx].pt
        p2[i, :] = kp2[matches[i].trainIdx].pt

    homography, _ = cv.findHomography(p1, p2, cv.RANSAC)

    height, width = reference.shape
    warped = cv.warpPerspective(source, homography, (width, height))
    warped = cv.convertScaleAbs(warped)

    return warped

if __name__ == "__main__":
    source = cv.imread("source.jpg", cv.IMREAD_GRAYSCALE)
    reference = cv.imread("reference.png", cv.IMREAD_GRAYSCALE)
    warped = warp(source, reference)
    cv.imwrite("py_warped.png", warped)

I double-checked the default cv::ORB constructor parameters in the Python bindings and in OpenCV C++; they are the same. I also tried changing some parameters in the C++ code, but didn't get any result as good as in Python.

Edit:

Output of getBuildInformation: python, c++

Output of drawMatches: py_matches cpp_matches


Solution

  • Your sheet of paper is warped. It's not a flat plane in space. You put it half on top of your keyboard, half on the desk. It's warped.

    RANSAC and similar robust estimators will latch onto the largest cluster of agreeing matches. Those come from the textured area of the sheet, which is the multiple-choice bubbles and the header.

    The registration marks should be discarded as outliers by any algorithm that is given non-specific extracted features.

    The nondeterministic nature of the solver algorithms means that you'd have to run all of this multiple times, then calculate statistics on the outcomes, and compare those. When you compare single results that are based on noise/randomness (that is different for each case!), you can't make any general statements.
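The "run it multiple times and compare statistics" idea could be sketched roughly as below. This is only an illustration: `summarize_runs` is a hypothetical helper, not part of OpenCV, and it assumes each run returns a 3x3 homography with a nonzero bottom-right entry. If the estimator turns out to be deterministic for identical inputs in a given build, the spread will simply be zero, which is informative in itself.

```python
import statistics


def summarize_runs(estimate, n_runs=20):
    """Call estimate() n_runs times; return per-entry (mean, stdev) of the 3x3 results.

    `estimate` is any zero-argument callable returning a 3x3 matrix
    (nested lists or a NumPy array), or None when estimation failed.
    """
    samples = [[[] for _ in range(3)] for _ in range(3)]
    for _ in range(n_runs):
        H = estimate()
        if H is None:  # failed run: skip it
            continue
        scale = float(H[2][2])  # normalise so that H[2][2] == 1 (assumed nonzero)
        for r in range(3):
            for c in range(3):
                samples[r][c].append(float(H[r][c]) / scale)
    if not samples[0][0]:
        raise ValueError("no successful runs to summarize")
    mean = [[statistics.fmean(samples[r][c]) for c in range(3)] for r in range(3)]
    stdev = [[statistics.pstdev(samples[r][c]) for c in range(3)] for r in range(3)]
    return mean, stdev


# Hypothetical usage with the point arrays p1, p2 from the question's code:
#   import cv2 as cv
#   mean, stdev = summarize_runs(lambda: cv.findHomography(p1, p2, cv.RANSAC)[0])
```

Doing the same on the C++ side and comparing the two means against the per-entry spread gives a fairer comparison than looking at one warped image from each.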


    Your code lacks Lowe's ratio test. It discards weak matches, those that aren't clear winners. Browse samples/python/find_obj.py in the OpenCV source tree (e.g. on GitHub) for details on how this is done.
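A minimal sketch of the ratio test, in the spirit of that sample but not copied from it. The helper names `ratio_test` and `match_with_ratio_test` are made up for illustration, and the 0.75 threshold is a commonly used value that you would likely need to tune. Note that the matcher has to be created with crossCheck=False so that knnMatch can return the two nearest neighbours.

```python
def ratio_test(knn_matches, ratio=0.75):
    """Keep a match only if it is clearly better than the runner-up.

    `knn_matches` is a sequence of (best, second_best) pairs whose items
    have a `.distance` attribute, e.g. the output of knnMatch(..., k=2).
    """
    good = []
    for pair in knn_matches:
        if len(pair) < 2:  # no runner-up to compare against
            continue
        best, second = pair[0], pair[1]
        if best.distance < ratio * second.distance:
            good.append(best)
    return good


def match_with_ratio_test(d1, d2, ratio=0.75):
    """Brute-force Hamming matching of ORB descriptors plus the ratio test."""
    import cv2 as cv  # imported here so ratio_test stays usable without OpenCV

    matcher = cv.BFMatcher(cv.NORM_HAMMING, crossCheck=False)
    knn = matcher.knnMatch(d1, d2, k=2)  # two nearest neighbours per descriptor
    return ratio_test(knn, ratio)
```

In the question's `warp` function, the result of `match_with_ratio_test(d1, d2)` could stand in for the sorted-and-truncated `matches` list before the points are collected for `findHomography`.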