Tags: c++, opencv, grayscale, hdl

Color to grayscale conversion


I'm using a C++ OpenCV program for first-principles algorithm development for HDL (Verilog) image object detection. I've finally managed to get the HDL version up to the point of Canny detection. To validate the two against each other, both need to produce identical output. I have found there are subtle differences that I think are caused by the OpenCV imread colour-to-grayscale conversion biasing green. The smoothed image is overall brighter in the OpenCV C++ method. From looking at the rgb2gray method, it appears OpenCV uses a weighting, i.e. (R*X+G*Y+B*Z)/3, while in HDL I have been using (R+G+B)/3, as I require it to complete the Gaussian, Sobel and Canny filters. Human visualisation is secondary, and multiplication by a non-integer is undesirable.

Is there a standard linear grayscale conversion, or a means to override the existing method? ...

int main()
{
            int thold = 15;

            clock_t start;
            double duration;
            const int sobelX[3][3] = { {-1, 0, 1}, {-2, 0, 2}, {-1, 0, 1} };  // Were originally floats in Python
            const int sobelY[3][3] = { {-1, -2, -1}, {0, 0, 0}, {1, 2, 1} };  // Were originally floats in Python
            const int kernel[5][5] = { {1, 6, 12, 6, 1},
                                       {6, 42, 79, 42, 6},
                                       {12, 79, 148, 79, 12},
                                       {6, 42, 79, 42, 6},
                                       {1, 6, 12, 6, 1} };  // Entries sum to 732, so normalise by 1/732
            // Above: normalised kernel for smoothing, see original Python script for method
            start = std::clock();
            int height, width, intPixel, tSx, tSy, tS, dirE, dirEE, maxDir, curPoint, contDirection, cannyImgPix, nd, tl, tm, tr, mr, br, bm, bl, ml = 0;
            int contNum = 128;
            int contPixCount = 0;
            int curContNum = 0;
            int contPlace = 0;
            int oldContPlace = 0;
            int g = 0;
            bool maxPoint;
            struct pixel {
                        int number;
                        int h;
                        int w;

            };
            std::vector<pixel> contourList;
            //double floatPixel = 0.0;
            int kernalCumulator = 0;
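            const int mp = 3; // Kernel centre offset. Note: a 5x5 kernel is centred at index 2, so 3 shifts the window by one pixel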
        //  Scalar color(0, 0, 255);
            //          duration = ((clock()) - start) / (double)CLOCKS_PER_SEC;
            //          start = clock();
            //          cout << "Start image in" << duration << '\n';
            //          Mat dst;
            Mat rawImg = imread("C:\\Users\\&&&\\Documents\\pycode\\paddedGS.png",0);
            printf("%d",rawImg.type());

//          Mat rawImg = imread("C:\\Users\\&&&\\Documents\\openCV_Master\\openCVexample\\openCVexample\\brace200.jpg ", 0);
            height = rawImg.rows;
            width = rawImg.cols;
            cout << "Height of image " << height << '\n';
            cout << "Width of image " << width << '\n';
            Mat filteredImg = Mat::zeros(height, width, CV_8U);
            printf("%d", filteredImg.type());
            Mat sobelImg = Mat::zeros(height, width, CV_8U);
            Mat directionImg = Mat::zeros(height, width, CV_8U);
            Mat cannyImg = Mat::zeros(height, width, CV_8U);
            Mat contourImg = Mat::zeros(height, width, CV_16U);

//          rawImg.convertTo(rawImg, CV_8UC1);

            duration = ((clock()) - start) / (double)CLOCKS_PER_SEC;
            start = clock();
            cout << "Start image in" << duration << '\n';
            // Loop to threshold already grayscaled image           
            /*
            for (int h = 0; h < (height); h++)
            {
                        for (int w = 0; w < (width); w++)
                        {
                                    g = (int)rawImg.at<uchar>(h, w,0);
                                    cout << g << "g";
                                    g+= (int)rawImg.at<uchar>(h, w, 1);
                                    cout << g << "g";
                                    g+= (int)rawImg.at<uchar>(h, w, 2);
                                    cout << g << "g";
                                    g = g/3;
                                    rawGImg.at<uchar>(h,w) = g;
                        }
            }

            */
            //          imshow("thresholded Image", rawImg);
            //          waitKey();
            // Loop to smooth using Gaussian 5 x 5 kernel

//          imshow("raw Image", rawImg);


            for (int h = 3; h < (height - 3); h++)
            {
                        for (int w = 3; w < (width - 3); w++)
                        {
                                    if (rawImg.at<uchar>(h, w) >=6 )//Thresholding included
                                    {
                                                for (int xk = 0; xk < 5; xk++)
                                                {
                                                            for (int yk = 0; yk < 5; yk++)
                                                            {
                                                                        intPixel = rawImg.at<uchar>((h + (xk - mp)), (w + (yk - mp)));
                                                                        kernalCumulator += intPixel*(kernel[xk][yk]);// Multiplier required as rounding is making the number go above 255, better solution?
                                                            }
                                                }
                                    }
                                    else
                                                kernalCumulator = 0;

                                    kernalCumulator = kernalCumulator / 732;
                                    if (kernalCumulator < 0 || kernalCumulator > 255)
                                    {
        //                                      cout << "kernal Value: " << kernalCumulator;
            //                                  cout << " intPixel:" << intPixel << '\n';
                                    }
                                    filteredImg.at<uchar>(h, w) = (uchar)kernalCumulator;
                                    kernalCumulator = 0;
                        }
            }

Solution

  • Human vision does not perceive brightness linearly, so for typical applications it makes sense to use a transformation that tries to mimic human perception. OpenCV's default conversion does this by weighting green most heavily (0.299 R + 0.587 G + 0.114 B).

    For your application, you have two options: either use a similar weighted transformation in HDL (which might not be easy or desirable), or write a custom RGB-to-grayscale conversion for OpenCV that uses the same transformation as your HDL.

    A short snippet for this (a sketch; you'll have to adapt the details to your setup) would be something like:

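    #include <opencv2/core.hpp>

    cv::Mat linearRgbToGray(const cv::Mat &color) {
        // Expects an 8-bit, 3-channel image (CV_8UC3); the sum is done in int, so it cannot overflow.
        cv::Mat gray(color.size(), CV_8UC1);
        for (int i = 0; i < color.rows; i++)
            for (int j = 0; j < color.cols; j++) {
                const cv::Vec3b &px = color.at<cv::Vec3b>(i, j);
                gray.at<uchar>(i, j) = static_cast<uchar>((px[0] + px[1] + px[2]) / 3);
            }
        return gray;
    }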
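
If the HDL side were instead to adopt a green-weighted conversion (the first option), the non-integer multiplications could be avoided with fixed-point weights. The following is a minimal sketch, not OpenCV code: the function names are hypothetical, the buffer is assumed to hold interleaved 8-bit B,G,R bytes (the order OpenCV uses for colour images), and the 77/150/29 weights are a common 8-bit approximation of 0.299/0.587/0.114 that is not guaranteed to be bit-identical to OpenCV's internal rounding.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: plain average, matching the (R+G+B)/3 used in the HDL.
inline std::uint8_t averageGray(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    return static_cast<std::uint8_t>((r + g + b) / 3);
}

// Hypothetical helper: integer-only weighted gray.
// 77 + 150 + 29 == 256, so the division becomes a right shift by 8.
inline std::uint8_t weightedGray(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    const unsigned acc = 77u * r + 150u * g + 29u * b + 128u; // +128 rounds to nearest
    return static_cast<std::uint8_t>(acc >> 8);
}

// Converts an interleaved B,G,R byte buffer (assumed layout) to one gray byte per pixel.
inline std::vector<std::uint8_t> bgrToGray(const std::vector<std::uint8_t>& bgr, bool weighted) {
    assert(bgr.size() % 3 == 0 && "buffer must hold whole B,G,R triples");
    std::vector<std::uint8_t> gray;
    gray.reserve(bgr.size() / 3);
    for (std::size_t i = 0; i < bgr.size(); i += 3) {
        const std::uint8_t b = bgr[i], g = bgr[i + 1], r = bgr[i + 2];
        gray.push_back(weighted ? weightedGray(r, g, b) : averageGray(r, g, b));
    }
    return gray;
}
```

For a pure-green pixel (R=0, G=255, B=0) the plain average gives 85 while the weighted version gives 149, which is consistent with a green-weighted conversion producing a brighter image on green-heavy content. Whichever formula is chosen, the same integer arithmetic would need to be used on both the C++ and HDL sides for the outputs to match.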