opencv image-processing raspberry-pi camera

640 x 480 image formation from OV5647 raspberry pi camera v1


I want to figure out exactly what is happening when I create a 640x480 pixel image from the OV5647 Pi camera v1.

This is what I think so far:

We start with the full FoV 2592x1944 pixel resolution with aspect ratio 4:3.

The 640x480 image also has a 4:3 aspect ratio and is based on the full FoV.

We start by binning:

| Width | Height |
|----|----|
|2592|1944|
|1296|972|
|648|486|

e.g. 2592/2 = 1296; 1296/2 = 648

1944/2 = 972; 972/2=486

So after binning we get a resolution of 648 x 486, but we want the output to be 640 x 480, so we have to deal with the extra 8 pixels horizontally and the extra 6 pixels vertically in the binned image.
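The arithmetic above can be written out as a small sketch. The names `Size`, `bin2x2` and `excess` are hypothetical helpers made up for illustration, and the sketch assumes each binning step halves both dimensions exactly, as supposed in this question; it does not query the sensor or any camera API.

```cpp
// Hypothetical helper type for this sketch; not part of any camera API.
struct Size {
  int width;
  int height;
};

// One 2x2 binning step, assumed to halve both dimensions.
constexpr Size bin2x2(Size s) { return {s.width / 2, s.height / 2}; }

// Pixels left over per axis when the binned frame is larger than the target.
constexpr Size excess(Size binned, Size target) {
  return {binned.width - target.width, binned.height - target.height};
}
```

Applying `bin2x2` twice to `Size{2592, 1944}` gives the 648 x 486 frame, and `excess` against `Size{640, 480}` gives the 8 columns and 6 rows that still have to be cropped or scaled away.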

My question is: what actually happens to create an output frame when a 640 x 480 video is recorded with raspivid, e.g. with the console command:

raspivid -o myvid.h264 -w 640 -h 480 -t 60000

If possible, could someone also explain the slight variation I see between 640 x 480 images created with raspivid and with OpenCV 4.0.0? The content of the images seems somewhat different, e.g. slightly displaced. I am not sure whether this is a simple displacement (the two outputs taken from slightly different parts of the FoV) or whether one of the outputs actually scales the 648x486 binned image to generate the 640x480 result. I have assumed that only binning and FoV cropping are done, but actual scaling is a possibility too, especially for OpenCV. The camera image is captured with OpenCV 4.0.0 using this code:

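// Requires #include <opencv2/videoio.hpp>
cv::VideoCapture* video_capture_cap_;
video_capture_cap_ = new cv::VideoCapture();
video_capture_cap_->open(0);  // open the default camera (index 0)
if (video_capture_cap_->isOpened()) {
  // Request 640x480 at 49 fps; the backend may not honor these values exactly.
  video_capture_cap_->set(cv::CAP_PROP_FRAME_WIDTH, 640);
  video_capture_cap_->set(cv::CAP_PROP_FRAME_HEIGHT, 480);
  video_capture_cap_->set(cv::CAP_PROP_FPS, 49);
}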

Solution

  • It would seem the answer is the following:

    raspivid -o myvid.h264 -w 640 -h 480 -t 60000

    produces a video file of the .h264 format.

    Frames of the video are produced by 4x4 binning followed by scaling.

    As output is 640 x 480 frames, the following is done.

    2592/2 = 1296; 1296/2 = 648

    1944/2 = 972; 972/2=486

    then scale 648 x 486 to 640 x 480 using a scale factor of (640.0/648.0), which is the same ratio as (480.0/486.0).

    I am not sure whether this agrees with the raspivid documentation, which seems to suggest the contrary: from the documentation I was expecting cropping to the requested image size, not scaling. However, inspecting the video output suggests that scaling takes place rather than cropping.

    The method was to record a video of a camera calibration checkerboard as above, extract frames from the video, and compare the size and position of the checkerboard squares with those in the corresponding image taken via cv::VideoCapture.

    The image from cv::VideoCapture was formed as described by Christoph Rackwitz, i.e.

    do binning as above to get a 648 x 486 image, then crop to the desired 640 x 480 image size. Cropping appears to take the central 640 x 480 image region, i.e. it drops the first 3 rows and the last 3 rows of the 648 x 486 image and drops the first and last 4 columns in each row.
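To make the difference between the two paths concrete, here is a minimal sketch of how a point in the 648 x 486 binned frame would land in the 640 x 480 output under each hypothesis. The names `Point`, `mapCrop`, `mapScale` and `scaledLength` are hypothetical, and the scale mapping is a plain coordinate multiplication that ignores pixel-centre conventions and interpolation; it is an illustration of the reasoning above, not what raspivid or OpenCV actually execute.

```cpp
// Frame sizes taken from the text above.
constexpr int kBinnedW = 648, kBinnedH = 486;
constexpr int kOutW = 640, kOutH = 480;

// Hypothetical point type (binned-frame or output-frame coordinates).
struct Point {
  double x;
  double y;
};

// Central crop: 4 columns are dropped on the left and 3 rows on the top,
// so every point shifts by (-4, -3) and sizes are unchanged.
constexpr Point mapCrop(Point p) {
  return {p.x - (kBinnedW - kOutW) / 2, p.y - (kBinnedH - kOutH) / 2};
}

// Uniform scale by 640/648 horizontally and 480/486 vertically (same ratio).
constexpr Point mapScale(Point p) {
  return {p.x * kOutW / kBinnedW, p.y * kOutH / kBinnedH};
}

// Length of a horizontal feature after scaling; cropping would leave it as is.
constexpr double scaledLength(double len) { return len * kOutW / kBinnedW; }
```

Under these assumptions the frame centre (324, 243) maps to (320, 240) in both cases, so the two outputs agree in the middle and drift apart towards the edges, by up to 4 pixels horizontally and 3 vertically. A checkerboard square 81 pixels wide in the binned frame stays 81 pixels wide after cropping but becomes 80 pixels wide after scaling, which is the kind of size difference the checkerboard comparison looks for.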