Tags: algorithm, math, computer-vision, camera, geometry

Camera Geometry: Algorithm for "object area correction"


For the past few months I have been working on a project that calculates the top-surface area of objects captured from a top view with a 3D depth camera.

The workflow of my project:

  1. Capture an image of a group of objects (RGB and depth data) from a top view.

  2. Run instance segmentation on the RGB image.

  3. Calculate the real-world area of each segmented mask from the depth data (a rough sketch of this step follows).
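
A minimal sketch of step 3, assuming a pinhole camera model with focal lengths fx and fy in pixels (these intrinsic names and the function are placeholders, not my exact pipeline): a pixel that sees a surface at depth Z covers roughly (Z/fx)*(Z/fy) square meters, so summing that over the mask gives an area estimate.

    import numpy as np

    def mask_area_from_depth(depth, mask, fx, fy):
        """Rough real-world area (m^2) covered by a segmentation mask.

        depth : HxW depth image in meters (0 or NaN where invalid)
        mask  : HxW boolean array from instance segmentation
        fx,fy : focal lengths in pixels
        """
        z = depth[mask]
        z = z[np.isfinite(z) & (z > 0)]      # drop invalid depth readings
        pixel_area = (z / fx) * (z / fy)     # approximate footprint of one pixel at depth z
        return float(pixel_area.sum())

Note that this naive per-pixel sum is exactly what over-counts the tilted side surfaces of off-center objects, which is the problem described below.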

Some problems in the project:

  • All given objects have different shapes.
  • As an object moves toward the edge of the image, its side, not just its top, becomes visible.
  • Because of this, the segmented mask area gradually increases.
  • As a result, the calculated area of an object near the image border comes out larger than that of an object in the center.

In the example image, object 1 is located near the center of the field of view, so only its top is visible, while object 2 is located toward the edge of the field of view, so part of its top is hidden and its side is visible.

Because of this, the segmented mask area is larger for objects on the periphery than for objects in the center.

I only want to find the area of the top of an object.

Example of what I want:

fig 2

Is there a way to geometrically correct the area of an object located near the edge of the image?

I tried to calibrate by multiplying the calculated area by a factor based on the angle between vector 1 (from the center of the camera lens to the center of the floor) and vector 2 (from the center of the lens to the center of gravity of the target object). However, I gave up because I could not logically justify how much correction was needed.
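
For reference, the angle between those two vectors can be computed as below (the point names are placeholders); what I could not justify is how to turn this angle into a correction factor.

    import numpy as np

    def correction_angle(cam_center, floor_center, obj_centroid):
        """Angle (radians) between vector 1 (lens -> floor center) and vector 2 (lens -> object centroid)."""
        v1 = np.asarray(floor_center, float) - np.asarray(cam_center, float)
        v2 = np.asarray(obj_centroid, float) - np.asarray(cam_center, float)
        cos_t = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
        return np.arccos(np.clip(cos_t, -1.0, 1.0))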

fig 3


Solution

  • What I would do is convert your RGB and depth image into a 3D mesh / point cloud (a surface with bumps) using your camera settings (FOVs, focal length),

    and then project it onto the ground plane (perpendicular to the camera view direction at the middle of the screen). To obtain the ground plane, simply take three 3D positions on the ground, p0, p1, p2 (forming a triangle), and use the cross product to compute the ground normal:

    n = normalize(cross(p1-p0,p2-p1))
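
    A rough numpy sketch of the depth-to-point-cloud conversion and this normal computation, assuming a simple pinhole model with focal lengths fx, fy and principal point cx, cy in pixels (these intrinsics and function names are placeholders, not something fixed by your setup):

        import numpy as np

        def depth_to_points(depth, fx, fy, cx, cy):
            """Back-project an HxW depth image (meters) into an (H*W, 3) point cloud."""
            h, w = depth.shape
            v, u = np.mgrid[0:h, 0:w]              # pixel row (v) and column (u) grids
            x = (u - cx) * depth / fx
            y = (v - cy) * depth / fy
            return np.stack([x, y, depth], axis=-1).reshape(-1, 3)

        def ground_normal(p0, p1, p2):
            """Unit normal of the plane through three ground points (the cross product above)."""
            n = np.cross(p1 - p0, p2 - p1)
            return n / np.linalg.norm(n)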
    

    now your plane is defined by p0 and n, so convert each 3D coordinate like this:

    (projection diagram)

    by moving each point along the plane normal by its signed distance to the plane (this works regardless of which way n points), something like this:

    p' = p - n * dot(p-p0,n)
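
    A sketch of that projection plus a simple top-area estimate, assuming the object's segmented pixels were already back-projected to 3D points as above; the convex hull is only a rough stand-in for whatever area measure suits your object shapes:

        import numpy as np
        from scipy.spatial import ConvexHull

        def top_area_on_ground(points, p0, n):
            """Project (N,3) points onto the ground plane (p0, n) and return the hull area of the footprint."""
            n = n / np.linalg.norm(n)
            proj = points - np.outer((points - p0) @ n, n)   # p' = p - n*dot(p-p0,n)
            # build an orthonormal 2D basis (u, v) inside the plane to measure area in
            u = np.cross(n, [1.0, 0.0, 0.0])
            if np.linalg.norm(u) < 1e-6:                     # n happened to be parallel to the x axis
                u = np.cross(n, [0.0, 1.0, 0.0])
            u = u / np.linalg.norm(u)
            v = np.cross(n, u)
            uv = np.column_stack([(proj - p0) @ u, (proj - p0) @ v])
            return ConvexHull(uv).volume                     # for 2D points, .volume is the area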
    

    That should eliminate the problem with visible sides at the edges of the FOV. However, you should also take into account that when a side is visible, part of the top is hidden as well. To remedy that, you might also find the object's axis of symmetry, use just the half of the top that is not partially hidden, and simply multiply the measured half-area by 2 ...