Tags: python, camera, computer-vision, camera-calibration

Get physical distance from camera intrinsics and camera distance


I have a 3x3 camera intrinsics matrix K, i.e. I have fx, fy, cx, cy, all in pixel units. The camera is facing a plane, and the distance from the camera to the plane is d metres. Suppose two pixels are dx and dy pixels apart in the x and y directions. What are the physical x and y distances between the two corresponding points on the plane? Is this information enough to calculate the physical distance?

Example:

height = 720
width = 1280
fx = 1.06935339e+03
fy = 1.07107059e+03
cx = 6.29035115e+02
cy = 3.54614962e+02

d = 0.7

dx = 168
dy = 39

Solution

  • Assuming a pinhole camera model, you also need the width and height of the camera's image sensor (CCD). Let's call these widthCCD and heightCCD respectively.

    You need to do two steps:

    1. Figure out the physical 3D projection (physical point onto the camera sensor CCD)
    2. Figure out the image projection (CCD into image pixel space)

    Let's assume you have two pixels in the image, (u1, v1) and (u2, v2). These map to the points (uc1, vc1) and (uc2, vc2) on the CCD sensor, which in turn correspond to the physical 3D points (X1, Y1, Z1) and (X2, Y2, Z2), as follows:

    Note: Z1 = Z2 = d = 0.7 (based on your information provided)


    Physical Projection (3D to CCD):

    (uc1, vc1) -> (fx * X1/Z1, fy * Y1/Z1)
    (uc2, vc2) -> (fx * X2/Z2, fy * Y2/Z2)
    

    Image Projection (CCD to Image):

    (u1, v1) -> (uc1 * width/widthCCD + cx, vc1 * height/heightCCD + cy)
    (u2, v2) -> (uc2 * width/widthCCD + cx, vc2 * height/heightCCD + cy)
    

    By applying substitution you can arrive at:

    (u1, v1) -> ((fx * X1/Z1) * width/widthCCD + cx, (fy * Y1/Z1) * height/heightCCD + cy)
    (u2, v2) -> ((fx * X2/Z2) * width/widthCCD + cx, (fy * Y2/Z2) * height/heightCCD + cy)
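The combined mapping above can be sketched in Python as follows. This is only an illustration of the formulas in this answer: the function name `project_point` and the parameters `scale_x` and `scale_y` (standing in for width/widthCCD and height/heightCCD) are hypothetical, and lens distortion is ignored.

```python
def project_point(X, Y, Z, fx, fy, cx, cy, scale_x=1.0, scale_y=1.0):
    """Project a 3D point (camera frame, metres) to image pixel coordinates.

    scale_x and scale_y are assumed stand-ins for width/widthCCD and
    height/heightCCD; they default to 1.0 (sensor same size as image).
    """
    # Physical projection: 3D point onto the sensor
    uc = fx * X / Z
    vc = fy * Y / Z
    # Image projection: sensor coordinates into image pixel space
    u = uc * scale_x + cx
    v = vc * scale_y + cy
    return u, v


fx, fy = 1.06935339e+03, 1.07107059e+03
cx, cy = 6.29035115e+02, 3.54614962e+02

# A point on the optical axis (X = Y = 0) lands on the principal point (cx, cy).
u, v = project_point(0.0, 0.0, 0.7, fx, fy, cx, cy)
print(u, v)
```

With both scale factors left at 1.0, the function reduces to the usual pinhole relation u = fx * X/Z + cx, v = fy * Y/Z + cy.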
    

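    Since I don't know the CCD sensor height and width, I will assume for this example that they are the same as the image's, so that both scale factors equal 1. (When fx and fy are already expressed in pixels, as they usually are after calibration, this scaling is typically already folded into them.)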

    1. Plug in Z1 = Z2 = d:
    (u1, v1) -> ((fx * X1/d) + cx, (fy * Y1/d) + cy)
    (u2, v2) -> ((fx * X2/d) + cx, (fy * Y2/d) + cy)
    
    2. Now let's find the physical distance between two pixels, (0, 0) and (168, 39):
    (u, v) -> ((1.06935339e+03 * X/0.7) + 6.29035115e+02, (1.07107059e+03 * Y/0.7) + 3.54614962e+02)
    

    Solving for X and Y (in meters):

    For (0, 0): X = -0.41, Y = -0.23
    For (168, 39): X = -0.30, Y = -0.21

    3. Find the Euclidean distance between the two 3D points:

    Distance between (-0.41, -0.23) and (-0.30, -0.21) = 0.11 m

    So the physical distance between the two points, assuming the CCD sensor has the same dimensions as the image, is about 0.11 meters.
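The worked example can be reproduced with a short Python sketch. It assumes the plane is perpendicular to the optical axis (so every point on it has Z = d), that there is no lens distortion, and that the sensor-to-image scale factors are 1. The helper name `pixel_to_plane` is hypothetical. Note that cx and cy cancel when the two points are subtracted, so the per-axis physical distances depend only on the pixel offsets: roughly dx * d / fx and dy * d / fy.

```python
import math


def pixel_to_plane(u, v, d, fx, fy, cx, cy):
    """Back-project pixel (u, v) onto a plane at depth d (metres).

    Assumes a pinhole model, no distortion, and a plane perpendicular
    to the optical axis.
    """
    X = (u - cx) * d / fx
    Y = (v - cy) * d / fy
    return X, Y


fx, fy = 1.06935339e+03, 1.07107059e+03
cx, cy = 6.29035115e+02, 3.54614962e+02
d = 0.7

X1, Y1 = pixel_to_plane(0, 0, d, fx, fy, cx, cy)
X2, Y2 = pixel_to_plane(168, 39, d, fx, fy, cx, cy)

# Euclidean distance between the two points on the plane
dist = math.hypot(X2 - X1, Y2 - Y1)

# Per-axis distances: the principal point cancels out
dX = 168 * d / fx
dY = 39 * d / fy

print(f"P1 = ({X1:.2f}, {Y1:.2f}), P2 = ({X2:.2f}, {Y2:.2f})")
print(f"dX = {dX:.3f} m, dY = {dY:.3f} m, distance = {dist:.3f} m")
```

For the values in the question this gives about 0.110 m in x and 0.025 m in y, which is consistent with the 0.11 m total above.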