python, performance, computer-vision, cython, point-clouds

Speeding up depth image to point-cloud conversion (Python)


I am currently trying to speed up the conversion from a depth image to a point cloud in Python. At the moment the code looks like this:

    color_points = []
    for v in range(depth.shape[1]):
        for u in range(depth.shape[0]):
            z = depth[u, v] / self.scaling_factor
            if z == 0:   # here np.where can be used prior to the loop
                continue
            x = (u - self.center_x) * z / self.focalLength   # u and v are needed, no clue how to do this
            y = (v - self.center_y) * z / self.focalLength

            color_points.append([x, y, z])

My main issue is that I don't know of any function (like np.where) that would speed up the loop over the image array while still providing the indices of each pixel (i.e. u and v). The indices are needed because the calculation depends on the pixel's location on the sensor.
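For reference, np.nonzero (equivalent to np.where with a single argument) does return exactly such indices; a minimal sketch with made-up depth values:

    import numpy as np

    depth = np.array([[0, 1200],
                      [800, 0]])      # toy depth image, values made up
    u, v = np.nonzero(depth)          # row and column indices of the non-zero pixels
    print(u, v)                       # [0 1] [1 0]
    print(depth[u, v])                # [1200  800] -- depths at those pixels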

Cython would be an option to speed it up, but I still think there must be a better function available within NumPy.

Thanks for any help!

Edit: Added the z calculation to the function. The scaling factor depends on the format the depth image comes in; in my case the depth arrives in millimeters [mm] and is converted to meters.
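As a concrete sketch of that conversion (the 1000.0 factor assumes millimeter input):

    import numpy as np

    depth_mm = np.array([[0, 1200],
                         [800, 650]])      # depth image in millimeters, values made up
    scaling_factor = 1000.0                # mm -> m
    depth_m = depth_mm / scaling_factor    # depth image in meters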


Solution

  • Here is my stab at it, using np.mgrid to compute u and v as NumPy arrays, which avoids the slow Python loops and takes advantage of NumPy's universal functions.

    import numpy as np

    # Placeholder camera intrinsics so this sample can be run;
    # substitute your own calibration values.
    scaling_factor = 1000.0                 # e.g. depth in mm -> meters
    focal_length = 525.0
    center_x, center_y = 2.0, 2.0

    depth = np.random.uniform(size=(4, 4))  # just so this sample can be run

    # u and v as full index grids, replacing the explicit loops
    u, v = np.mgrid[0:depth.shape[0], 0:depth.shape[1]]
    z = depth / scaling_factor
    adjusted_z = z / focal_length
    x = (u - center_x) * adjusted_z
    y = (v - center_y) * adjusted_z
    color_points = np.stack([x, y, z], axis=-1).reshape(-1, 3)
    

    Note that I haven't filtered color_points for z == 0, but you can do so with color_points[z.reshape(-1) != 0]. Also be mindful that my color_points are ordered differently from yours (u varies before v); that probably doesn't matter in depth-estimation scenarios.
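    To make both points concrete, a short sketch reusing the names from the snippet above:

    valid = z.reshape(-1) != 0           # mask of points with non-zero depth
    color_points = color_points[valid]   # drop the z == 0 points

    # To reproduce the original (v outer, u inner) ordering instead:
    # color_points = np.stack([x, y, z], axis=-1).transpose(1, 0, 2).reshape(-1, 3)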