I am working on a dog detection system using deep learning (Tensorflow object detection) and Real Sense D425 camera. I am using the Intel(R) RealSense(TM) ROS Wrapper in order to get images from the camera.
I am executing "roslaunch rs_rgbd.launch" and my Python code is subscribed to "/camera/color/image_raw" topic in order to get the RGB image. Using this image and object detection library, I am able to infer (20 fps) the location of a dog in a image as a box level (xmin,xmax,ymin,ymax)
I will like to crop the PointCloud information with the object detection information (xmin,xmax,ymin,ymax) and determine if the dog is far away or near the camera. I will like to use the aligned information pixel by pixel between the RGB image and the pointcloud.
How can I do it? Is there any topic for that?
Thanks in advance
Intel publishes their python notebook about the same problem at: https://github.com/IntelRealSense/librealsense/blob/jupyter/notebooks/distance_to_object.ipynb
What they do is as follow :
use ssd to detect the dog inside color frame
depth = np.asanyarray(aligned_depth_frame.get_data())
# Crop depth data:
depth = depth[xmin_depth:xmax_depth,ymin_depth:ymax_depth].astype(float)
# Get data scale from the device and convert to meters
depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()
depth = depth * depth_scale
dist,_,_,_ = cv2.mean(depth)
print("Detected a {0} {1:.3} meters away.".format(className, dist))
Hope this help