I'm absolutely new to ROS/Gazebo world; this is probably a simple question, but I cannot find a good answer.
I have a simulated depth camera (Kinect) in a Gazebo scene. After some elaborations, I get a point of interest in the RGB image in pixel coordinates, and I want to retrieve its 3D coordinates in the world frame. I can't understand how to do that.
I have tried compensating the distortions, given by the CameraInfo msg. I have tried using PointCloud with pcl library, retrieving the point as cloud.at(x,y)
.
In both cases, the coordinates are not correct (I have put a small sphere in the coords given out by the program, so to check if it's correct or no).
Every help would be very appreciated. Thank you very much.
EDIT: Starting from the PointCloud, I try to find the coords of the points doing something like:
point = cloud.at(xInPixel, yInPixel);
point.x = point.x + cameraPos.x;
point.y = point.y + cameraPos.y;
point.z = point.z - cameraPos.z;
but the x,y,z coords I get as point.x
seems not to be correct.
The camera has a pitch angle of pi/2, so to points on the ground.
I am clearly missing something.
I assume you've seen the gazebo examples for the kinect (brief, full). You can get, as topics, the raw image, raw depth, and calculated pointcloud (by setting them in the config):
<imageTopicName>/camera/color/image_raw</imageTopicName>
<cameraInfoTopicName>/camera/color/camera_info</cameraInfoTopicName>
<depthImageTopicName>/camera/depth/image_raw</depthImageTopicName>
<depthImageCameraInfoTopicName>/camera/dept/camera_info</depthImageCameraInfoTopicName>
<pointCloudTopicName>/camera/depth/points</pointCloudTopicName>
Unless you need to do your own things with the image_raw for rgb and depth frames (ex ML over rgb frame & find corresponding depth point via camera_infos), the pointcloud topic should be sufficient - it's the same as the pcl pointcloud in c++, if you include the right headers.
Edit (in response):
There's a magical thing in ros called tf
/tf2
. Your pointcloud, if you look at the msg.header.frame_id
, says something like "camera"
, indicating it's in the camera frame. tf
, like the messaging system in ros, happens in the background, but it looks/listens for transformations from one frame of reference to another frame. You can then transform/query the data in another frame. For example, if the camera is mounted at a rotation to your robot, you can specify a static transformation in your launch file. It seems like you're trying to do the transformation yourself, but you can make tf
do it for you; this allows you to easily figure out where points are in the world/map frame, vs in the robot/base_link frame, or in the actuator/camera/etc frame.
I would also look at these ros wiki questions which demo a few different ways to do this, depending on what you want: ex1, ex2, ex3