I can't figure out the difference between the depth data we get from a device like the Kinect (which contains x, y, z and depth information) and point cloud data. Can someone please explain it to me? Thanks in advance.
A point cloud is a data structure: essentially an array or vector of points, each holding x, y, z coordinates and possibly more information per point (color, intensity, etc.).
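As a minimal sketch (the array shape and values here are purely illustrative), a point cloud can be as simple as an N x 3 array:

```python
import numpy as np

# A minimal point cloud: an N x 3 array where each row is one point's
# (x, y, z) coordinates. Extra per-point fields (e.g. RGB color) would
# just be additional columns or a structured dtype.
cloud = np.array([
    [0.1,  0.2, 1.5],
    [0.0, -0.3, 2.1],
    [0.4,  0.1, 1.8],
], dtype=np.float32)

print(cloud.shape)  # three points, three coordinates each
```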
Depth data is the depth information captured by a sensor, and a point cloud is one data structure that can express it. There are others: for example, a 2D array of depth values (a depth image), in which only the z/depth of each pixel is stored explicitly. To convert depth data from a 2D depth image into a point cloud, you need the camera's calibration (intrinsic) parameters.
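That conversion can be sketched with the standard pinhole camera model. This is an assumption-laden example: the intrinsics (`fx`, `fy`, `cx`, `cy`) below are placeholder values, and real ones must come from your sensor's calibration.

```python
import numpy as np

# Hypothetical pinhole intrinsics -- substitute your camera's calibrated
# focal lengths (fx, fy) and principal point (cx, cy).
fx, fy = 525.0, 525.0
cx, cy = 319.5, 239.5

def depth_to_point_cloud(depth):
    """Convert an HxW depth image (depth in meters) to an Nx3 point cloud."""
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    # Back-project each pixel through the pinhole model.
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    points = np.stack((x, y, z), axis=-1).reshape(-1, 3)
    # Drop invalid pixels; Kinect-style sensors report 0 where depth is unknown.
    return points[points[:, 2] > 0]

# Tiny usage example: a 480x640 depth image with a single valid pixel.
depth = np.zeros((480, 640), dtype=np.float32)
depth[240, 320] = 1.0
cloud = depth_to_point_cloud(depth)
print(cloud.shape)  # one valid point, with x, y, z
```

The depth image and the point cloud carry the same measurements; the point cloud just makes the x and y coordinates explicit instead of leaving them implied by the pixel position and the calibration.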