I would like to experiment with machine learning (especially CNNs) on the aligned RGB and depth streams of either an Intel RealSense or an Orbbec Astra camera. My goal is to do some object recognition and highlight/mark the recognized objects in the output video stream (as a starting point).
But after reading many articles I am still confused about the frameworks involved and how the data flows from the camera through the various software components. I just can't get a high-level picture.
This is my assumption regarding the processing flow:
Sensor => Driver => libRealSense / Astra SDK => TensorFlow
Questions
There is an Astra OpenNI SDK besides the Astra SDK, whereas Intel has wrappers (?) for OpenCV and OpenNI. When or why would I need these additional libraries/support?
The data flow is: sensor -> driver -> camera library -> other libraries built on top of it (see OpenCV support for Intel RealSense) -> captured image.
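As a rough illustration of the capture part of this flow on the RealSense side, the sketch below uses the `pyrealsense2` Python wrapper to grab one color frame with a depth frame aligned to it, and then stacks the two into a single 4-channel array. The resolution, the frame rate, the default depth scale of 0.001 m per unit, the 4 m depth clip, and the helper names `grab_aligned_frames` and `to_rgbd` are all assumptions made for the example, not something the SDK prescribes.

```python
import numpy as np


def grab_aligned_frames(width=640, height=480, fps=30):
    """Return one (color, depth) pair with depth aligned to the color image.

    Needs a connected RealSense camera and the pyrealsense2 package.
    """
    import pyrealsense2 as rs  # lazy import: the rest runs without a camera

    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, width, height, rs.format.z16, fps)
    config.enable_stream(rs.stream.color, width, height, rs.format.rgb8, fps)
    pipeline.start(config)
    try:
        align = rs.align(rs.stream.color)  # map depth pixels onto the color image
        frames = align.process(pipeline.wait_for_frames())
        depth_frame = frames.get_depth_frame()
        color_frame = frames.get_color_frame()
        if not depth_frame or not color_frame:
            raise RuntimeError("no complete frame pair received")
        color = np.asanyarray(color_frame.get_data())  # H x W x 3, uint8
        depth = np.asanyarray(depth_frame.get_data())  # H x W, uint16 raw units
        return color.copy(), depth.copy()
    finally:
        pipeline.stop()


def to_rgbd(color, depth, depth_scale=0.001, max_depth_m=4.0):
    """Stack color and depth into one float32 H x W x 4 array in [0, 1].

    depth_scale (meters per raw unit) and max_depth_m are assumed values;
    the real scale should be read from the device.
    """
    rgb = color.astype(np.float32) / 255.0
    depth_m = np.clip(depth.astype(np.float32) * depth_scale, 0.0, max_depth_m)
    d = depth_m / max_depth_m
    return np.concatenate([rgb, d[..., None]], axis=-1)
```

With a camera attached, `to_rgbd(*grab_aligned_frames())` would give one RGBD image; an Astra camera would need a different capture function, but the stacking step stays the same.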
Once you have the image, you can do whatever you want with it, of course. For example, you can feed the images through tf.data and develop in TensorFlow any application that uses CNNs on RGBD images (just google it and look on arXiv for ideas about possible applications).
Once your model has been trained, just export the trained graph and use it for inference, so your pipeline becomes: sensor -> driver -> camera library -> libs -> RGBD image -> trained model -> model output
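A minimal sketch of how the last stretch of that pipeline (RGBD image -> trained model -> model output) might be wired up, assuming TensorFlow 2.4 or later and RGBD frames that are already available as H x W x 4 arrays. The function names `rgbd_frames`, `make_dataset`, and `run_inference`, as well as the 480 x 640 frame size, are hypothetical choices for this example.

```python
import numpy as np


def rgbd_frames(frame_source, height=480, width=640):
    """Yield float32 H x W x 4 arrays from an iterable of RGBD frames.

    Frames with an unexpected shape are skipped.
    """
    for frame in frame_source:
        frame = np.asarray(frame, dtype=np.float32)
        if frame.shape == (height, width, 4):
            yield frame


def make_dataset(frame_source, height=480, width=640, batch_size=1):
    """Wrap the frame generator in a batched tf.data.Dataset."""
    import tensorflow as tf  # lazy import: only needed when building the dataset

    spec = tf.TensorSpec(shape=(height, width, 4), dtype=tf.float32)
    dataset = tf.data.Dataset.from_generator(
        lambda: rgbd_frames(frame_source, height, width),
        output_signature=spec,
    )
    return dataset.batch(batch_size).prefetch(tf.data.AUTOTUNE)


def run_inference(model, dataset):
    """Apply a trained model (any callable taking a batch) to each batch."""
    for batch in dataset:
        yield model(batch)
```

Here `model` could be, for instance, a Keras model restored with `tf.keras.models.load_model`, whose first layer accepts 4-channel input; how the model is exported and loaded depends on how it was trained.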