What are the steps to:
a) get a Kinect's position by evaluating its sensor data (i.e. depth stream, video stream, audio stream)?
b) get a regular camera's position by evaluating its sensor data (i.e. video stream)?
The accelerometer data might help you. Have a look at this video: http://www.youtube.com/watch?v=GWvcgZkADUU
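
To make the accelerometer idea a bit more concrete: a static accelerometer measures the direction of gravity, so it can give you the sensor's tilt (pitch and roll), though not its position or heading on its own. Below is a minimal sketch of that calculation. The function name `tilt_from_accel` and the axis convention (z pointing up when the device is level, readings in units of g) are my assumptions and not taken from any Kinect SDK, so you may need to swap axes or flip signs for your device.

```python
import math

def tilt_from_accel(ax, ay, az):
    """Estimate (pitch, roll) in degrees from one static accelerometer sample.

    Assumed convention (hypothetical, adjust for your device):
    a level sensor at rest reads roughly (0, 0, 1) in units of g.
    """
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0:
        raise ValueError("zero-length acceleration vector")
    # Normalise so that only the direction of gravity matters.
    ax, ay, az = ax / norm, ay / norm, az / norm
    # Pitch: rotation about the y axis; roll: rotation about the x axis.
    pitch = math.degrees(math.atan2(-ax, math.sqrt(ay * ay + az * az)))
    roll = math.degrees(math.atan2(ay, az))
    return pitch, roll

if __name__ == "__main__":
    # Example with a made-up reading, not real sensor data.
    print(tilt_from_accel(0.0, 0.5, 0.866))
```

This only works while the sensor is not accelerating; for actual position you would still need to combine it with something derived from the depth or video stream.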