.net camera onvif

Converting x/y values from on screen click to ONVIF PTZ pan/tilt values


I'm currently tasked with implementing some PTZ actions for an ONVIF camera in C#. My camera has 360-degree pan, 220-degree tilt, a 63-degree horizontal FOV, and a 37-degree vertical FOV.

Right now I have a video feed in WPF that shows everything within the FOV. I want to be able to center the camera on whichever spot I click. I can easily get the x/y coordinates of my click, but I'm not sure how to convert them into something meaningful for the camera.

The ONVIF API accepts Pan and Tilt vectors with X and Y between -1 and 1. The only data I can collect about the camera feed is its current X and Y vector (between -1 and 1), the center of the video feed in X and Y (pixels), and the X and Y of the point I click.

I've tried every calculation I can think of to get a vector for either a relative or an absolute move. I used the comment from this post to calculate the degree delta based on the pixel delta, but the result doesn't even seem close. I used the formulas in this post to get a pan and tilt value, but I'm not quite sure how to use the results in a helpful way.
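For reference, a pixel-delta to degree-delta conversion of this kind can be sketched as below. It is only an illustration under two assumptions that are not confirmed anywhere in this post: the lens behaves like an ideal pinhole camera with no distortion, and the camera maps its full mechanical range linearly onto the generic [-1, 1] space. The class and method names are made up for the example.

```csharp
using System;

public static class PixelToAngle
{
    // Angle in degrees between the optical axis and a pixel offset from the image
    // center. ASSUMPTION: ideal pinhole camera, no lens distortion.
    public static double OffsetToDegrees(double offsetPx, double imageSizePx, double fovDegrees)
    {
        double halfFovRad = fovDegrees * Math.PI / 360.0;    // half the FOV, in radians
        double normalized = offsetPx / (imageSizePx / 2.0);  // -1..1 across the frame
        return Math.Atan(normalized * Math.Tan(halfFovRad)) * 180.0 / Math.PI;
    }

    // Converts a degree delta into a delta in the generic [-1, 1] space.
    // ASSUMPTION: the full mechanical range maps linearly onto [-1, 1].
    public static double DegreesToGenericDelta(double degrees, double fullRangeDegrees)
    {
        return degrees / (fullRangeDegrees / 2.0);
    }
}
```

With the numbers above, a click on the right edge of the frame is half the horizontal FOV (31.5 degrees) away from the center, which would be a pan delta of 31.5 / 180 = 0.175 if the linear-mapping assumption holds for the camera.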

I've also tried expressing the click distance as a percentage of the frame and applying that to the current position vector, but that didn't work either.

I'm guessing that my best bet is doing a relative move, since finding an absolute vector with just the FOV seems difficult. If anyone has any insight into how I could calculate it without too much trouble, it would be greatly appreciated.


Solution

  • I solved this, and the solution was a lot less complicated than what I was trying to achieve. The answer wasn't some complicated formula, but a better understanding of the ONVIF specification. For anyone having similar issues, I'll explain what I did.

    ONVIF has a few different spaces for absolute and relative translations, the default being [-1, 1] for x and y. However, you can provide a different space for translations. You can see the allowed spaces in the object returned from GetConfigurationOptions, which takes a configuration token. If you aren't sure how to get a token, call GetConfigurations to list all configurations and choose one of their tokens.

    From there, choose a space that fits and note the max and min for X and Y. In my case, the easiest option was the space that let me do relative translations within the FOV. Then you just need to carefully calculate the delta between the center and your click using any method you'd like and scale it to a value within [-1, 1]. The ONVIF service specs have a lot more info on this.
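The steps above can be sketched roughly as follows. This is a minimal illustration, not the exact code from my project: `PanTiltSpace` is a hypothetical stand-in for the space descriptions that come back from GetConfigurationOptions (the real type depends on your generated ONVIF client), and the URI is the one the PTZ service specification lists for the FOV translation space, so compare it against what your camera actually reports. The sign of Y is also an assumption (positive meaning "up"); flip it if your camera moves the wrong way.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one relative pan/tilt translation space
// (URI plus X and Y ranges) from the GetConfigurationOptions response.
public sealed class PanTiltSpace
{
    public PanTiltSpace(string uri, double xMin, double xMax, double yMin, double yMax)
    {
        Uri = uri; XMin = xMin; XMax = xMax; YMin = yMin; YMax = yMax;
    }

    public string Uri { get; }
    public double XMin { get; }
    public double XMax { get; }
    public double YMin { get; }
    public double YMax { get; }
}

public static class ClickToPtz
{
    // ASSUMPTION: FOV translation space URI as listed in the ONVIF PTZ service spec.
    public const string FovSpaceUri =
        "http://www.onvif.org/ver10/tptz/PanTiltSpaces/TranslationSpaceFov";

    // Returns the space with the given URI, or null if the camera does not offer it.
    public static PanTiltSpace FindSpace(IEnumerable<PanTiltSpace> spaces, string uri)
    {
        return spaces.FirstOrDefault(s => s.Uri == uri);
    }

    // Maps a click inside the video element to a relative translation in the given space.
    public static (double X, double Y) ToTranslation(
        double clickX, double clickY, double viewWidth, double viewHeight, PanTiltSpace space)
    {
        double centerX = viewWidth / 2.0;
        double centerY = viewHeight / 2.0;

        // Offset from the center as a fraction of half the view: -1 (left) to 1 (right).
        double x = (clickX - centerX) / centerX;
        // WPF's y axis grows downwards. ASSUMPTION: positive y in the PTZ space means "up".
        double y = (centerY - clickY) / centerY;

        // Keep the result inside the range the camera advertises for this space.
        return (Clamp(x, space.XMin, space.XMax), Clamp(y, space.YMin, space.YMax));
    }

    private static double Clamp(double value, double min, double max)
    {
        return Math.Max(min, Math.Min(max, value));
    }
}
```

The resulting X and Y go into the PanTilt translation of a RelativeMove request, with the translation's space set to the chosen URI. For a 1920x1080 view, a click at (480, 810) yields (-0.5, -0.5), which is halfway to the left edge and halfway to the bottom edge.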