
Combine tracking and detection


I'm currently working on a multiple object tracking problem, and I think tracking-by-detection is a good choice. However, I do not know how to combine the tracking and detection results so that detection helps improve the tracking.

As a simple starting point for detection, I'm using Faster R-CNN from the TensorFlow Object Detection API. For tracking, I use the KCF algorithm from OpenCV.

Detection is unstable because the model treats every frame independently, while tracking is much more stable.

However, although tracking is more stable, the tracker cannot keep up when the object moves, so its result is not accurate.

So I'm thinking of combining the two methods to get a result that is both stable and accurate.

I have a background in computer vision, but I'm new to this field (multiple object tracking). Could anyone please give me some advice on how to approach this kind of problem?

Thanks a lot! :)


Solution

  • I recently tried using detection to track objects. The instability can be resolved with classic filtering techniques such as Kalman filtering (in signal processing, measured points are likewise "unstable" because of noise). You can define a small region around the tracked object and look for the same object inside that region in the next frame. That establishes a "matched" relationship; you then match the object from the next frame to the one after it, and so on, which builds up a trace. Any smoothing method can then be applied to suppress the noise in the predicted boxes. An example: [image]

    The transparent points are the detected trace points and the solid ones are the smoothed points.

    The corresponding trace, drawn over the background: [image]

    Some tricks are also useful. If detection fails at some random position, you can set a "skip gate" and try to find a matching point in a later frame (in my experiment, 60 was not bad for 24 fps video). You will generally want to favor recall over precision, since you can build fairly long sequences and then drop the short noise sequences that come from false-alarm detections.

    Reference code: https://github.com/yiyuezhuo/detection-tracking
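
The steps described in this answer (match inside a small region, tolerate missed detections with a skip gate, drop short tracks, then smooth) can be sketched roughly as follows. This is only a minimal illustration, not the code from the linked repository: the names `Track`, `build_tracks`, `drop_short_tracks`, and `smooth`, and the parameters `gate_radius`, `skip_gate` (assumed here to be counted in frames), and `min_length` are all hypothetical. It links detection centers greedily by nearest neighbour, and it uses a plain moving average as a stand-in for a Kalman filter.

```python
from dataclasses import dataclass, field


@dataclass
class Track:
    # Each point is (frame_index, x, y) for a matched detection center.
    points: list = field(default_factory=list)

    @property
    def last(self):
        return self.points[-1]


def build_tracks(detections_per_frame, gate_radius=30.0, skip_gate=60):
    """Greedily link per-frame detection centers into tracks.

    detections_per_frame: one list of (x, y) centers per frame.
    gate_radius: assumed search radius (pixels) around a track's last point.
    skip_gate: assumed largest frame gap across which a track may be re-matched.
    """
    active, finished = [], []
    for frame_idx, detections in enumerate(detections_per_frame):
        unmatched = list(detections)
        still_active = []
        for track in active:
            last_frame, lx, ly = track.last
            if frame_idx - last_frame > skip_gate:
                finished.append(track)  # unmatched for too long: close it
                continue
            # Pick the closest unmatched detection inside the gate, if any.
            best, best_d2 = None, gate_radius ** 2
            for det in unmatched:
                d2 = (det[0] - lx) ** 2 + (det[1] - ly) ** 2
                if d2 <= best_d2:
                    best, best_d2 = det, d2
            if best is not None:
                unmatched.remove(best)
                track.points.append((frame_idx, best[0], best[1]))
            still_active.append(track)
        # Every detection that no track claimed starts a new track.
        for det in unmatched:
            still_active.append(Track([(frame_idx, det[0], det[1])]))
        active = still_active
    return finished + active


def drop_short_tracks(tracks, min_length=5):
    """Discard short tracks, which are likely false-alarm detections."""
    return [t for t in tracks if len(t.points) >= min_length]


def smooth(points, window=3):
    """Centered moving average over the (x, y) part of a track's points."""
    half = window // 2
    out = []
    for i, (frame_idx, _, _) in enumerate(points):
        chunk = points[max(0, i - half): i + half + 1]
        mean_x = sum(p[1] for p in chunk) / len(chunk)
        mean_y = sum(p[2] for p in chunk) / len(chunk)
        out.append((frame_idx, mean_x, mean_y))
    return out


if __name__ == "__main__":
    # Toy input: one real object, one false alarm, one missed frame.
    frames = [[(10, 10), (200, 200)], [(12, 11)], [], [(16, 13)], [(18, 14)]]
    tracks = drop_short_tracks(build_tracks(frames, 10.0, 2), min_length=3)
    for t in tracks:
        print(smooth(t.points))
```

Greedy matching lets the first track in the list claim a detection, which can be wrong when objects are close together; a global assignment (for example the Hungarian algorithm) and a motion model such as a Kalman filter would be the natural next refinements.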