I'm new to computer vision. I'm working on a research project whose objective is (1) to detect vehicles from images and videos, and then later on (2) to be able to track moving vehicles.
I'm at the initial stage where I'm collecting training data, and I'm really concerned about getting images which are at an optimum resolution for detection and tracking.
The current dataset I've been given (from a past project) has images of about 1200x600 pixels, but I've been told this may or may not be an optimum resolution for the detection and tracking task. Apart from the fact that I will be extracting Haar-like features from the images, I can't think of any factor to include in making a resolution decision. Any ideas of what a good resolution ought to be for training data images in this case?
First of all, feeding raw images directly to a classifier usually does not produce great results, although it is sometimes useful, for example in face detection. So you need to think about feature extraction.
One big issue is that a 1200x600 image has 720,000 pixels. Used as raw input, that means 720,000 dimensions, which makes training and classification hard because of the curse of dimensionality.
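To make the arithmetic concrete, here is a small sketch that counts the raw input dimensions for a few resolutions. The helper name `raw_dimensions` and the 320x240 entry are made up for illustration, and it assumes one value per pixel (grayscale) unless told otherwise.

```python
def raw_dimensions(width, height, channels=1):
    """Length of the feature vector if every pixel value is used as-is.

    Hypothetical helper for illustration; channels=1 assumes grayscale.
    """
    return width * height * channels


# 1200x600 and 640x480 are the sizes mentioned in this thread;
# 320x240 is just an extra data point for comparison.
for w, h in [(1200, 600), (640, 480), (320, 240)]:
    print(f"{w}x{h}: {raw_dimensions(w, h):,} raw grayscale dimensions")
```

Halving both width and height divides the raw dimension count by four, which is why even a modest downscale helps.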
So basically you need to reduce the number of dimensions, in particular through feature extraction. Which features to extract? That depends entirely on the domain.
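Since the question mentions Haar-like features, here is a minimal pure-Python sketch of one such feature, computed with an integral image (summed-area table). It is only an illustration under my own assumptions: the function names are hypothetical, the image is a plain list of rows of grayscale values, and a real project would more likely rely on an existing library implementation.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) summed-area table from a list of pixel rows."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum of everything above-left = column total so far + this row
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left (x, y), size w x h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_horizontal(ii, x, y, w, h):
    """Two-rectangle Haar-like feature: left half minus right half.

    Assumes w is even so both halves have the same width.
    """
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Tiny 2x4 example: dark left half, bright right half
img = [[1, 1, 5, 5],
       [1, 1, 5, 5]]
ii = integral_image(img)
print(two_rect_horizontal(ii, 0, 0, 4, 2))
```

Each feature costs a handful of table lookups regardless of rectangle size, so the classifier sees a few feature responses instead of every pixel.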
Another important aspect is speed. Processing bigger images takes more time, and this matters especially for real-time processing, which typically means something like 15-30 fps.
In my project (see my profile), which was real-time (15 fps), I was working on 640x480 images, and for some operations I had to scale them down to improve performance.
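As a rough sketch of the speed trade-off, the snippet below computes the time budget per frame at a given frame rate and the size an image would have after downscaling to a maximum width while keeping its aspect ratio. The names `frame_budget_ms` and `scaled_size` and the 640-pixel target are assumptions chosen for illustration, not something taken from my project.

```python
def frame_budget_ms(fps):
    """Milliseconds available to process one frame at the given frame rate."""
    return 1000.0 / fps


def scaled_size(width, height, max_width):
    """Size after shrinking to at most max_width, keeping the aspect ratio.

    Images already narrow enough are returned unchanged.
    """
    if width <= max_width:
        return width, height
    scale = max_width / width
    return max_width, round(height * scale)


for fps in (15, 30):
    print(f"{fps} fps leaves about {frame_budget_ms(fps):.1f} ms per frame")

# The 1200x600 dataset images, shrunk to a 640-pixel width (assumed target)
print(scaled_size(1200, 600, 640))
```

All detection and tracking work for one frame has to fit inside that budget, so measuring the processing time at a couple of candidate resolutions is a practical way to pick one.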
Hope this helps.