I am trying to evaluate the training function of the Watson Visual Recognition API. Does anyone have experience with customizing classifiers for Visual Recognition? I have some experience myself with training the classifier and found some information in this blog: http://christopher5106.github.io/computer/vision/2016/12/23/ibm-watson-bluemix-visual-api-to-create-custom-classifier.html
What I would really like to know is: how many pictures of an object do I need to classify it with an accuracy of 75%? And how long does it take to get such a result?
Thank you in advance for your help.
The number of pictures you need depends on how unique the object is, how many distinct image features a picture containing it has, and so on.
To give you a few examples from my own experience:
Logo detection: one image of the logo can be used to create several samples by adding noise, changing contrast, making small distortions and rotations, etc. If the logo is detailed and has good contrast, you should easily get 75%.
Cat detection using Haar wavelets: 100 images with data augmentation can yield around 75%.
Human ear detection: about 300 images got me to around 80%. This detector is being used in an iPhone app for virtually trying on eyeglasses.
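To make the augmentation idea from the logo and cat examples concrete, here is a minimal sketch in plain Python. It treats an image as a list of rows of grayscale values (0 to 255), which is an assumption made only to keep the example free of dependencies. The function names (`add_noise`, `change_contrast`, `shift`, `augment`) and the parameter ranges are hypothetical, and a small pixel shift stands in for "small distortions"; rotations are left out because they are easier to do with an image library.

```python
import random

def add_noise(img, sigma=10.0, rng=random):
    """Add Gaussian noise to every pixel, clamped to the 0..255 range."""
    return [[min(255, max(0, int(round(p + rng.gauss(0, sigma))))) for p in row]
            for row in img]

def change_contrast(img, factor):
    """Scale each pixel's distance from mid-gray (128) by `factor`."""
    return [[min(255, max(0, int(round(128 + factor * (p - 128))))) for p in row]
            for row in img]

def shift(img, dx, dy, fill=0):
    """Translate the image by (dx, dy) pixels, filling uncovered pixels."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sx, sy = x - dx, y - dy
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]
    return out

def augment(img, n, seed=0):
    """Create n randomly perturbed variants of a single source image."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        s = change_contrast(img, rng.uniform(0.7, 1.3))       # assumed range
        s = shift(s, rng.randint(-2, 2), rng.randint(-2, 2))  # assumed range
        s = add_noise(s, sigma=8.0, rng=rng)                  # assumed sigma
        samples.append(s)
    return samples
```

Calling `augment(logo, 50)` on one source image would give 50 perturbed training samples. How much this helps depends on how close the perturbations are to the variation in your real photos.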
You can also try this out yourself using Kaggle's Dogs vs. Cats data. Just try various classifiers on it with different amounts of training data, and you will get a very good idea of how accuracy grows with the number of images.
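The experiment suggested above (train on growing subsets, then measure accuracy on a fixed test set) could be sketched roughly like this. Everything here is a stand-in: `make_data` generates synthetic 2-D points instead of image features, and a nearest-centroid classifier replaces whatever classifier you actually want to evaluate. Only the shape of the loop in `learning_curve` is the point.

```python
import random

def make_data(n, rng):
    """Synthetic stand-in for image features: two noisy 2-D clusters."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 1.0 if label else -1.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data

def train_centroids(train):
    """Nearest-centroid 'training': average the points of each class."""
    sums = {}
    for (x, y), label in train:
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    return {label: (sx / c, sy / c) for label, (sx, sy, c) in sums.items()}

def predict(centroids, point):
    """Return the label whose centroid is closest to the point."""
    return min(centroids,
               key=lambda l: (point[0] - centroids[l][0]) ** 2
                           + (point[1] - centroids[l][1]) ** 2)

def accuracy(centroids, test):
    """Fraction of test points classified correctly."""
    return sum(predict(centroids, p) == label for p, label in test) / len(test)

def learning_curve(train, test, sizes):
    """Train on the first n samples for each n and score on the same test set."""
    return [(n, accuracy(train_centroids(train[:n]), test)) for n in sizes]

rng = random.Random(0)
train = make_data(400, rng)
test = make_data(200, rng)
for n, acc in learning_curve(train, test, [5, 20, 100, 400]):
    print(n, round(acc, 3))
```

With real images you would swap `make_data` for your own feature extraction and `train_centroids`/`predict` for the classifier under test. The training-set size at which the curve crosses 75% is then the answer for your particular object.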