machine-learning deep-learning computer-vision transfer-learning

Which deep learning model should I use to capture minor features in an image?

I have a class whose features differ only slightly from those of another class. For example, this image has a buckle in it (consider it one class): https://6c819239693cc4960b69-cc9b957bf963b53239339d3141093094.ssl.cf3.rackcdn.com/1000006329245-822018-Black-Black-1000006329245-822018_01-345.jpg

This image is quite similar to it but has no buckle: https://sc01.alicdn.com/kf/HTB1ASpYSVXXXXbdXpXXq6xXFXXXR/latest-modern-classic-chappal-slippers-for-men.jpg

I am a little confused about which model to use in cases like this, where the model has to learn fine, pixel-level differences.

Any thoughts would be appreciated. Thanks!

I have already tried models such as Inception and ResNet.

With a small amount of training data (around 300-400 images per class), can we reach a good recall/precision/F1 score?

Solution

Because the dataset is small, you might want to look into transfer learning. One option is to use a pretrained ResNet as a feature extractor and run a YOLO (You Only Look Once) detector on top of it. The detector looks through each window of the image (see the convolutional implementation of sliding windows) to find a belt buckle, and based on whether one is found you can classify the image.
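The last step, turning detector output into an image-level class, could be sketched roughly as below. This is only an illustration: the `Detection` type, the `"buckle"` label, the class names, and the 0.5 threshold are all assumptions, not part of any particular YOLO implementation.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical container for one detector output."""
    label: str    # predicted object class, e.g. "buckle"
    score: float  # detector confidence in [0, 1]


def classify_image(detections: List[Detection],
                   target: str = "buckle",
                   threshold: float = 0.5) -> str:
    """Label the image by whether any target detection clears the threshold."""
    # Best confidence among detections of the target class (0.0 if there are none).
    best = max((d.score for d in detections if d.label == target), default=0.0)
    return "with_buckle" if best >= threshold else "without_buckle"
```

The threshold trades recall against precision, so it is worth tuning on a held-out set rather than leaving it at an arbitrary value.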

Based on my understanding of your dataset, this approach requires you to re-annotate the images to meet the requirements of the YOLO algorithm, i.e. with a bounding box around each buckle.
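To give an idea of what such annotations look like: many YOLO implementations (Darknet, for instance) expect one text line per object in the form `class_id x_center y_center width height`, with coordinates normalized by the image size. The exact format depends on the implementation you pick, so treat the following as a sketch with hypothetical function names:

```python
def voc_box_to_yolo(xmin, ymin, xmax, ymax, img_w, img_h):
    """Convert a pixel corner box to a normalized (x_center, y_center, w, h) box."""
    x_center = (xmin + xmax) / 2.0 / img_w
    y_center = (ymin + ymax) / 2.0 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return x_center, y_center, width, height


def yolo_label_line(class_id, box, img_w, img_h):
    """Format one object as a Darknet-style label line."""
    x_c, y_c, w, h = voc_box_to_yolo(*box, img_w, img_h)
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"


# A buckle box (xmin, ymin, xmax, ymax) in a 400x800 image; class 0 is assumed to be "buckle".
print(yolo_label_line(0, (100, 200, 300, 400), 400, 800))
```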

For an example of this approach, see https://mc.ai/implementing-yolo-using-resnet-as-feature-extractor/

Edit: If you have an XML-annotated dataset and need to convert it to CSV to follow the above example, use https://github.com/datitran/raccoon_dataset
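If you would rather see what such a conversion involves, here is a minimal standard-library sketch. It assumes Pascal VOC style XML (as produced by tools like LabelImg) and a column layout of `filename,width,height,class,xmin,ymin,xmax,ymax`; both are assumptions, so check them against what your training script actually expects. The function names are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed CSV layout; adjust to whatever your training code reads.
COLUMNS = ["filename", "width", "height", "class", "xmin", "ymin", "xmax", "ymax"]


def voc_xml_to_rows(xml_text):
    """Return one row per annotated object in a Pascal VOC style XML string."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        rows.append([
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ])
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buf.getvalue()
```

To process a whole folder, you would read each `.xml` file, pass its contents to `voc_xml_to_rows`, and write the concatenated rows out once.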

Happy modelling.