I have one class whose features differ only slightly from those of another class. For example, this image has a buckle in it (treat it as one class): https://6c819239693cc4960b69-cc9b957bf963b53239339d3141093094.ssl.cf3.rackcdn.com/1000006329245-822018-Black-Black-1000006329245-822018_01-345.jpg
This image is quite similar to it but has no buckle: https://sc01.alicdn.com/kf/HTB1ASpYSVXXXXbdXpXXq6xXFXXXR/latest-modern-classic-chappal-slippers-for-men.jpg
I am a little confused about which model to use in cases like this, i.e. one that actually learns such fine, pixel-level differences.
Any thoughts would be appreciated. Thanks!
I have already tried models such as Inception and ResNet.
With a small amount of training data (around 300-400 images per class), can we reach good recall, precision, and F1 scores?
Given the small dataset, you might want to look into transfer learning. One option is to use a pretrained (transferred) ResNet model as a feature extractor and run a YOLO (You Only Look Once) detector on top of it, so that each window of the image is examined for a buckle (look up the convolutional implementation of sliding windows). You can then classify the image based on whether a buckle was found.
As I understand your dataset, this approach will require you to re-annotate it to meet the requirements of the YOLO algorithm, i.e. with bounding boxes around the buckles.
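The last step, turning detector output into an image-level class, can be sketched as below. This is only an illustration under assumptions: the `Detection` tuple, the `classify_image` helper, the label names, and the 0.5 threshold are all hypothetical and do not come from any particular YOLO implementation; you would adapt them to whatever your detector actually returns.

```python
from typing import Iterable, NamedTuple, Tuple


class Detection(NamedTuple):
    """One box returned by a (hypothetical) object detector."""
    label: str                       # predicted class name, e.g. "buckle"
    confidence: float                # detector score in [0, 1]
    box: Tuple[int, int, int, int]   # (xmin, ymin, xmax, ymax) in pixels


def classify_image(detections: Iterable[Detection],
                   target_label: str = "buckle",
                   threshold: float = 0.5) -> str:
    """Label the whole image by whether a confident target box exists.

    The threshold is an assumed starting point; tune it on a validation
    set to trade precision against recall.
    """
    # Highest score among boxes of the target class (0.0 if there are none).
    best = max(
        (d.confidence for d in detections if d.label == target_label),
        default=0.0,
    )
    return "buckle" if best >= threshold else "no_buckle"


# Example with made-up detector output for two images.
with_buckle = [Detection("buckle", 0.83, (120, 200, 180, 260))]
without_buckle = [Detection("buckle", 0.12, (40, 60, 70, 95))]
print(classify_image(with_buckle))
print(classify_image(without_buckle))
```

Raising the threshold should generally make the "buckle" class more precise at the cost of recall, which matters when each class has only a few hundred images.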
For an example of this approach, see https://mc.ai/implementing-yolo-using-resnet-as-feature-extractor/
Edit: If your dataset is annotated in XML and you need to convert it to CSV to follow the above example, use https://github.com/datitran/raccoon_dataset
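If you just want to see what such a conversion involves, here is a minimal standard-library sketch. It assumes Pascal VOC style XML (one `<object>` element per bounding box) and a CSV column layout of filename, width, height, class, xmin, ymin, xmax, ymax; both are assumptions on my part, so check them against your own annotation files and against what the example you follow expects. The helper names and the sample annotation are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed CSV layout; adjust to what your training code expects.
COLUMNS = ["filename", "width", "height", "class",
           "xmin", "ymin", "xmax", "ymax"]

# Made-up annotation in Pascal VOC style, used only for the demo below.
SAMPLE_XML = """<annotation>
  <filename>sandal_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>buckle</name>
    <bndbox>
      <xmin>120</xmin><ymin>200</ymin><xmax>180</xmax><ymax>260</ymax>
    </bndbox>
  </object>
</annotation>"""


def voc_xml_to_rows(xml_text):
    """Return one row per annotated object in a VOC-style XML string."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        # int(float(...)) tolerates coordinates stored as "120.0".
        coords = [int(float(box.findtext(tag)))
                  for tag in ("xmin", "ymin", "xmax", "ymax")]
        rows.append([filename, width, height, obj.findtext("name")] + coords)
    return rows


def rows_to_csv(rows):
    """Serialise the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(rows)
    return buffer.getvalue()


print(rows_to_csv(voc_xml_to_rows(SAMPLE_XML)))
```

For a real dataset you would loop over the XML files in your annotation folder, collect the rows from each, and write them to a single CSV file.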
Happy modelling.