TensorFlow Custom Object Detection Disappointing Result - Why?

I started using the TF Object Detection API two weeks ago and managed to train a model to recognize a custom object, in my case a Mecanum wheel.

Here are the details:

No. of training images = 125
All training images are approximately 500 x 500
Transfer learning
Model used = ssd_mobilenet_v1_coco
Batch size = 2
Total steps run = 12715
Loss is usually between 0.5 and 2.5, but it sometimes spikes above 10; I am not sure why

Here's the result: The first image is encouraging.

The second image is a little disappointing. I expected the model to detect all four Mecanum wheels (four boxes). Why didn't it?

Then I suspected that something was wrong with my trained model. I tried the sample test images (the third and fourth images), and now I am sure that this is not the model I was aiming for.

I have been reading this post, which describes a problem that I think is quite similar to mine (and the author managed to solve it). He mentioned that the input image needs to be smaller than 600 x 1024, so I tried the fifth image and, unsurprisingly, the result was again disappointing.
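For anyone who wants to repeat the size check, here is a minimal sketch of how the target dimensions could be computed before resizing a test image. It assumes "less than 600 x 1024" means the shorter side should be at most 600 pixels and the longer side at most 1024; that reading, the function name fit_within, and its parameters are my own assumptions and do not come from the linked post.

```python
def fit_within(width, height, max_short=600, max_long=1024):
    """Return (new_width, new_height) scaled down to fit the limits.

    Assumption: the shorter side must be <= max_short and the longer
    side <= max_long. The aspect ratio is kept and images are never
    enlarged.
    """
    short_side = min(width, height)
    long_side = max(width, height)
    # Pick the strictest of the two limits; cap at 1.0 to avoid upscaling.
    scale = min(1.0, max_short / short_side, max_long / long_side)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    for size in [(500, 500), (2048, 1200), (4000, 3000)]:
        print(size, "->", fit_within(*size))
```

The resulting size can then be passed to whatever image library is used for the actual resize.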

I went through the tutorial series by sentdex, and in the comment sections I noticed that many people face this problem too. So, what should I do now?


Solution

125 images? You will not be able to get very good results with that few images. If you want to check that this is indeed the problem, try training on subsets of your original 125 images.

For example, how bad is the output when you train on 10 images?

Does it get better when you use 50 images?

Does it get better yet when you use 125 images?

If the accuracy improves as the dataset grows, you can extrapolate and expect that with 1000 images you will do even better. I would guess that the small dataset is your problem.
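The subset experiment can be set up with something like the sketch below. It only picks which images go into each run; training and evaluation still happen through your usual pipeline. The file names and the helper nested_subsets are hypothetical. The subsets are nested (the 10 images are part of the 50, which are part of the 125), so each larger run only adds data.

```python
import random


def nested_subsets(items, sizes, seed=0):
    """Return {size: subset}, where each smaller subset is contained
    in every larger one. Sizes larger than the dataset are skipped."""
    shuffled = list(items)
    # A fixed seed keeps the selection reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    return {n: shuffled[:n] for n in sorted(sizes) if n <= len(shuffled)}


# Hypothetical file names standing in for the 125 training images.
image_names = [f"wheel_{i:03d}.jpg" for i in range(125)]
subsets = nested_subsets(image_names, sizes=[10, 50, 125])

if __name__ == "__main__":
    for n, subset in subsets.items():
        print(n, "images, for example:", subset[:3])
```

Train one model per subset with the same configuration and evaluate all of them on the same held-out test images, so that the only thing changing between runs is the dataset size.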