deep-learning computer-vision object-detection yolo

Creating a dataset of images for object detection for extremely specific task

Even though I am quite familiar with the concepts of Machine Learning & Deep Learning, I never needed to create my own dataset before.

Now, for my thesis, I have to create my own dataset with images of an object that there are no datasets available on the internet(just assume that this is ground-truth).

I have limited computational power so I want to use YOLO, SSD or efficientdet.

Do I need to go over every single image I have in my dataset by my human eyes and create bounding box center coordinates and dimensions to log them with their labels?

Thanks

Solution

Yes, you will need to do that.

At the same time, though the task is niche, you could benefit from the concept of transfer learning. That is, you can use a pre-trained backbone in order to help your model to learn faster/achieve better results/need fewer annotations example, but you will still need to annotate the new dataset on your own.

You can use software such as LabelBox, as a starting point, it is very good since it allows you to output the format in Pascal(VOC) format, YOLO and COCO format, so it is a matter of choice/what is more suitable for you.