computer-vision object-detection image-segmentation semantic-segmentation

Semantic Segmentation or Object Detection?

I'm working on a project in which I am trying to detect and track herds of sheep and goats. I can't decide between semantic segmentation and object detection. The animals stand very close to each other, and a single frame contains 80-90 of them on average. Which approach do you think best suits my problem?

In addition, is there a model you would recommend? (My first priority is accuracy.)

Solution

Semantic segmentation will do you no good: the goal of semantic segmentation is to label each pixel in the image with its semantic class. Therefore, all goats in the frame will share the same label "goat", and all sheep will share the same label "sheep", so animals that touch or overlap merge into a single region and cannot be counted or tracked individually.
What you might want to consider is instance segmentation: in this task, the goal is not only to associate each pixel with its semantic class (as in semantic segmentation), but also to tell apart different instances of the same class. In your example, a good instance segmentation will not only label all goat pixels as "goat", but also separate the individual goats in the frame accurately.
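
To make the difference concrete, here is a toy sketch in plain Python. The grid, the `count_blobs` helper, and the animal layout are all made up for illustration; a real pipeline would work on model output, not a hand-written array. It shows that counting connected regions in a semantic mask undercounts animals that touch, while an instance map keeps them apart.

```python
from collections import deque

def count_blobs(mask):
    """Count 4-connected regions of truthy cells in a 2D grid (list of lists)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # New unvisited foreground cell: flood-fill its whole region.
            blobs += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return blobs

# Hypothetical instance map: 0 = background, 1 and 2 = two sheep that touch,
# 3 = a sheep standing apart from the others.
instance_map = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 2, 2, 0, 3],
    [0, 1, 1, 2, 2, 0, 3],
    [0, 0, 0, 0, 0, 0, 0],
]

# Semantic segmentation only knows "sheep" vs "background".
semantic_mask = [[1 if v else 0 for v in row] for row in instance_map]

semantic_count = count_blobs(semantic_mask)  # touching sheep merge into one blob
instance_count = len({v for row in instance_map for v in row if v})

print(semantic_count, instance_count)
```

With 80-90 tightly packed animals per frame, the merging effect would be far more severe than in this three-sheep toy.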

A very popular instance segmentation approach is Mask R-CNN: it is basically a two-stage system. First, it detects the different objects, and then it predicts a segmentation mask inside each individual bounding box. This might be a good starting point for your project: it will allow you to compare counts based on the detection output with counts based on the instance-segmentation output of the model.
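
As a rough sketch of that comparison, the snippet below counts animals two ways from a single prediction. It assumes the prediction is a dict with `scores` of shape [N] and soft `masks` of shape [N, 1, H, W], which is the layout torchvision's Mask R-CNN returns per image; NumPy arrays stand in for tensors here. The function name `count_animals`, the thresholds, and the fake prediction are illustrative assumptions, not tuned values.

```python
import numpy as np

def count_animals(output, score_thresh=0.5, mask_thresh=0.5, min_area=20):
    """Return (detection_count, mask_count) for one image's prediction.

    detection_count: boxes whose confidence is at least score_thresh.
    mask_count: those same detections whose binarized mask covers at
                least min_area pixels (assumed sanity filter).
    """
    scores = np.asarray(output["scores"])
    masks = np.asarray(output["masks"])      # [N, 1, H, W], values in [0, 1]

    keep = scores >= score_thresh
    detection_count = int(keep.sum())

    # Binarize the soft masks and measure each kept instance's pixel area.
    areas = (masks[keep] > mask_thresh).sum(axis=(1, 2, 3))
    mask_count = int((areas >= min_area).sum())
    return detection_count, mask_count

# Fake prediction with three candidates (made-up numbers):
masks = np.zeros((3, 1, 8, 8), dtype=np.float32)
masks[0, 0, 1:6, 1:6] = 0.9   # confident box, solid 25-pixel mask
masks[1, 0, 0:2, 0:2] = 0.9   # confident box, tiny 4-pixel mask
masks[2, 0, :, :] = 0.9       # large mask, but low-confidence box
fake_output = {"scores": np.array([0.9, 0.8, 0.3]), "masks": masks}

detection_count, mask_count = count_animals(fake_output)
print(detection_count, mask_count)
```

If the two counts disagree often on your footage, that is a hint to inspect the frames where they differ before trusting either number.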