I have read a number of papers on YOLOv5 image detection techniques, but none of them refers to any segmentation step performed by YOLOv5. Since, as far as I know, it is not possible to do image classification without a segmentation process, I am asking the following question: does YOLOv5 perform any segmentation step in order to detect objects in images? If yes, which segmentation algorithm does it use?
Segmentation mainly uses a Fully Convolutional Network (FCN) architecture. An FCN is a CNN without fully connected (FC) layers. A segmentation network can be thought of as an encoder followed by a decoder, both of which are fully convolutional.
Classification using a CNN is a set of convolutional layers (which extract high-level features of the input image) followed by one or more fully connected (FC), or dense, layers. The last dense/FC layer classifies the input image into one of the classes.
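To make the contrast between the two tasks concrete, here is a minimal shape-bookkeeping sketch in plain Python. It runs no real network. The function names, the number of stages, and the assumption that every encoder stage halves the resolution while every decoder stage doubles it are illustrative choices, not taken from any particular architecture.

```python
def classification_output_shape(num_classes):
    """A classifier ends in an FC layer: one score per class for the whole image."""
    return (num_classes,)


def segmentation_output_shape(height, width, num_classes, stages=3):
    """Track the spatial size through a hypothetical encoder-decoder FCN."""
    h, w = height, width
    for _ in range(stages):   # encoder: each stage halves the resolution (assumed)
        h, w = h // 2, w // 2
    for _ in range(stages):   # decoder: each stage doubles it back (assumed)
        h, w = h * 2, w * 2
    return (h, w, num_classes)  # one score per class for every pixel


print(classification_output_shape(3))          # (3,)
print(segmentation_output_shape(224, 224, 3))  # (224, 224, 3)
```

The classifier collapses the image to a single vector of class scores, while the segmentation network keeps a prediction for every pixel.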
YOLO is a regression-based object detection algorithm built on a CNN architecture. In YOLO, the image is split (segmented, in a loose sense) into S * S grid cells. Each grid cell predicts only one object: a cell is responsible for predicting the object whose centre falls inside that cell. For each grid cell, the CNN predicts B bounding boxes, each with a confidence score, and C class probabilities.
The shape of the CNN's prediction/output is (S, S, B * 5 + C), where B is the number of bounding boxes per cell and C is the number of classes. The number 5 stands for the x_center, y_center, width and height of a bounding box plus its confidence score.
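As a rough illustration of the grid idea, the sketch below finds the cell responsible for an object and splits one cell's output vector into its parts. The helper names are hypothetical, coordinates are assumed to be normalised to [0, 1), and the ordering of values inside the vector (all boxes first, then the class scores) is an assumption made for this example rather than the layout of any specific implementation.

```python
def responsible_cell(x_center, y_center, S):
    """Return (row, col) of the grid cell that contains a box centre.

    x_center and y_center are assumed to be normalised to [0, 1).
    """
    col = min(int(x_center * S), S - 1)
    row = min(int(y_center * S), S - 1)
    return row, col


def split_cell_prediction(vector, B, C):
    """Split one cell's vector of length B * 5 + C into boxes and class scores.

    Assumed layout: B consecutive groups of
    (x_center, y_center, width, height, confidence), followed by C class scores.
    """
    assert len(vector) == B * 5 + C
    boxes = [tuple(vector[i * 5:(i + 1) * 5]) for i in range(B)]
    class_scores = list(vector[B * 5:])
    return boxes, class_scores


print(responsible_cell(0.5, 0.2, S=7))  # (1, 3)

boxes, class_scores = split_cell_prediction(list(range(13)), B=2, C=3)
print(boxes)         # [(0, 1, 2, 3, 4), (5, 6, 7, 8, 9)]
print(class_scores)  # [10, 11, 12]
```

Note that nothing here labels individual pixels: the grid is a fixed partition used to decide which cell reports which object.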
If an image is divided into 7 * 7 grid cells, 2 bounding boxes are predicted for each cell, and the total number of classes is 3, then the shape of the CNN output will be (7, 7, 13), since 2 * 5 + 3 = 13.
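The same arithmetic can be checked with a few lines of Python. The function name is made up for this example; it only evaluates the (S, S, B * 5 + C) formula given above.

```python
def yolo_output_shape(S, B, C):
    """Output shape for an S x S grid, B boxes per cell and C classes."""
    return (S, S, B * 5 + C)


shape = yolo_output_shape(S=7, B=2, C=3)
print(shape)                         # (7, 7, 13)
print(shape[0] * shape[1] * shape[2])  # 637 predicted values in total
```

Changing B or C only changes the depth of the output; the grid size S alone sets its spatial extent.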