computer-vision, config, yolo, darknet, custom-training

Why is the 'filters' set as (classes + 5) * 3 in this article?


Here's a tutorial about doing custom training of YOLO (Darknet): https://medium.com/@manivannan_data/how-to-train-yolov3-to-detect-custom-objects-ccbcafeb13d2

The tutorial explains how to set the values in the .cfg file:

  • classes = Number of classes, OK
  • filters = (classes + 5) * 3

Why is it 'plus 5' then 'times 3'?

Some say it's (classes + coords + 1) * num, but I can't figure out what that means.


Solution

  • I've found the answer:

    filters = (classes + 5) * 3
    = (classes + width + height + x + y + confidence) * num
    = (classes + 1+1+1+1+1) * num
    = (classes + 5) * num
    

    YOLOv3 predicts 3 boxes per grid cell at each detection scale, so the multiplier is 3:

    filters = (classes + 5) * 3
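
As a quick sanity check, the formula above can be sketched in Python. The function name `yolo_filters` and its parameters are made up for illustration and are not part of Darknet; the defaults assume 4 box coordinates (x, y, width, height), 1 confidence score, and 3 boxes per grid cell, as described above.

```python
def yolo_filters(classes, boxes_per_cell=3, coords=4):
    """Return the 'filters' value for the convolutional layer before a [yolo] layer.

    Each predicted box carries `coords` box values (x, y, width, height),
    1 confidence score, and one score per class.
    """
    values_per_box = classes + coords + 1
    return values_per_box * boxes_per_cell


# One custom class: (1 + 5) * 3
print(yolo_filters(1))   # 18
# The 80 COCO classes: (80 + 5) * 3
print(yolo_filters(80))  # 255
```

So for a single custom class you would set filters=18, and the 255 you get for 80 classes matches the value in the stock yolov3.cfg.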