neural-network deep-learning conv-neural-network yolo darknet

DarkNet - Nothing is detected for the custom training data

I have created my own dataset which is a set of soccer ball images. Since I have only 1 class, I have modified the ball-yolov3-tiny.cfg as setting the filters to 18, and classes to 1.

Then I have annotated the images and put the created .txt files into the same directory of the images. Finally, I have started the training by using the darknet53.conv.74 model by executing the command darknet detector train custom/ball-obj.data custom/ball-yolov3-tiny.cfg darknet53.conv.74.

I have 134 images for training, and 15 images for the test. Here is a sample output of the training process:

95: 670.797241, 597.741333 avg, 0.000000 rate, 313.254830 seconds, 6080 images
Loaded: 0.000302 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499381, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344946, Class: 0.498204, Obj: 0.496005, No Obj: 0.496541, .5R: 0.000000, .75R: 0.000000,  count: 32
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499381, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344946, Class: 0.498204, Obj: 0.496005, No Obj: 0.496541, .5R: 0.000000, .75R: 0.000000,  count: 32
96: 670.557190, 605.022949 avg, 0.000000 rate, 312.962750 seconds, 6144 images
Loaded: 0.000272 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499360, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344946, Class: 0.498204, Obj: 0.495868, No Obj: 0.496454, .5R: 0.000000, .75R: 0.000000,  count: 32
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499360, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344946, Class: 0.498204, Obj: 0.495868, No Obj: 0.496454, .5R: 0.000000, .75R: 0.000000,  count: 32
97: 670.165161, 611.537170 avg, 0.000000 rate, 312.681998 seconds, 6208 images
Loaded: 0.000282 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499331, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344946, Class: 0.498204, Obj: 0.495722, No Obj: 0.496397, .5R: 0.000000, .75R: 0.000000,  count: 32
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499331, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344946, Class: 0.498204, Obj: 0.495722, No Obj: 0.496397, .5R: 0.000000, .75R: 0.000000,  count: 32
98: 669.815918, 617.365051 avg, 0.000000 rate, 319.203044 seconds, 6272 images
Loaded: 0.000244 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499294, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344947, Class: 0.498204, Obj: 0.495569, No Obj: 0.496253, .5R: 0.000000, .75R: 0.000000,  count: 32
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499294, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344947, Class: 0.498204, Obj: 0.495569, No Obj: 0.496253, .5R: 0.000000, .75R: 0.000000,  count: 32
99: 669.555664, 622.584106 avg, 0.000000 rate, 320.330266 seconds, 6336 images
Loaded: 0.000244 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499246, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344948, Class: 0.498204, Obj: 0.495409, No Obj: 0.496197, .5R: 0.000000, .75R: 0.000000,  count: 32
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499246, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.344948, Class: 0.498204, Obj: 0.495409, No Obj: 0.496197, .5R: 0.000000, .75R: 0.000000,  count: 32
100: 669.132629, 627.238953 avg, 0.000000 rate, 329.954091 seconds, 6400 images
Saving weights to backup//ball-yolov3-tiny.backup
Saving weights to backup//ball-yolov3-tiny_100.weights
Resizing
576
Loaded: 1.764142 seconds
Region 16 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.499216, .5R: -nan, .75R: -nan,  count: 0
Region 23 Avg IOU: 0.430712, Class: 0.498203, Obj: 0.495251, No Obj: 0.496154, .5R: 0.000000, .75R: 0.000000,  count: 32

Here are the other configuration files:

ball-obj.data

classes= 1
train  = custom/ball-train.txt
valid  = custom/ball-test.txt
names = custom/ball-obj.names
backup = backup/

ball-obj.names

ball

When I use the created weights in order to test a single image, it simply fails to find the soccer balls in the images. Do I need a lot more (e.g. 10K) images for that? Or do I need to train the model for long hours? I just want to be sure that everything regarding my setup is OK.

Please feel free to ask any queries regarding my experiment. Your help is really appreciated. Thanks in advance.

p.s. Here is the whole content of my ball-yolov3-tiny.cnf:

[net]
# Testing
batch=1
subdivisions=1
# Training
#batch=64
#subdivisions=2
width=416
height=416
channels=3
momentum=0.9
decay=0.0005
angle=0
saturation = 1.5
exposure = 1.5
hue=.1

learning_rate=0.001
burn_in=1000
max_batches = 500200
policy=steps
steps=400000,450000
scales=.1,.1

[convolutional]
batch_normalize=1
filters=16
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=32
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=64
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=128
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=2

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[maxpool]
size=2
stride=1

[convolutional]
batch_normalize=1
filters=1024
size=3
stride=1
pad=1
activation=leaky

###########

[convolutional]
batch_normalize=1
filters=256
size=1
stride=1
pad=1
activation=leaky

[convolutional]
batch_normalize=1
filters=512
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear



[yolo]
mask = 3,4,5
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes=1
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

[route]
layers = -4

[convolutional]
batch_normalize=1
filters=128
size=1
stride=1
pad=1
activation=leaky

[upsample]
stride=2

[route]
layers = -1, 8

[convolutional]
batch_normalize=1
filters=256
size=3
stride=1
pad=1
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=18
activation=linear

[yolo]
mask = 0,1,2
anchors = 10,14,  23,27,  37,58,  81,82,  135,169,  344,319
classes=1
num=6
jitter=.3
ignore_thresh = .7
truth_thresh = 1
random=1

The command I execute is:

darknet detector train custom/ball-obj.data custom/ball-yolov3-tiny.cfg darknet53.conv.74

Solution

Increase batch size to 64, and use as few subdivisions as your GPU memory can fit: start with 1, 2, 4, 8, 16, 32 and finally 64 if you keep on getting CUDA out of memory.

You should train your network until your average loss rate is < 1.

Are you using the original version of darknet from Joseph Redmons repository, or are you using a fork? There are a series of recommendations for how to improve object detection here, however, I am uncertain about whether or not they work on all other versions.