I want to use the YOLOv3 algorithm for detection. I am using Intel's DE10-Nano FPGA board with Linux installed. When I build YOLOv3 (from the original source) and run it, I get the error "Segmentation fault (core dumped)". I did a lot of research, but nothing I found fixed the issue.
I used the pre-built weights and configuration files, i.e. I ran the command below:
./darknet detect cfg/yolov3-tiny.cfg yolov3-tiny.weights data/dog.jpg
but got the error stated above. The same command runs without any hiccups on my computer and several others, but not on my dev board. I then started debugging (using a lot of printf statements), starting from "darknet.py" in the "python" directory, and found that the error occurs in
"yolo_layer.c" file
line 336: "dets[count].prob[j] = (prob > thresh) ? prob : 0;"
in "get_yolo_detections" function.
How can I fix this? I traced the calls from function to function and file to file to find where the error comes from.
int get_yolo_detections(layer l, int w, int h, int netw, int neth, float thresh, int *map, int relative, detection *dets)
{
    int i,j,n;
    float *predictions = l.output;
    if (l.batch == 2) avg_flipped_yolo(l);
    int count = 0;
    for (i = 0; i < l.w*l.h; ++i){
        int row = i / l.w;
        int col = i % l.w;
        for(n = 0; n < l.n; ++n){
            int obj_index = entry_index(l, 0, n*l.w*l.h + i, 4);
            float objectness = predictions[obj_index];
            if(objectness <= thresh) continue;
            int box_index = entry_index(l, 0, n*l.w*l.h + i, 0);
            dets[count].bbox = get_yolo_box(predictions, l.biases, l.mask[n], box_index, col, row, l.w, l.h, netw, neth, l.w*l.h);
            dets[count].objectness = objectness;
            dets[count].classes = l.classes;
            for(j = 0; j < l.classes; ++j){
                int class_index = entry_index(l, 0, n*l.w*l.h + i, 4 + 1 + j);
                float prob = objectness*predictions[class_index];
                // error in the line below (found by debugging)
                dets[count].prob[j] = (prob > thresh) ? prob : 0;
            }
            ++count;
        }
    }
    correct_yolo_boxes(dets, count, w, h, netw, neth, relative);
    return count;
}
I finally found the problem and the solution. The problem lies in the "src/parser.c" file, where the weights are loaded. The function that loads weights from the '.weights' file depends on the underlying machine architecture (32-bit or 64-bit), because it reads the net->seen field using sizeof(size_t) bytes. A .weights file created by training on a 64-bit machine stores that field as 8 bytes. On 32-bit systems (the DE10-Nano, or a Raspberry Pi or Jetson board running a 32-bit OS), size_t is only 4 bytes, so the loader reads 4 bytes instead of 8 and every weight that follows is read from the wrong offset.
Thus there is a compatibility issue (THE WEIGHTS ARE NOT CROSS-PLATFORM).
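To make the misalignment concrete, here is a minimal sketch (not darknet code). It writes a tiny file with a simplified darknet-style layout (three version ints, an 8-byte seen counter, then float weights) and reads the first weight back using a chosen width for the counter. The function name first_weight_read, the values, and the layout are illustrative assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical demo: returns the first weight as decoded by a loader that
   reads `seen_bytes` bytes for the seen counter (8 = matches the file,
   4 = what sizeof(size_t) gives on a 32-bit build). */
float first_weight_read(size_t seen_bytes)
{
    const int version[3] = {0, 2, 0};        /* illustrative header values */
    const uint64_t seen_on_disk = 64000;     /* written as 8 bytes */
    const float weights[2] = {1.5f, -2.25f};
    unsigned char seen_buf[8] = {0};
    int v[3];
    float w = 0.0f;
    size_t ok;

    assert(seen_bytes > 0 && seen_bytes <= sizeof seen_buf);

    FILE *fp = tmpfile();
    assert(fp != NULL);

    /* Write the file the way a 64-bit machine would. */
    fwrite(version, sizeof(int), 3, fp);
    fwrite(&seen_on_disk, sizeof seen_on_disk, 1, fp);
    fwrite(weights, sizeof(float), 2, fp);
    rewind(fp);

    /* Read it back with the requested counter width. */
    ok = fread(v, sizeof(int), 3, fp);
    ok += fread(seen_buf, seen_bytes, 1, fp);
    ok += fread(&w, sizeof(float), 1, fp);
    fclose(fp);

    assert(ok == 5);
    return w;
}
```

Calling first_weight_read(8) returns the stored 1.5, while first_weight_read(4) returns the second half of the counter reinterpreted as a float, and every later weight would be shifted by 4 bytes in the same way.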
To fix this issue, change
fwrite(net->seen, sizeof(size_t), 1, fp); // in the save_weights_upto() function
to (at line 1024)
fwrite(net->seen, 8, 1, fp);
and change
fread(net->seen, sizeof(size_t), 1, fp); // in the load_weights_upto() function
to (at line 1237)
fread(net->seen, 8, 1, fp);
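One caveat: if net->seen still points to a size_t, which is 4 bytes on a 32-bit build, then reading or writing 8 bytes through that pointer touches 4 bytes beyond it. A possibly safer variant, sketched below, goes through a 64-bit temporary. The helper names write_seen_u64 and read_seen_u64 are hypothetical and not part of darknet, and the sketch assumes the file is written and read on machines with the same byte order (x86-64 and the usual ARM Linux setups are both little-endian).

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: store the counter as exactly 8 bytes on every
   platform. Returns 1 on success, 0 on failure. */
int write_seen_u64(FILE *fp, size_t seen)
{
    assert(fp != NULL);
    uint64_t on_disk = (uint64_t)seen;
    return fwrite(&on_disk, sizeof on_disk, 1, fp) == 1;
}

/* Hypothetical helper: read the 8-byte counter into a temporary, then
   narrow it to the platform's size_t. Returns 1 on success, 0 on failure. */
int read_seen_u64(FILE *fp, size_t *seen)
{
    assert(fp != NULL && seen != NULL);
    uint64_t on_disk = 0;
    if (fread(&on_disk, sizeof on_disk, 1, fp) != 1) return 0;
    *seen = (size_t)on_disk; /* truncates on 32-bit if the value exceeds SIZE_MAX */
    return 1;
}
```

In parser.c the two calls above would then become something like write_seen_u64(fp, *net->seen) and read_seen_u64(fp, net->seen), assuming net->seen is a size_t pointer as in the original source.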