neural-network computer-vision classification darknet

Darknet: How are labels used for classification?

I want to use YOLOv3 for classification without bbox detection, and am following this link on how to train a classifier with darknet.

However, I can't seem to spot where exactly the class for each image is assigned. There isn't any .txt file for each image like for detection. Is the class simply extracted from the file name?

How do I assign a class to each image in my multi-class classification? Or do I simply throw in the images and YOLO learns by itself based on the number of classes you specify in labels.txt?

Solution

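Yes, the class of each training image is extracted from its file path. See the load_labels_paths() and fill_truth() functions in src/data.c. load_labels_paths() creates a matrix with one row per image and one column per class, and fill_truth() fills each row with 1 or 0 depending on whether the class name is a substring of the image's path. Because the test uses strstr() on the whole path, the class name can appear in a directory name as well as in the file name. Each path should contain exactly one class name; otherwise darknet prints the "Too many or too few labels" warning.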

matrix load_labels_paths(char **paths, int n, char **labels, int k, tree *hierarchy)
{
    matrix y = make_matrix(n, k);
    int i;
    for(i = 0; i < n && labels; ++i){
        fill_truth(paths[i], labels, k, y.vals[i]);
        if(hierarchy){
            fill_hierarchy(y.vals[i], k, hierarchy);
        }
    }
    return y;
}

...

void fill_truth(char *path, char **labels, int k, float *truth)
{
    int i;
    memset(truth, 0, k*sizeof(float));
    int count = 0;
    for(i = 0; i < k; ++i){
        if(strstr(path, labels[i])){
            truth[i] = 1;
            ++count;
            //printf("%s %s %d\n", path, labels[i], i);
        }
    }
    if(count != 1 && (k != 1 || count != 0)) printf("Too many or too few labels: %d, %s\n", count, path);
}

https://github.com/pjreddie/darknet/blob/master/src/data.c#L620
https://github.com/pjreddie/darknet/blob/master/src/data.c#L543
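
Since the matching is a plain substring test, it can help to check your image paths against your label list before training. The following is only a minimal sketch of such a check, not part of darknet: match_labels() and check_paths() are hypothetical helper names, the 64-class limit is an arbitrary assumption for the example, and the flagging condition is copied from fill_truth() above.

```c
#include <stdio.h>
#include <string.h>

/* Same rule as darknet's fill_truth(): a label matches when it occurs
 * anywhere in the path. Writes a 1/0 row into truth (length k) and
 * returns the number of labels that matched. */
int match_labels(const char *path, const char **labels, int k, float *truth)
{
    int count = 0;
    for (int i = 0; i < k; ++i) {
        truth[i] = strstr(path, labels[i]) ? 1.0f : 0.0f;
        if (truth[i] != 0.0f) ++count;
    }
    return count;
}

/* Hypothetical pre-training check: prints every path that would trigger
 * the "Too many or too few labels" warning and returns how many were
 * flagged. Returns -1 if k exceeds the fixed buffer used in this sketch. */
int check_paths(const char **paths, int n, const char **labels, int k)
{
    float truth[64]; /* assumption: at most 64 classes in this example */
    int bad = 0;
    if (k > 64) return -1;
    for (int i = 0; i < n; ++i) {
        int count = match_labels(paths[i], labels, k, truth);
        /* Same condition as in fill_truth() */
        if (count != 1 && (k != 1 || count != 0)) {
            printf("ambiguous or unlabeled (%d matches): %s\n", count, paths[i]);
            ++bad;
        }
    }
    return bad;
}
```

For example, with the labels "cat" and "dog", the path data/train/cat_001.jpg matches exactly one label, data/dogs/cat_002.jpg matches two (the directory name contains "dog"), and data/train/img_003.jpg matches none. Note that data/train/caterpillar_7.jpg also counts as "cat" with no warning, so class names that are substrings of other words or of each other are worth avoiding.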