Fine tuning a pre-trained network with new data


I have a pre-trained network (the prototxt definition and the binary caffemodel with the weights) designed for image recognition. I got it online without knowing how it was trained or on which data, and I haven't seen the solver file. The network has 3 layers (as far as I can tell: I have 3 prototxt files).

I'm trying to add another "feature" to the network - make it recognize some pose as well.

The steps I've taken so far:

- Add another output to the last layer, similar to the outputs that were already there
- Process the image database through the first two layers, and save the output to lmdb
- Create a new solver for fine-tuning
- Create a train_test for fine-tuning the last layer

Running "caffe train" with the solver simply crashes. I tried figuring out more by going into python and:

caffe.Net(train_test_file_path)

I got:

I0703 11:10:54.095563 21756 net.cpp:294] The NetState phase (1) differed from the phase (0) specified by a rule in layer data
I0703 11:10:54.095655 21756 net.cpp:51] Initializing net from parameters:
 <train_test_file_content>

I0703 11:10:54.096817 21756 layer_factory.hpp:77] Creating layer data
I0703 11:10:54.097033 21756 db_lmdb.cpp:35] Opened lmdb /home/user/yaw_db/test/lmdb/
I0703 11:10:54.097090 21756 net.cpp:84] Creating Layer data
I0703 11:10:54.097111 21756 net.cpp:380] data -> data         
I0703 11:10:54.097158 21756 net.cpp:380] data -> label
I0703 11:10:54.097657 21756 data_layer.cpp:45] output data size: 50,1,1,193536
I0703 11:10:54.097937 21756 net.cpp:122] Setting up data     
I0703 11:10:54.097960 21756 net.cpp:129] Top shape: 50 1 1 193536 (9676800)
I0703 11:10:54.097983 21756 net.cpp:129] Top shape: 50 (50)
I0703 11:10:54.097999 21756 net.cpp:137] Memory required for data: 38707400
I0703 11:10:54.098014 21756 layer_factory.hpp:77] Creating layer label_data_1_split
I0703 11:10:54.098047 21756 net.cpp:84] Creating Layer label_data_1_split
I0703 11:10:54.098063 21756 net.cpp:406] label_data_1_split <- label
I0703 11:10:54.098084 21756 net.cpp:380] label_data_1_split -> label_data_1_split_0             
I0703 11:10:54.098106 21756 net.cpp:380] label_data_1_split -> label_data_1_split_1                                                                                  
I0703 11:10:54.098131 21756 net.cpp:122] Setting up label_data_1_split
I0703 11:10:54.098145 21756 net.cpp:129] Top shape: 50 (50)                                                                                                                                                
I0703 11:10:54.098163 21756 net.cpp:129] Top shape: 50 (50)
I0703 11:10:54.098176 21756 net.cpp:137] Memory required for data: 38707800
I0703 11:10:54.098188 21756 layer_factory.hpp:77] Creating layer conv1_3
I0703 11:10:54.098212 21756 net.cpp:84] Creating Layer conv1_3
I0703 11:10:54.098227 21756 net.cpp:406] conv1_3 <- data      
I0703 11:10:54.098245 21756 net.cpp:380] conv1_3 -> conv1_3
F0703 11:10:54.098325 21756 blob.cpp:32] Check failed: shape[i] >= 0 (-1 vs. 0)
*** Check failure stack trace: ***                           
Aborted (core dumped)   

Opening the lmdb I've created and using stat() on it produced:

{'branch_pages': 1,
 'depth': 2,
 'entries': 12651,
 'leaf_pages': 75,
 'overflow_pages': 561233,
 'psize': 4096}

Searching the internet gave me a slight idea that perhaps I saved the processed images wrong. Any further ideas?

PS. I am very new to caffe, neural networks etc., so I might even be missing the simplest of things.


Solution

  • You saved your intermediate features into an lmdb file ('/home/user/yaw_db/test/lmdb').
    The data there is stored as a collection of features of shape 1x1x193,536 (channels x height x width), and you are reading them in batches of 50. You can see this in your log file:

    I0703 11:10:54.097657 21756 data_layer.cpp:45] output data size: 50,1,1,193536 
    

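    Now it seems like you are trying to apply a 3x3 convolution (at layer 'conv1_3'). However, the spatial dimensions of your input blob are 1x193,536, that is, height 1 and width 193,536. There is not enough "height" in the input blob to allow for a 3x3 convolution: assuming no padding and a stride of 1, the output height would be 1 - 3 + 1 = -1, which matches the "-1" in the failed check. This is why you get the error: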

    F0703 11:10:54.098325 21756 blob.cpp:32] Check failed: shape[i] >= 0 (-1 vs. 0)
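
    To make the arithmetic behind that failed check concrete, here is a minimal sketch of the output-size rule a convolution layer applies along each spatial axis. The function name `conv_output_dim` is made up for this illustration, and the 3x3 kernel, zero padding and stride of 1 are assumptions, since the prototxt for 'conv1_3' is not shown in the question.

    ```python
    def conv_output_dim(input_dim, kernel, pad=0, stride=1):
        """Output size along one spatial axis of a convolution.

        Uses the usual rule: (input + 2*pad - kernel) / stride + 1,
        with the division truncated toward zero as C++ integer division does.
        """
        numerator = input_dim + 2 * pad - kernel
        return int(numerator / stride) + 1


    # Blob shape taken from the log: 50 x 1 x 1 x 193536 (N x C x H x W).
    height, width = 1, 193536

    # Assumed layer settings (not confirmed by the question): 3x3, pad 0, stride 1.
    out_h = conv_output_dim(height, kernel=3)
    out_w = conv_output_dim(width, kernel=3)

    print("output height:", out_h)  # negative, so the blob shape is invalid
    print("output width:", out_w)
    ```

    Under those assumptions the computed output height is -1, the same value that appears in "Check failed: shape[i] >= 0 (-1 vs. 0)". The width is not the problem; only the height of 1 is too small for the kernel.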