I have two sets of image patches, a training set and a testing set. Both have been written to LMDB files, and I am running a convolutional neural network on this data using Caffe.
The problem is that the data stored on the hard disk takes up a considerable amount of space, which is hampering my efforts to add more training data with deliberately added noise to make my model more robust.
Is there a way to send image patches from my program directly to the CNN (in Caffe) without storing them in LMDB? I am currently using Python to generate patches from the images for the training data set.
You can write your own Python data layer. See the discussions here, and an implementation of an input data layer for a video stream here.
Basically, you will need to add a layer like this to your network description:
layer {
  type: 'Python'
  name: 'data'
  top: 'data'
  top: 'label'
  python_param {
    # the module name -- usually the filename -- that needs to be in $PYTHONPATH
    module: 'filename'
    # the layer name -- the class name in the module
    layer: 'CustomInputDataLayer'
  }
}
and implement the layer interface in Python:
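import caffe

class CustomInputDataLayer(caffe.Layer):
    def setup(self, bottom, top):
        ...
    def reshape(self, bottom, top):
        # reshape takes the dimensions as separate integers, so unpack the shape tuple
        top[0].reshape(BATCH_SIZE, *your_data.shape)
        top[1].reshape(BATCH_SIZE, *your_label.shape)
    def forward(self, bottom, top):
        # assign output
        top[0].data[...] = your_data
        top[1].data[...] = your_label
    def backward(self, top, propagate_down, bottom):
        # a data layer has no bottom blobs, so there is nothing to back-propagate
        pass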
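
To generate the noisy patches on the fly rather than storing them, forward could call a small helper like the one below. This is only a sketch under assumptions: the function name sample_noisy_batch, its parameters, the random cropping, and the Gaussian noise model are all hypothetical choices for illustration, not anything from the question. It uses plain NumPy, so it can be tested without Caffe, and it leaves out labels because how they are derived depends on your data.

```python
import numpy as np


def sample_noisy_batch(image, batch_size, patch_size, noise_std=0.05, rng=None):
    """Cut random square patches from an image and add Gaussian noise.

    Hypothetical helper: `image` is a 2-D (H, W) or 3-D (H, W, C) array.
    Returns a float32 array shaped (batch_size, C, patch_size, patch_size),
    i.e. the N x C x H x W layout that Caffe blobs use.
    """
    rng = np.random.default_rng() if rng is None else rng
    image = np.asarray(image, dtype=np.float32)
    if image.ndim == 2:
        # Treat a grayscale image as having a single channel.
        image = image[:, :, np.newaxis]
    height, width, channels = image.shape

    batch = np.empty((batch_size, channels, patch_size, patch_size), dtype=np.float32)
    for i in range(batch_size):
        # Pick the top-left corner so the patch stays inside the image.
        y = rng.integers(0, height - patch_size + 1)
        x = rng.integers(0, width - patch_size + 1)
        patch = image[y:y + patch_size, x:x + patch_size, :]
        # Assumed noise model: zero-mean Gaussian with standard deviation noise_std.
        noisy = patch + rng.normal(0.0, noise_std, size=patch.shape)
        # Reorder H x W x C to C x H x W before storing.
        batch[i] = noisy.transpose(2, 0, 1)
    return batch
```

Inside forward, the result could then be assigned with top[0].data[...] = sample_noisy_batch(...), provided reshape has already sized the blob to the same batch, channel, and patch dimensions. Because each batch is produced in memory and discarded afterwards, the augmented data never has to be written to disk.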