
Caffe HDF5 Pre-processing


I am getting started with Caffe and deep learning, and I am not able to understand what pre-processing steps are required to train a model using Caffe on HDF5 data. Specifically,

  1. Is it required to convert the image into the [0-1] range? The notebook example (00-classification.ipynb) states that the model operates in the [0-255] range, while some references show that it should be [0-1]. How do I decide this?
  2. As per the documentation, the conventional blob dimensions for batches of image data are N x channel K x height H x width W. There are no conflicts on this.
  3. Is the channel-swap step for RGB to BGR conversion mandatory?
  4. How do I perform image mean computation for HDF5 data? For compute_image_mean.cpp, the backend is LMDB. Is this only for improving performance?

As for the use of LMDB, questions 1-3 still hold. Any clarification on this will be highly appreciated.


Solution

  • Welcome to Caffe.

    1. Scaling the input data to the range of [0..1] or [0..255] is entirely up to you. Some models work in the [0..1] range, others in [0..255], and the choice is completely unrelated to the choice of input method (LMDB/HDF5).
    The most important thing here is being consistent. If you decide to work in the range [0..1], you must make sure both the training and validation sets are prepared in the same manner, and that new examples during the "deploy" phase are scaled to the same range.
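A minimal sketch of such consistent scaling, assuming numpy arrays as the image representation (the helper name is illustrative, not part of Caffe):

```python
import numpy as np

def to_unit_range(img_uint8):
    """Scale a uint8 image from [0, 255] to [0.0, 1.0]."""
    return img_uint8.astype(np.float32) / 255.0

# The very same function must be applied to training, validation,
# and deploy-time inputs, or the net will see inconsistent data.
img = np.array([[0, 128, 255]], dtype=np.uint8)
scaled = to_unit_range(img)
print(scaled.min(), scaled.max())  # 0.0 1.0
```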

    2. Caffe blobs are always 4-D: batch x channel x height x width (N x C x H x W), as you already observed.
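    Since most image libraries load images as H x W x C, a transpose is needed before filling a blob. A sketch with numpy:

```python
import numpy as np

# A single H x W x C image, as loaded by most image libraries.
hwc = np.zeros((224, 224, 3), dtype=np.float32)

# Rearrange to Caffe's N x C x H x W blob layout:
# move channels first, then add a leading batch axis.
nchw = hwc.transpose(2, 0, 1)[np.newaxis, ...]
print(nchw.shape)  # (1, 3, 224, 224)
```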

    3. RGB to BGR is, again, not mandatory, but very common, since BGR is the order in which OpenCV reads images. Again, the most important thing here is consistency throughout the life cycle of your net.
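    The channel swap itself is a one-liner on an H x W x 3 array; a sketch:

```python
import numpy as np

def rgb_to_bgr(img):
    """Reverse the last axis of an H x W x 3 image (RGB <-> BGR)."""
    return img[:, :, ::-1]

# A 2x2 image whose R, G, B planes are 10, 20, 30 respectively.
rgb = np.dstack([np.full((2, 2), v, dtype=np.uint8) for v in (10, 20, 30)])
bgr = rgb_to_bgr(rgb)
print(bgr[0, 0])  # [30 20 10]
```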

    4. Recent models subtract the mean per channel, rather than the mean per pixel. It is more convenient, especially if you change the input size of your net. When processing the HDF5 data you can compute the mean image yourself and save it into a binaryproto. See an example here.
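    A sketch of both variants with numpy, assuming the training set has already been read into an N x C x H x W array (serializing the result with `caffe.io.array_to_blobproto` from pycaffe is mentioned in a comment but not executed here):

```python
import numpy as np

# Suppose `data` holds the whole training set as an N x C x H x W array,
# e.g. read from the HDF5 file's "data" dataset with h5py.
data = np.random.rand(8, 3, 16, 16).astype(np.float32)

# Mean per channel: average over the batch and both spatial axes.
channel_mean = data.mean(axis=(0, 2, 3))        # shape (3,)

# Mean per pixel: average over the batch axis only.
mean_image = data.mean(axis=0, keepdims=True)   # shape (1, 3, 16, 16)

# To write `mean_image` as a binaryproto, one could pass it to
# caffe.io.array_to_blobproto and serialize the result to disk;
# that pycaffe-dependent step is omitted from this sketch.
print(channel_mean.shape, mean_image.shape)
```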