Tags: neural-network, deep-learning, caffe, lmdb

Reading encoded image data from lmdb database in caffe


I am relatively new to using caffe and am trying to create minimal working examples that I can (later) tweak. I had no difficulty using caffe's examples with MNIST data. I downloaded image-net data (ILSVRC12) and used caffe's tool to convert it to an lmdb database using:

$CAFFE_ROOT/build/install/bin/convert_imageset -shuffle -encoded=true top_level_data_dir/ fileNames.txt lmdb_name

This creates an lmdb containing encoded (JPEG) image data. The reason for encoding is size: the encoded lmdb is about 64GB, versus about 240GB unencoded.
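
To sanity-check what actually ends up in the database, something like the following sketch can be used. This is only an illustrative snippet: it assumes the lmdb Python package and pycaffe's generated caffe_pb2 module are available, and 'lmdb_name' is just the placeholder database name from the command above.

import lmdb
from caffe.proto import caffe_pb2

# Open the database read-only and inspect the first record.
env = lmdb.open('lmdb_name', readonly=True)
with env.begin() as txn:
    key, value = next(txn.cursor().iternext())
    datum = caffe_pb2.Datum()
    datum.ParseFromString(value)
    # For an encoded lmdb, datum.encoded is True and datum.data holds the raw
    # JPEG bytes; for an unencoded lmdb, channels/height/width describe the
    # raw pixel buffer instead.
    print(key, datum.encoded, datum.channels, datum.height, datum.width, datum.label)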

My .prototxt file that describes the net is minimal (a pair of inner product layers, mostly borrowed from the MNIST example; I am not going for accuracy here, I just want something that works).

name: "example"
layer {
  name: "imagenet"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "train-lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "imagenet"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.00390625
  }
  data_param {
    source: "test-lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  param {
    lr_mult: 1
  }
  param {
    lr_mult: 2
  }
  inner_product_param {
    num_output: 1000
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}

When train-lmdb is unencoded, this .prototxt file works fine (accuracy is abysmal, but caffe does not crash). However, if train-lmdb is encoded, I get the following error:

data_transformer.cpp:239] Check failed: channels == img_channels (3 vs. 1)

Question: Is there some "flag" I must set in the .prototxt file to indicate that train-lmdb contains encoded images? (The same flag would likely have to be given for the testing data layer, test-lmdb.)

A little research:

Poking around with Google, I found a resolved issue which seemed promising. However, setting 'force_encoded_color' to true did not resolve my problem.

I also found this answer very helpful for creating the lmdb (specifically, its directions for enabling the encoding); however, it made no mention of what should be done so that caffe is aware that the images are encoded.


Solution

  • The error message you got:

    data_transformer.cpp:239] Check failed: channels == img_channels (3 vs. 1)
    

    means caffe's data transformer is expecting input with 3 channels (i.e., a color image), but is getting an image with only 1 channel (img_channels, i.e., a grayscale image).

    Looking at caffe.proto, it seems you should set the parameter in the transform_param (a sketch for confirming the diagnosis follows the snippet below):

    layer {
      name: "imagenet"
      type: "Data"
      top: "data"
      top: "label"
      include {
        phase: TRAIN
      }
      transform_param {
        scale: 0.00390625
        force_color: true  ##  try this
      }
      data_param {
        source: "train-lmdb"
        batch_size: 100
        backend: LMDB
        force_encoded_color: true  ## cannot hurt...
      }
    }
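
    As a quick way to confirm the diagnosis, a sketch along these lines can scan the encoded lmdb for entries that do not decode to 3 channels; those are exactly the images that trip the channels == img_channels check. It assumes the lmdb, NumPy, and OpenCV Python packages plus pycaffe's caffe_pb2 are available, and it reuses the question's "train-lmdb" name; the loop itself is only illustrative.

    import lmdb
    import numpy as np
    import cv2
    from caffe.proto import caffe_pb2

    env = lmdb.open('train-lmdb', readonly=True)
    with env.begin() as txn:
        for key, value in txn.cursor():
            datum = caffe_pb2.Datum()
            datum.ParseFromString(value)
            # datum.data holds the raw JPEG bytes when the lmdb is encoded.
            img = cv2.imdecode(np.frombuffer(datum.data, dtype=np.uint8),
                               cv2.IMREAD_UNCHANGED)
            if img is not None and (img.ndim == 2 or img.shape[2] != 3):
                # Grayscale (or otherwise non-3-channel) source image: this is
                # what makes the transformer see 1 channel instead of 3.
                print('non-3-channel image under key', key)
                break

    If any such entries turn up, that is consistent with why the force_color: true suggestion above should help: it asks the transformer to decode the encoded images as color.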