matlab caffe matcaffe

How to adapt the Caffe Matlab wrapper for a network trained on MNIST?


I successfully trained my Caffe net on the MNIST database by following http://caffe.berkeleyvision.org/gathered/examples/mnist.html

Now I want to test the network with my own images using the Matlab wrapper.

Therefore, in "matcaffe.m" I'm loading the file "lenet.prototxt", which is not used for training but seems to be suited for testing. It references an input size of 28 x 28 pixels:

name: "LeNet"
input: "data"
input_dim: 64
input_dim: 1
input_dim: 28
input_dim: 28
layer {
name: "conv1"
type: "Convolution"
bottom: "data"
top: "conv1"

Therefore I adapted the "prepare_image" function in "matcaffe.m" accordingly. It now looks like this:

% ------------------------------------------------------------------------
function images = prepare_image(im)
IMAGE_DIM = 28;
% resize to fixed input size
im = rgb2gray(im);
im = imresize(im, [IMAGE_DIM IMAGE_DIM], 'bilinear');
im = single(im);
images = zeros(1,1,IMAGE_DIM,IMAGE_DIM);
images(1,1,:,:) = im;
images = single(images);
%-------------------------------------------------------------

This converts the input image to a 4-D grayscale array of size [1 x 1 x 28 x 28]. But Matlab still complains:

Error using caffe
MatCaffe input size does not match the input size of the network
Error in matcaffe_myModel_mnist (line 76)
scores = caffe('forward', input_data);

Does anybody have experience with testing the trained MNIST net on their own data?
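
For context, the size mismatch can be made concrete by comparing element counts. The sketch below is only an illustration: it assumes the wrapper compares the total number of values in the input against the size of the input blob declared in "lenet.prototxt", and the variable names are made up for this example.

```matlab
% Input blob shape declared in lenet.prototxt (num, channels, height, width)
net_dims = [64 1 28 28];
% Shape produced by prepare_image for a single image
input_dims = [1 1 28 28];

expected_count = prod(net_dims);    % 64 * 1 * 28 * 28 = 50176
actual_count = prod(input_dims);    % 1 * 1 * 28 * 28 = 784

% Assumption: the wrapper rejects the input when these counts differ
sizes_match = (expected_count == actual_count);
fprintf('network expects %d values, input has %d, match = %d\n', ...
        expected_count, actual_count, sizes_match);
```

With a batch size of 64 in the prototxt, a single prepared image supplies only 784 of the 50176 values the network expects.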


Solution

  • Finally I found the full solution. This is how to predict the digit in your own input image using matcaffe.m (the Matlab wrapper for Caffe):

    1. In "matcaffe.m": One has to reference the file "caffe-master/examples/mnist/lenet.prototxt"
    2. Adapt the file "lenet.prototxt" as pointed out by mprat: change the first input_dim entry (the batch size) from input_dim: 64 to input_dim: 1
    3. Use the following adaptation of the subfunction "prepare_image" in matcaffe.m:

    (Input can be an RGB image of any size)

    function images = prepare_image(im)

    IMAGE_DIM = 28;

    % The input image may be too big, RGB, and of type uint8:
    % -> convert to single channel, resize to fixed input size, cast to float
    % (the size check is an added guard so grayscale input also works)
    if size(im, 3) == 3
        im = rgb2gray(im);
    end
    im = imresize(im, [IMAGE_DIM IMAGE_DIM], 'bilinear');
    im = single(im);

    % Caffe needs a 4-D input array in single precision.
    % Data has to be scaled by 1/256 = 0.00390625 (as during training).
    % In the second-to-last line the image is being transposed!
    images = zeros(1,1,IMAGE_DIM,IMAGE_DIM);
    images(1,1,:,:) = 0.00390625*im';
    images = single(images);
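
    A note on why the transpose matters: Matlab stores arrays column by column, while Caffe blobs are laid out row-major, with the width index changing fastest. The following is a minimal sketch of that effect on a tiny stand-in matrix, not part of matcaffe.m; the variable names are chosen only for this illustration.

```matlab
% Small stand-in for an image: 2 rows (height) by 3 columns (width)
im = [1 2 3; 4 5 6];

% Matlab linearizes column by column
column_major = im(:)';          % 1 4 2 5 3 6

% Transposing first makes the linear order run along each image row,
% which corresponds to a row-major (width fastest) layout
im_t = im';
row_major = im_t(:)';           % 1 2 3 4 5 6

% Same idea with the 4-D array shape used in prepare_image
blob = zeros(1, 1, 3, 2, 'single');
blob(1, 1, :, :) = single(im');
blob_order = blob(:)';          % 1 2 3 4 5 6
```

    Without the transpose, the values would reach the network in column order, so the digit would appear mirrored along its diagonal relative to the training images.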